Loading a delimited text file (e.g. comma- or tab-separated) into R is straightforward. We can load the reference data and the PHAZIR data as follows:

setwd("../_Rdata")    #Replace this with the directory where you stored the files
refData<-read.csv("refData.csv", sep=",", dec=".", header=TRUE)    #Load CSV file containing reference data
spectra<-read.csv("PHAZIRspectra.csv",sep=",",dec=".",header=TRUE)  #Load CSV file containing PHAZIR spectra
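
For a tab-separated file, only the separator changes. The sketch below writes a small made-up table to a temporary file and reads it back in two ways: with read.delim(), whose default separator is a tab, and with read.csv() given sep="\t". The file and the object names (toy, tabData, tabData2) are hypothetical and only illustrate the calls.

```r
# Hypothetical example: a tiny tab-separated file written to a temporary location
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(PHAZIRref = 1:3, MC = c(79.0, 44.9, 65.5))
write.table(toy, tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim() defaults to sep = "\t", dec = "." and header = TRUE
tabData <- read.delim(tmp)

# Same file read with read.csv() and the separator given explicitly
tabData2 <- read.csv(tmp, sep = "\t", dec = ".", header = TRUE)

# Compare the two results, then remove the temporary file
identical(tabData, tabData2)
unlink(tmp)
```

For files that use a decimal comma and a semicolon separator, read.csv2() plays the same role.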


The reference data consist of 6 columns containing the parameters measured in the laboratory for each sample:

  • Step = Sample drying step number
  • SpeciesID = wood (tree) species identification (A for Aspen and P for Poplar)
  • SampleID = Sample identification number
  • MC = Moisture content of each sample
  • ASDref = Reference number for spectra acquired with the ASD
  • PHAZIRref = Reference number for spectra acquired with the PHAZIR

head(refData)
##   Step SpeciesID SampleID ASDref PHAZIRref   MC
## 1    1         A       55    392         1 79.0
## 2    1         A       53    212         2 44.9
## 3    1         A        2    206         3 65.5
## 4    1         A       37    542         4 55.2
## 5    1         A       84    512         5 74.8
## 6    1         A       21    410         6 43.6
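
Before going further, it can help to check the structure of the reference data. The sketch below is only an illustration: refToy is a hypothetical copy of the six rows printed above, not the full data set, and the same calls would be applied to refData.

```r
# Toy copy of the six rows printed above (not the full data set)
refToy <- data.frame(
  Step = rep(1L, 6),
  SpeciesID = rep("A", 6),
  SampleID = c(55L, 53L, 2L, 37L, 84L, 21L),
  ASDref = c(392L, 212L, 206L, 542L, 512L, 410L),
  PHAZIRref = 1:6,
  MC = c(79.0, 44.9, 65.5, 55.2, 74.8, 43.6)
)

str(refToy)                      # column names and types
table(refToy$SpeciesID)          # number of samples per species
summary(refToy$MC)               # range of the moisture content values
anyDuplicated(refToy$PHAZIRref)  # 0 means every reference number is unique
```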


The first column of the PHAZIR data frame contains the reference numbers of the spectra acquired with the PHAZIR. The rest of the data frame is the spectral matrix: each row is a spectrum and each column is a wavelength (from 939.5 nm to 1796.6 nm, in increments of about 9 nm). The values are the absorbance (measured as log(1/R)) of the NIR “light” at each wavelength.

spectra[1:5,1:5]
##   PHAZIRref    X939.5    X948.6    X957.7    X966.9
## 1         1 0.2319877 0.2435677 0.2610752 0.2734112
## 2         2 0.2026727 0.2104515 0.2232337 0.2318915
## 3         3 0.1863593 0.1965210 0.2110100 0.2222735
## 4         4 0.2181995 0.2296227 0.2465842 0.2590197
## 5         5 0.2778210 0.2916717 0.3109255 0.3257660
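
The column names carry the wavelengths, prefixed with "X" because read.csv() turns numeric headers into valid R names. As a minimal sketch, assuming a toy subset of the values printed above (spectra_toy, absorbance and wavelengths are illustrative names), the wavelengths can be recovered and the spectra plotted like this:

```r
# Toy subset copied from the first rows and columns printed above
spectra_toy <- data.frame(
  PHAZIRref = 1:3,
  X939.5 = c(0.2319877, 0.2026727, 0.1863593),
  X948.6 = c(0.2435677, 0.2104515, 0.1965210),
  X957.7 = c(0.2610752, 0.2232337, 0.2110100),
  X966.9 = c(0.2734112, 0.2318915, 0.2222735)
)

# Drop the reference column to keep only the spectral matrix
absorbance <- as.matrix(spectra_toy[, -1])

# Strip the leading "X" from the column names to get the wavelengths in nm
wavelengths <- as.numeric(sub("^X", "", colnames(absorbance)))

# Draw one line per spectrum (matplot plots the columns of its second argument)
matplot(wavelengths, t(absorbance), type = "l", lty = 1,
        xlab = "Wavelength (nm)", ylab = "Absorbance (log(1/R))")
```

The transpose is needed because the spectra are stored one per row, whereas matplot() expects one series per column.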


The reference data can be matched with the PHAZIR spectra based on spectra reference numbers (column “PHAZIRref” common to the two data sets). One way to do this:

#Sort the reference data by increasing order of reference number:
refData<-refData[with(refData, order(PHAZIRref)),] 

#Verify that the reference numbers of refData and spectra are identical and in the same order:
all(refData$PHAZIRref == spectra$PHAZIRref)
## [1] TRUE

#Combine the two in a new data frame:
mydata<-data.frame(refData, NIR = I(spectra[2:ncol(spectra)]))

#Save the data in a Rdata file:
save(mydata, file = "mydataPHAZIR.Rdata")


The new data frame, mydata, is saved in the file mydataPHAZIR.Rdata in the working directory.
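
Sorting and then comparing, as done above, relies on the spectra already being ordered by PHAZIRref and on both tables holding exactly the same reference numbers. As an alternative sketch (not the approach used here), match() can align the rows explicitly whatever their order. The toy data frames refToy and specToy are hypothetical stand-ins for refData and spectra, and the spectral block is converted to a matrix for this example.

```r
# Hypothetical stand-ins for refData and spectra, deliberately in different orders
refToy <- data.frame(PHAZIRref = c(3, 1, 2), MC = c(65.5, 79.0, 44.9))
specToy <- data.frame(PHAZIRref = c(1, 2, 3),
                      X939.5 = c(0.2319877, 0.2026727, 0.1863593),
                      X948.6 = c(0.2435677, 0.2104515, 0.1965210))

# For each row of refToy, find the row of specToy with the same reference number
idx <- match(refToy$PHAZIRref, specToy$PHAZIRref)

# Stop early if a reference sample has no matching spectrum
stopifnot(!anyNA(idx))

# Reorder the spectra to follow refToy and reset the row names
specAligned <- specToy[idx, ]
rownames(specAligned) <- NULL

# Combine the reference values with the spectral matrix in one data frame
toyData <- data.frame(refToy, NIR = I(as.matrix(specAligned[, -1])))

# Check that each row pairs a reference value with its own spectrum
all(toyData$PHAZIRref == specAligned$PHAZIRref)
```

With this approach the reference data keep their original row order, and a missing spectrum is reported as an error instead of silently misaligning the rows.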