Chemometric pre-processing can negatively affect the performance of near-infrared spectroscopy models for fruit quality prediction

This study tests two hypotheses about the effect of pre-processing on NIR spectra of fresh fruit. The first hypothesis is that pre-processing NIR spectra with scatter correction techniques can reduce the predictive performance of models, because scatter correction can remove useful scattering information that is correlated with the property of interest. The second hypothesis is that deep learning (DL) can model raw absorbance data (a mix of scattering and absorption) much more efficiently than partial least squares (PLS) regression. To test the hypotheses, a real NIR data set for dry matter (DM) prediction in mango fruit was used. The data set consisted of a total of 11,420 NIR spectra with reference DM measurements, used for model training and independent testing. The chemometric pre-processing methods explored were standard normal variate (SNV), variable sorting for normalization (VSN), the Savitzky-Golay based 2nd derivative, and their combinations. Two modelling approaches, PLS regression and DL, were used to evaluate the effect of pre-processing. The results showed that the best root mean squared error of prediction (RMSEP) for both the PLS and DL models was obtained with the raw absorbance data. Spectral pre-processing in general decreased the performance of both the PLS and DL models. Further, the DL model attained the lowest RMSEP of 0.76%, which was 13% lower compared to the PLS regression on the raw absorbanc...
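To make the first hypothesis concrete, the following is a minimal sketch of SNV and RMSEP in plain Python. It is not the study's implementation: the function names `snv` and `rmsep`, the use of the sample standard deviation, and the toy spectra are all assumptions made for illustration. The example shows that two spectra differing only by an additive offset and a multiplicative gain (a crude stand-in for scatter effects) become indistinguishable after SNV, so any information carried by that offset and gain is no longer available to a model.

```python
import math
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.mean(spectrum)
    # Sample SD (n - 1) is an assumption here; some implementations use the population SD.
    sd = statistics.stdev(spectrum)
    return [(a - mean) / sd for a in spectrum]


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a test set."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up absorbance values, not data from the study.
base = [0.10, 0.30, 0.50, 0.40, 0.20]
# Same spectral shape with a hypothetical gain (1.5) and offset (0.2).
scattered = [1.5 * a + 0.2 for a in base]

snv_base = snv(base)
snv_scattered = snv(scattered)

# After SNV the two spectra coincide up to floating-point error.
max_difference = max(abs(a - b) for a, b in zip(snv_base, snv_scattered))
print(max_difference < 1e-9)
```

If the gain and offset were themselves correlated with dry matter, a model trained on the SNV-corrected spectra could not use them, which is the mechanism the first hypothesis describes.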
Source: Talanta - Category: Chemistry - Source Type: research