
Scientia Silvae Sinicae ›› 2026, Vol. 62 ›› Issue (2): 111-125. doi: 10.11707/j.1001-7488.LYKX20240779

• Research papers •

Construction of Near Infrared Spectroscopy Prediction Models Based on CARS-PLSR for Determining Oil Content and Fatty Acid Composition of Camellia oleifera Kernel

Huiqi Zhong1,2, Jingyu Chai1, Kailiang Wang1,3, Jianhua Teng4, Wenyu Bi5, Anni Wang1,6, Ping Lin1,3,*

  1. Research Institute of Subtropical Forestry, Chinese Academy of Forestry, Fuyang 311400
  2. Graduate School of Nanjing Forestry University, Nanjing 210037
  3. Zhejiang Key Laboratory of Forest Genetics and Breeding, Fuyang 311400
  4. Dongfanghong Forest Farm of Zhejiang Province, Jinhua 321025
  5. Shengzhou Forestry Technology Service Center, Shengzhou 312400
  6. State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040
  • Received: 2024-12-18  Revised: 2025-11-05  Online: 2026-02-25  Published: 2026-03-04
  • Contact: Ping Lin  E-mail: linping80@126.com

Abstract:

Objective: This study aims to develop a low-cost, non-destructive, and accurate method for batch determination of the oil content and fatty acid composition of Camellia oleifera kernels, so as to improve the efficiency of evaluating oil traits.

Method: The oil content of kernels from 220 C. oleifera clones was determined by Soxhlet extraction, and the fatty acid composition was determined by gas chromatography. Near infrared spectra of the kernels were collected over the wavelength range of 1000–2500 nm. The spectral data were preprocessed with 9 methods, and the samples were divided into calibration and prediction sets at a ratio of 4:1 by either random sampling (RS) or sample set partitioning based on joint X-Y distance (SPXY). Competitive adaptive reweighted sampling (CARS) was used to select, from the spectral data, the key wavelengths significantly correlated with the oil traits of C. oleifera, and partial least squares regression (PLSR) prediction models were established for the oil content and fatty acid composition of C. oleifera kernels.

Result: The oil content and the contents of seven fatty acids (C16:0, C16:1, C18:0, C18:1, C18:2, C18:3, C20:1) followed or approximated a normal distribution. The models established for predicting oil content had good accuracy and stability. With RS partitioning, standard normal variate (SNV) was the optimal pretreatment; with 14 key wavelengths selected, the oil content model achieved a relative percent deviation (RPD) of 5.205 5, a prediction set determination coefficient ($R_{\mathrm{p}}^2$) of 0.965 1, and a root mean square error of prediction (RMSEp) of 1.854 8 g·(100 g)⁻¹. With SPXY partitioning, SNV + first derivative (FD) was the optimal pretreatment; with 25 key wavelengths selected, the oil content model achieved an RPD of 3.417 0, an $R_{\mathrm{p}}^2$ of 0.916 8, and an RMSEp of 2.622 4 g·(100 g)⁻¹. The models for C18:1, C18:2 and C18:3 contents were optimal under RS partitioning with second derivative (SD), SNV and continuum removal (CR) pretreatment, respectively, with RPD values of 1.939 4, 2.116 4 and 2.338 1, $R_{\mathrm{p}}^2$ values of 0.738 5, 0.775 4 and 0.831 6, and RMSEp values of 1.707 1%, 1.370 2% and 0.049 2%.

Conclusion: A near-infrared spectroscopy prediction model for the oil content of C. oleifera kernels was constructed in this study. The model has high accuracy and good stability, and can be used for rapid, batch, non-destructive determination of kernel oil content. The prediction models for C18:1, C18:2 and C18:3 contents can be used for preliminary prediction of unsaturated fatty acids. This study provides a scientific basis for the rapid detection of oil content, fatty acid composition and other traits of C. oleifera by near-infrared spectroscopy.
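
The abstract names three steps that are easy to misread without a concrete form: SNV pretreatment, the 4:1 calibration/prediction split, and the prediction-set statistics (RMSEp, $R_{\mathrm{p}}^2$, RPD). The sketch below illustrates one conventional reading of each using only the Python standard library. The function names, the sample-SD convention, the fixed seed, and the definitions RPD = SD of reference values / RMSEp and $R_{\mathrm{p}}^2$ = 1 − SSE/SST are assumptions for illustration; the abstract does not state the exact formulas or software the authors used, and all numbers in the usage lines are made up.

```python
import math
import random
import statistics


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale by its own standard deviation (sample SD assumed here)."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def random_split(n_samples, cal_parts=4, pred_parts=1, seed=0):
    """Random sampling (RS) of sample indices into calibration and
    prediction sets at a cal_parts:pred_parts ratio. The seed is arbitrary."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_cal = n_samples * cal_parts // (cal_parts + pred_parts)
    return idx[:n_cal], idx[n_cal:]


def prediction_metrics(y_ref, y_pred):
    """Return (RMSEp, Rp2, RPD) for a prediction set.

    Assumed definitions (conventional, not confirmed by the source):
      RMSEp = sqrt(SSE / n)
      Rp2   = 1 - SSE / SST
      RPD   = sample SD of reference values / RMSEp
    """
    n = len(y_ref)
    sse = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    mean_ref = statistics.fmean(y_ref)
    sst = sum((r - mean_ref) ** 2 for r in y_ref)
    rmsep = math.sqrt(sse / n)
    rp2 = 1.0 - sse / sst
    rpd = statistics.stdev(y_ref) / rmsep
    return rmsep, rp2, rpd


# Usage with made-up values (not data from the study).
cal_idx, pred_idx = random_split(220)            # 220 clones, 4:1 split
scaled = snv([0.41, 0.43, 0.47, 0.52, 0.50])     # hypothetical absorbances
rmsep, rp2, rpd = prediction_metrics(
    [40.0, 45.0, 50.0, 55.0, 60.0],              # hypothetical reference oil content
    [41.0, 44.0, 51.0, 54.0, 61.0],              # hypothetical model predictions
)
```

Under these definitions a 4:1 split of 220 samples gives 176 calibration and 44 prediction samples, and RPD and $R_{\mathrm{p}}^2$ move together: a larger spread of reference values relative to RMSEp raises both.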

Key words: Camellia oleifera kernel, oil content, fatty acid, near infrared spectroscopy, prediction model
