• ISSN 1008-505X
  • CN 11-3996/S

优化光谱指数助力机器学习提高马铃薯叶绿素含量反演精度

Machine learning models fed with optimized spectral indices to improve inversion accuracy of potato chlorophyll content

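  • Abstract (Chinese version, translated):
    Objectives Remote sensing estimation based on spectral indices is used for real-time monitoring of crop growth because it is simple to compute, but it is easily affected by the soil background at early growth stages and tends to lose sensitivity at later stages. With the rapid development of artificial intelligence, estimating chlorophyll with machine learning algorithms has become a generally accepted way to improve the accuracy of remote sensing monitoring. However, there are many machine learning algorithms, and their estimates depend strongly on the input variables. This study selected the most commonly used partial least squares and random forest algorithms and compared the effect of different input variables on the accuracy with which these two algorithms estimate potato chlorophyll content.
    Methods From 2019 to 2020, field experiments with different nitrogen levels were conducted in the main potato producing area of Inner Mongolia. Spectral data and chlorophyll values were obtained at the key growth stages of potato, and the experimental data were used to optimize the bands of the spectral indices in order to find optimized spectral indices sensitive to chlorophyll. The full reflectance spectrum bands and the optimized spectral indices were each used as input variables to the random forest and partial least squares models to estimate potato chlorophyll content. The experimental data were divided into a modeling set (75%) and a verification set (25%) to compare the accuracy of the models and to evaluate them.
    Results The optimized bands were mainly concentrated in the violet and green ranges, and the spectral index Opt-NDVI, based on central bands at 408 and 552 nm, showed the best optimization effect. Optimization significantly improved the correlation between the spectral indices and potato chlorophyll content, but the correlation was strongly affected by growth stage and was higher after flowering than before flowering. Modeling with random forest and partial least squares showed that, compared with the optimized spectral indices, the machine learning models significantly improved the ability to estimate potato chlorophyll content, and both the random forest and the partial least squares models achieved high estimation accuracy. Verification and evaluation with the verification set and the PROSAIL physical model showed that the random forest model based on optimized spectral indices had better modeling ability, a good linear fit between measured and estimated values, and the best verification performance, while also overcoming the influence of growth stage.
    Conclusions The predictive ability of spectral indices is strongly affected by growth stage. Using optimized spectral indices as input variables not only reduces the number of input variables, and thereby improves the computational efficiency of the random forest and partial least squares algorithms, but also improves prediction accuracy to some extent. The random forest algorithm based on optimized spectral indices gave the highest estimation accuracy for potato chlorophyll content, performed well at every growth stage, and overcame the influence of growth stage. This method can be used to estimate potato chlorophyll content.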

     

    Abstract:
    Objectives Remote sensing based on spectral indices is widely used for real-time monitoring of crop growth. However, its accuracy is limited by soil background effects at early growth stages and by loss of sensitivity at later stages. With the rapid development of artificial intelligence, machine learning algorithms have become a widely accepted way to overcome these limitations. In this study, we compared the accuracy of potato chlorophyll content estimation by partial least squares and random forest algorithms fed with different input variables.
    Methods Field experiments with different nitrogen levels were conducted in the main potato producing area of Inner Mongolia from 2019 to 2020. Spectral data and chlorophyll values were obtained at the key growth stages of potato. The bands of the spectral indices were optimized using the experimental data to find the optimized spectral indices most sensitive to chlorophyll. The full reflectance spectrum bands and the optimized spectral indices were each used as input variables to estimate potato chlorophyll content with random forest and partial least squares models. The experimental data were divided into a modeling set (75%) and a verification set (25%) to compare the accuracy of the models and to evaluate them.
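The band optimization step described in the Methods can be illustrated with a minimal sketch. It assumes the common approach of an exhaustive search over band pairs, scoring each normalized-difference (NDVI-type) index by the absolute Pearson correlation with chlorophyll. The abstract does not state the exact search procedure or scoring criterion, so the function names, the four-band grid, and the synthetic reflectance data below are all hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def normalized_difference(r_a, r_b):
    """NDVI-type index (r_a - r_b) / (r_a + r_b) of two reflectances."""
    return (r_a - r_b) / (r_a + r_b)


def optimize_band_pair(spectra, wavelengths, chlorophyll):
    """Return the band pair whose index correlates most strongly with chlorophyll.

    spectra: one list of reflectances per sample, ordered like `wavelengths`.
    """
    best_pair, best_r = None, 0.0
    for i in range(len(wavelengths)):
        for j in range(i + 1, len(wavelengths)):
            index = [normalized_difference(s[j], s[i]) for s in spectra]
            r = pearson_r(index, chlorophyll)
            if abs(r) > abs(best_r):
                best_pair, best_r = (wavelengths[i], wavelengths[j]), r
    return best_pair, best_r


# Synthetic, made-up data (not the study's measurements): only the 552 nm
# band is constructed to respond to chlorophyll; the other bands are noise.
rng = random.Random(0)
wavelengths = [408, 552, 670, 800]
chlorophyll = [rng.uniform(20.0, 60.0) for _ in range(60)]
spectra = [
    [
        0.05 + rng.uniform(-0.001, 0.001),  # 408 nm: nearly constant
        0.20 - 0.002 * c,                   # 552 nm: assumed chlorophyll response
        rng.uniform(0.03, 0.30),            # 670 nm: unrelated noise
        rng.uniform(0.30, 0.60),            # 800 nm: unrelated noise
    ]
    for c in chlorophyll
]

best_pair, best_r = optimize_band_pair(spectra, wavelengths, chlorophyll)
print(best_pair, round(best_r, 3))
```

On real hyperspectral data the same loop would run over every measured wavelength, which is why the search is usually vectorized. The nested loops are kept here for readability.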
    Results The optimized bands were mainly concentrated in the violet and green ranges, and Opt-NDVI, based on central bands at 408 nm and 552 nm, showed the best optimization effect. Optimization significantly improved the correlation between the spectral indices and potato chlorophyll content, but the correlation was still influenced by growth stage and was higher in the post-flowering period than in the pre-flowering period. Random forest and partial least squares models were each built with the full reflectance spectrum bands and with the optimized spectral indices as inputs. Compared with the optimized spectral indices used alone, both machine learning models significantly improved the accuracy of potato chlorophyll estimation. Under the same input variables, the random forest model showed better modeling ability, its estimated values showed a better linear relationship with the measured values, and the influence of growth stage was overcome.
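The comparison of measured and estimated values on a held-out verification set can be sketched as follows. The abstract does not name the accuracy statistics used, so the coefficient of determination and the root mean square error are assumptions here. A one-variable least-squares line stands in for the random forest and partial least squares models purely to keep the example self-contained. All function names and the synthetic data are hypothetical.

```python
import math
import random


def split_modeling_verification(samples, modeling_fraction=0.75, seed=0):
    """Shuffle the samples and split them into modeling and verification sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(modeling_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in model only)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(measured, estimated):
    """Coefficient of determination of the estimates against measurements."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, estimated):
    """Root mean square error of the estimates."""
    squared = [(m - e) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic (index value, chlorophyll) pairs, not the study's data.
rng = random.Random(1)
samples = []
for _ in range(40):
    index_value = rng.uniform(0.20, 0.55)
    chlorophyll = 80.0 - 100.0 * index_value + rng.gauss(0.0, 1.0)
    samples.append((index_value, chlorophyll))

modeling_set, verification_set = split_modeling_verification(samples)
slope, intercept = fit_line(
    [x for x, _ in modeling_set], [y for _, y in modeling_set]
)

measured = [y for _, y in verification_set]
estimated = [slope * x + intercept for x, _ in verification_set]
verification_r2 = r_squared(measured, estimated)
verification_rmse = rmse(measured, estimated)
print(len(modeling_set), len(verification_set))
```

Computing the statistics only on the verification set keeps the accuracy estimate independent of the data used to fit the model, which is the purpose of the 75%/25% split.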
    Conclusions The predictive ability of spectral indices is strongly affected by growth stage. Using optimized spectral indices as input variables improves the computational efficiency and prediction accuracy of the random forest and partial least squares algorithms. The random forest algorithm based on optimized spectral indices is more accurate than partial least squares for estimating potato chlorophyll content, with a better linear relationship between predicted and measured values and a negligible influence of growth stage.

     
