
Scientia Silvae Sinicae ›› 2021, Vol. 57 ›› Issue (3): 51-66. doi: 10.11707/j.1001-7488.20210306

• Papers and Research Reports •

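Effects of Multi-Source Data on Stand Dynamics Prediction and Uncertainty Analysis

Xianglin Tian1,2, Ziyan Liao3,4, Shuaichao Sun5, Hailian Xue6, Bin Wang7, Tianjian Cao1,*

    1. Ecological Simulation and Optimization Laboratory, College of Forestry, Northwest A&F University, Yangling 712100
    2. Department of Forest Sciences, University of Helsinki, Helsinki FI-00014
    3. Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041
    4. University of Chinese Academy of Sciences, Beijing 100039
    5. Fujian Agriculture and Forestry University, Fuzhou 350002
    6. College of Science, Northwest A&F University, Yangling 712100
    7. Academy of Agriculture and Forestry Sciences, Qinghai University, Xining 810016
  • Received: 2019-04-29 Online: 2021-03-25 Published: 2021-04-07
  • Corresponding author: Tianjian Cao
  • Funding:
    General Program of the National Natural Science Foundation of China (31670646); Special Science and Technology Support Project for National Forest Management Model Bases (1692016-07)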

Impacts of Multiple Source Data on Forest Forecasting and Uncertainty Propagation

Xianglin Tian1,2, Ziyan Liao3,4, Shuaichao Sun5, Hailian Xue6, Bin Wang7, Tianjian Cao1,*

    1. Simulation Optimization Laboratory, College of Forestry, Northwest A&F University, Yangling 712100
    2. Department of Forest Sciences, University of Helsinki, Helsinki FI-00014
    3. Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041
    4. University of Chinese Academy of Sciences, Beijing 100039
    5. Fujian Agriculture and Forestry University, Fuzhou 350002
    6. College of Science, Northwest A&F University, Yangling 712100
    7. Academy of Agriculture and Forestry Sciences, Qinghai University, Xining 810016
  • Received: 2019-04-29 Online: 2021-03-25 Published: 2021-04-07
  • Contact: Tianjian Cao

Abstract (Chinese version):

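Objective: To compare the effects of multi-source data on stand dynamics prediction, analyse how the uncertainty of model parameters and predictions changes, evaluate the model in terms of accuracy and reliability, identify the data needed to improve the model, and provide recommendations for data collection strategies in forest inventory. Methods: Modelling data were collected for Pinus tabulaeformis forests in the Qinling Mountains, covering three inventory periods (1990, 2005 and 2012) and four information types (temporary plots, permanent plots, stem analysis and multi-source data). A set of variable-density whole-stand models with low data requirements was designed and, within a Bayesian framework for dynamic information fusion, the relationship between traditional forest inventory data and growth and yield models was analysed. The joint posterior distribution of the parameters, obtained by MCMC sampling, was used to quantify the uncertainty of forest dynamics simulation in two ways: first, by comparing how the probability distributions of parameters and predictions changed as the model was trained successively on multi-period inventory data of the same type; second, by comparing the effects on model predictions of using each of the four data types. The data-model updating loop was implemented by repeatedly converting posterior information into prior information, i.e., the joint posterior distribution of the parameters from the previous fit served as the prior when the next period's data were added. Different data types were integrated through independent likelihood structures designed according to each dataset's own sampling and observation errors. To limit the influence of coarse data or outliers on the model, the likelihood function describing the error distribution used a heavy-tailed normal distribution. Heteroscedasticity of the observation errors was handled by automatically adjusting the variance of the likelihood function during iteration. Results: As each new period of inventory data was added, the marginal and joint distributions of the model parameters kept changing, but the kurtosis of the probability distributions always increased gradually, i.e., parameter uncertainty declined step by step, which in turn reduced the uncertainty of stand predictions. Compared with the model based on the 1990 inventory data, the model calibrated with the 2005 and 2012 data showed the most pronounced reduction in uncertainty at the mature and over-mature stage, and the parameter for maximum height growth was also higher. Differences among data types in model predictions reflected the inherent weaknesses and strengths of the different inventory methods: stem analysis data tended to predict greater height growth at the mature and over-mature stage; permanent-plot and temporary-plot data performed similarly in simulating stand mean height and mean diameter at breast height, but differed markedly in simulating stand basal area because of differences in sampling method, data volume and other factors. Models based on cyclic updating or on multi-source data gave the most stable predictions. Conclusions: In building growth and yield models, different types of forest inventory data produce different predictions, and the information characteristics of each data type also affect predictive uncertainty in systematic ways. The Bayesian approach, which presents information as probability distributions, can both reflect how precise a model is and expose deficiencies in the data. Using the updating of a whole-stand model as an example, this study demonstrates a data-model framework of continual cycling, updating and fusion, which is a powerful tool for bridging growth and yield models and data.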
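
The updating loop and the data fusion described in the Methods can be written compactly. The notation here is an assumption made for illustration and does not come from the paper: \theta denotes the model parameters, D_t the inventory data from period t, and D^{(k)} the dataset of type k (for example temporary plots, permanent plots or stem analysis). The second line assumes that the inventories are conditionally independent given \theta.

```latex
\begin{align}
p(\theta \mid D_{1}) &\propto L(D_{1} \mid \theta)\, p(\theta), \\
p(\theta \mid D_{1}, \dots, D_{t}) &\propto L(D_{t} \mid \theta)\, p(\theta \mid D_{1}, \dots, D_{t-1}), \\
L\bigl(D^{(1)}, \dots, D^{(K)} \mid \theta\bigr) &= \prod_{k=1}^{K} L_{k}\bigl(D^{(k)} \mid \theta\bigr).
\end{align}
```

The first two lines express the posterior of one fit acting as the prior of the next; the third expresses the independent likelihood assigned to each data type, where each L_k can carry its own error model.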

Key words: Bayesian analysis, growth and yield, model updating, multi-source data, uncertainty quantification

Abstract:

Objective: This study compared the effects of multi-source data on forecasts of forest dynamics. Patterns of parametric and predictive uncertainty were analysed and quantified to illustrate the process of information fusion. Changes in model accuracy and reliability were also assessed to reveal differences in data characteristics, with the aim of guiding further data collection. Method: Multi-period (1990, 2005 and 2012) and multi-type (temporary plots, permanent plots and stem analysis) inventory data for Pinus tabulaeformis were collected in the Qinling Mountains. A simple variable-density stand-level model with low data requirements was selected. Within a Bayesian framework of information fusion, we analysed the relationship between traditional forest inventory data and empirical growth and yield models. Joint posterior distributions of the parameters were constructed by MCMC sampling to quantify the uncertainty in forecasts of forest dynamics. First, changes in the probability distributions of both parameters and predictions were compared as multi-period inventory data were added; second, the multi-type data were tested for their effects on model performance. The data-model updating loop relied on converting posteriors into priors: the joint posterior parameter distribution from one fit served as the prior for the next. Multiple data sources were integrated by assuming an independent likelihood for the sampling and observation errors of each dataset. To limit bias from erratic observations and outliers, the likelihood for the error structure used a heavy-tailed normal distribution. Heteroscedasticity of the errors was handled by automatically adjusting the likelihood variance during iterations.
Result: As each new dataset was added, the marginal and joint parameter distributions kept changing. In general, parametric uncertainty decreased as the kurtosis of the probability distributions increased, which in turn reduced predictive uncertainty. Compared with the parameterization based on the 1990 inventory, the model calibrated with the 2005 and 2012 data showed markedly lower predictive uncertainty at the mature stage, while the asymptotic parameter shifted to higher values. Differences in predictions among datasets revealed the advantages and drawbacks of the different inventory methods. Stem analysis data tended to predict a greater average height for mature stands than plot sampling did. Temporary and permanent plots differed in sampling method and number of observations, which led to distinctly different forecasts of stand basal area. The model based on continuous updating and multi-source data achieved the highest precision and accuracy. Conclusion: One challenge in forest growth and yield modelling is that sampling and observation errors vary among datasets. Even with the same set of optimal parameters, the strengths and weaknesses of different datasets can lead to distinct patterns of uncertainty. Probabilistic information can show both the accuracy of a model and the information missing from the data, indicating directions for further model development and data collection. This case study used a specific Bayesian approach to demonstrate the complete logic of the data-model loop and the process of information fusion.
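
The posterior-to-prior loop described in the Method can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the height curve `height`, the synthetic surveys from `simulate_survey`, the Student-t likelihood standing in for the heavy-tailed error model, the fixed coefficient of variation `cv` standing in for the automatically adjusted variance, and the Gaussian summary of each posterior are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def height(age, a, b):
    # Hypothetical stand-height curve, NOT the paper's whole-stand model.
    return a * (1.0 - np.exp(-b * age))

def simulate_survey(n=30, a=20.0, b=0.03):
    # Synthetic inventory: 10% multiplicative noise plus one gross outlier.
    age = rng.uniform(10.0, 120.0, n)
    obs = height(age, a, b) * (1.0 + 0.1 * rng.standard_normal(n))
    obs[0] *= 2.0
    return age, obs

def log_likelihood(theta, age, obs, cv=0.1, nu=4.0):
    a, b = theta
    if a <= 0 or b <= 0:
        return -np.inf
    mu = height(age, a, b)
    scale = cv * mu  # assumed heteroscedasticity: spread grows with the prediction
    z = (obs - mu) / scale
    # Student-t log-density up to a constant; heavy tails damp outliers.
    return np.sum(-np.log(scale) - 0.5 * (nu + 1.0) * np.log1p(z**2 / nu))

def log_gaussian(theta, mean, cov):
    # Gaussian log-prior up to a constant.
    d = theta - mean
    return -0.5 * d @ np.linalg.solve(cov, d)

def metropolis(log_post, start, step, n_iter=12000, burn=2000):
    # Random-walk Metropolis sampler returning post-burn-in draws.
    theta = np.asarray(start, dtype=float)
    lp = log_post(theta)
    draws = []
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if i >= burn:
            draws.append(theta)
    return np.array(draws)

# Vague initial prior on (a, b); each posterior then becomes the next prior.
prior_mean = np.array([18.0, 0.04])
prior_cov = np.diag([10.0**2, 0.05**2])
means, sds = [], []
for year in (1990, 2005, 2012):
    age, obs = simulate_survey()

    def log_post(th, m=prior_mean, c=prior_cov):
        return log_gaussian(th, m, c) + log_likelihood(th, age, obs)

    draws = metropolis(log_post, start=prior_mean, step=np.array([0.2, 0.001]))
    # Summarise the posterior as a Gaussian and reuse it as the next prior.
    prior_mean, prior_cov = draws.mean(axis=0), np.cov(draws.T)
    means.append(prior_mean)
    sds.append(np.sqrt(np.diag(prior_cov)))
    print(year, "posterior mean:", prior_mean, "posterior sd:", sds[-1])
```

In this sketch the posterior standard deviation of the asymptote `a` should shrink as each simulated survey is added, mirroring the reported decline in parametric uncertainty. Summarising the posterior as a multivariate normal is a convenience; a kernel density estimate or refitting on the pooled data would be alternatives.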

Key words: Bayesian analysis, growth and yield, model update, multi-source data, uncertainty quantification

Chinese Library Classification (CLC) number: