
Scientia Silvae Sinicae ›› 2016, Vol. 52 ›› Issue (1): 89-98. doi: 10.11707/j.1001-7488.20160111

• Papers and Research Reports •

Application of the Random Forest Algorithm to Forest Fire Prediction in the Tahe Area Based on Meteorological Factors

Liang Huiling1,2, Lin Yurui2, Yang Guang3, Su Zhangwen1, Wang Wenhui1, Guo Futao1

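  1. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002;
    2. College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002;
    3. College of Forestry, Northeast Forestry University, Harbin 150040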
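  • Received: 2015-01-14 Revised: 2015-06-24 Online: 2016-01-25 Published: 2016-02-26
  • Corresponding author: Guo Futao
  • Supported by:
    Natural Science Foundation of Fujian Province (2015J05049); Key Project Construction Fund of Fujian Agriculture and Forestry University (6112C035K).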

Application of Random Forest Algorithm on the Forest Fire Prediction in Tahe Area Based on Meteorological Factors

Liang Huiling1,2, Lin Yurui2, Yang Guang3, Su Zhangwen1, Wang Wenhui1, Guo Futao1   

  1. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002;
    2. College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002;
    3. College of Forestry, Northeast Forestry University, Harbin 150040
  • Received: 2015-01-14 Revised: 2015-06-24 Online: 2016-01-25 Published: 2016-02-26

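Abstract: [Objective] A logistic regression model and the random forest algorithm were used to build forest fire occurrence prediction models for the Tahe area of the Daxing'an Mountains, and their prediction accuracies were compared in order to assess the suitability of the random forest algorithm for forest fire prediction in this area and to provide technical support for local forest fire management. [Method] Using forest fire occurrence data for the Tahe area of the Daxing'an Mountains from 1974 to 2008, a binomial logistic regression model and the random forest algorithm were each applied in an empirical analysis of the relationship between forest fire occurrence and meteorological factors. To reduce the influence of the training sample distribution on the results, the full dataset was randomly divided into a training sample (60%) and a test sample (40%); this was repeated 5 times, giving 5 intermediate models (sample groups). Variables (factors) that were significant in 3 or more of the 5 intermediate models were selected for the analysis of the full dataset, and the prediction accuracies of the two methods were compared on the 5 intermediate models and on the full-sample model. In addition, a variable interaction test was designed to further verify the prediction accuracy of the two models under the same variables. [Result] Daily minimum relative humidity, the Fine Fuel Moisture Code (FFMC) and the Drought Code (DC) were significantly related to forest fire occurrence in both the binomial logistic regression model and the random forest algorithm. The fitted predictions showed that, for the 5 intermediate models, the prediction accuracy of the random forest algorithm was about 8% and 10% higher than that of the binomial logistic regression model on the training samples (60%) and the test samples (40%), respectively; for the full-sample model, the accuracy of the random forest algorithm was 85.0% versus 76.2% for the binomial logistic regression model, a difference of about 10%, consistent with the results for the 5 intermediate models; in the variable interaction test, the accuracy of the random forest algorithm was 86.0% versus 72.8% for the binomial logistic regression model, an improvement of about 18.1%. [Conclusion] Daily minimum relative humidity, the Fine Fuel Moisture Code and the Drought Code are the main meteorological factors affecting forest fire occurrence. In modelling forest fire occurrence in the Tahe area from meteorological factors, the prediction accuracy of the random forest algorithm was about 10% higher than that of the traditional binomial logistic regression model, so the algorithm has a clear predictive advantage and practical value and can serve as a reference for forest fire prediction and decision-making in the Tahe area of the Daxing'an Mountains.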

Key words: Tahe area, forest fire occurrence, meteorological factors, random forest algorithm, logistic regression
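The resampling and variable-screening procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `repeated_splits` and `stable_predictors`, the sample size of 1000, the predictor labels, and the per-model significance flags are all hypothetical placeholders. In practice the flags would come from fitting a model on each training sample.

```python
import random
from collections import Counter


def repeated_splits(n_samples, train_frac=0.6, repeats=5, seed=0):
    """Return `repeats` independent random (train_idx, test_idx) pairs."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_frac)
    splits = []
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random permutation for every repeat
        splits.append((sorted(idx[:n_train]), sorted(idx[n_train:])))
    return splits


def stable_predictors(significant_per_model, min_models=3):
    """Keep predictors flagged significant in at least `min_models` models."""
    votes = Counter(name for model in significant_per_model for name in set(model))
    return sorted(name for name, count in votes.items() if count >= min_models)


# Hypothetical significance flags from five intermediate models
# (illustrative only; these are NOT the paper's fitted results).
flags = [
    {"min_rh", "ffmc", "dc"},
    {"min_rh", "ffmc", "wind"},
    {"min_rh", "dc"},
    {"ffmc", "dc", "temp"},
    {"min_rh", "ffmc", "dc", "wind"},
]

splits = repeated_splits(n_samples=1000)  # 5 splits of 600 train / 400 test
selected = stable_predictors(flags)       # ['dc', 'ffmc', 'min_rh']
```

Requiring a predictor to pass in at least 3 of 5 splits filters out variables whose significance depends on one particular random partition of the data.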

Abstract: [Objective] Logistic regression (LR) and the random forest (RF) algorithm were used to build fire occurrence prediction models for Tahe, Daxing'an Mountains. The objective was to assess the applicability of the RF algorithm to local forest fire prediction by comparing the prediction accuracy of the two models, and to provide technical support for local forest fire management. [Method] Fire data collected in Tahe, Daxing'an Mountains between 1974 and 2008 were used in a case study to identify the relationship between fire occurrence and meteorological factors with the LR model and the RF algorithm, respectively. To reduce the influence of sample distribution on model fitting, the original dataset was randomly divided into training (60%) and validation (40%) samples. The procedure was repeated five times applying a sampling with replacement method, yielding five random sub-samples (sample groups) of the data, each with a training and a validation dataset. Predictors that were significant at α=0.05 in at least three of the five intermediate models were included in the final models. In addition, a "cross validation" test was designed to further verify the accuracy of the two models using the same variables. [Result] Daily minimum relative humidity, Fine Fuel Moisture Code (FFMC) and Drought Code (DC) were identified as important predictors in both the LR and RF models. In the five intermediate models, the prediction accuracy of LR was about 8% and 10% lower than that of RF for the training and validation samples, respectively. On the complete dataset, the prediction accuracy of RF was 85.0%, higher than that of LR (76.2%), a difference of about 10% that agreed with the results of the five sample groups. In the "cross validation" test, the prediction accuracy of RF was 86.0% versus 72.8% for LR, an improvement of about 18.1%. [Conclusion] The RF model outperformed the LR model for fire prediction in the study area, so it can be used for fire prediction and can provide important information for local fire management and planning.

Key words: Tahe area, fire occurrence, meteorological factors, random forest algorithm, Logistic regression
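The accuracy comparisons in the abstract mix two ways of expressing a gap: a difference in percentage points and a relative improvement over the LR baseline. The short sketch below works through the reported accuracies (85.0% vs 76.2%, and 86.0% vs 72.8%) both ways. The helper name `accuracy_gap` is an illustrative assumption. The 18.1% figure is consistent with a relative improvement, while the "about 10%" figure sits between the two measures for the full-sample model.

```python
def accuracy_gap(acc_rf, acc_lr):
    """Return (percentage-point gap, relative improvement in %) of RF over LR."""
    points = acc_rf - acc_lr
    relative = points / acc_lr * 100.0
    return round(points, 1), round(relative, 1)


# Accuracies (in %) as reported in the abstract.
full_sample = accuracy_gap(85.0, 76.2)   # (8.8, 11.5)
interaction = accuracy_gap(86.0, 72.8)   # (13.2, 18.1)
```

Reporting both numbers side by side avoids ambiguity when a reader compares the gaps across experiments.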
