林业科学 (Scientia Silvae Sinicae) ›› 2021, Vol. 57 ›› Issue (10): 93-101. doi: 10.11707/j.1001-7488.20211009

• Research Papers and Reports •

Early Recognition of Feeding Sound of Trunk Borers Based on Artificial Intelligence

Xuanxin Liu1,4, Yu Sun1,2, Jian Cui2, Qi Jiang3, Zhibo Chen1,4,*, Youqing Luo3

  1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083
  2. School of Cyber Science and Technology, Beihang University, Beijing 100191
  3. College of Forestry, Beijing Forestry University, Beijing 100083
  4. Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083
  • Received: 2020-01-13 Online: 2021-10-25 Published: 2021-12-11
  • Contact: Zhibo Chen
  • Supported by:
    Beijing Forestry University special guiding fund for building world-class disciplines and characteristic development (2019XKJS0310); Beijing Science and Technology Plan project "Key Technologies for the Prevention and Control of Major Pests in Beijing Ecological Public Welfare Forests" (Z191100008519004); National Key Research and Development Program of China project "Research on Key Technologies for the Prevention and Control of Major Disasters in Plantations" (2018YFD0600200)

Abstract:

Objective: Among forest pests, trunk borers live concealed inside the host and are difficult to control, which makes them a major threat to ecological security. In this study, Semanotus bifasciatus was selected as the research object. A recognition model based on a convolutional neural network (CNN) was designed to recognize its feeding sounds, and the noise immunity of the model was tested, with the aim of enabling early warning of trunk borers.

Method: An SP-1 L probe connected to an NI 9215 voltage acquisition module was used to record the sounds of S. bifasciatus feeding on wood segments and the noise of typical outdoor environments, and the recordings were saved in audio format. Part of the noise was selected as noise-adding audio, and environmental noise was mixed into the feeding sounds at signal-to-noise ratios (SNRs) from -3 dB to 3 dB to generate the training data and a simple test set. The average log spectrum of each audio clip was then computed in three steps: short-time Fourier transform, logarithm calculation, and average pooling. A CNN-based recognition model and a traditional Gaussian mixture model (GMM) were each designed and trained to extract audio features and judge whether a clip contained the feeding sound of S. bifasciatus. To further test noise immunity, an independently partitioned set of noise audio was mixed into the feeding sounds at SNRs from -7 dB to 3 dB, a wider range than in the training set, to generate a noise immunity test set, on which both the CNN and the GMM were tested.

Result: On the simple test set, the recognition accuracy of the CNN-based model was 98.80%, which was 0.88% lower than that of the GMM. On the noise immunity test set, the overall accuracy of the CNN-based model in recognizing the feeding sounds of S. bifasciatus was 97.37%, which was 6.76% higher than that of the GMM. At an SNR of -3 dB, its accuracy was 98.13%, which was 9.80% higher than that of the GMM; at an SNR of -6 dB, its accuracy was 92.13%, which was 5.67% higher than that of the GMM.

Conclusion: The CNN can effectively integrate spectral features and accurately judge whether an audio clip contains the feeding sound of S. bifasciatus. Compared with the GMM, the CNN also generalizes better and maintains high recognition accuracy at low SNRs. The CNN-based feeding sound recognition model can therefore adapt to the field monitoring environment of trunk borers and can provide technical support for the automated monitoring and early warning of concealed trunk borers.
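The abstract states that environmental noise was mixed into the feeding sounds at a chosen signal-to-noise ratio, but it does not give the mixing formula. The following is a minimal sketch under the conventional power-ratio definition, SNR(dB) = 10 log10(P_signal / P_noise); it is an illustration, not the authors' implementation. The function names, the 16 kHz sample rate, and the synthetic tone and Gaussian noise standing in for real recordings are all assumptions.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal + noise has the requested SNR in dB.

    From SNR(dB) = 10 * log10(P_signal / P_noise), the noise gain is
    sqrt(P_signal / (P_noise * 10 ** (snr_db / 10))).
    Returns the mixture and the scaled noise.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise must not be silent")
    gain = math.sqrt(power(signal) / (p_noise * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(signal, scaled_noise)]
    return mixed, scaled_noise


def measured_snr_db(signal, scaled_noise):
    """SNR in dB actually realized by a signal and its added noise."""
    return 10 * math.log10(power(signal) / power(scaled_noise))


# Hypothetical stand-ins for recorded audio (assumed, not from the paper):
# a 2 kHz tone as the "feeding sound" and Gaussian noise as the environment.
rng = random.Random(0)
sample_rate = 16000
signal = [math.sin(2 * math.pi * 2000 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# -3 dB is the lower end of the training range quoted in the abstract.
mixed, scaled = mix_at_snr(signal, noise, snr_db=-3)
```

At a negative SNR such as -3 dB the added noise carries more power than the feeding sound itself, which is why the low-SNR results are the demanding part of the noise immunity test.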

Key words: trunk borers, feeding sounds, convolutional neural network, early recognition, noise immunity
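The three-step feature computation named in the Method part of the abstract (short-time Fourier transform, logarithm, average pooling) could look roughly like the NumPy sketch below. The abstract gives no frame length, hop size, window type, or pooling axis, so every parameter value here, the Hann window, and the choice to pool along the time axis are assumptions made only for illustration.

```python
import numpy as np


def average_log_spectrum(audio, frame_len=512, hop=256, pool=4, eps=1e-10):
    """STFT magnitude -> logarithm -> average pooling over time.

    All default values are illustrative assumptions, not the paper's settings.
    Returns an array of shape (n_frames // pool, frame_len // 2 + 1).
    """
    audio = np.asarray(audio, dtype=np.float64)
    if audio.size < frame_len:
        raise ValueError("audio is shorter than one frame")
    n_frames = 1 + (audio.size - frame_len) // hop
    n_pooled = n_frames // pool
    if n_pooled == 0:
        raise ValueError("audio is too short for the requested pooling size")

    # Step 1: short-time Fourier transform (windowed frames, one real FFT each).
    window = np.hanning(frame_len)
    frames = np.stack(
        [audio[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Step 2: logarithm compresses the dynamic range; eps avoids log(0).
    log_spec = np.log(magnitude + eps)

    # Step 3: average pooling over non-overlapping groups of `pool` frames;
    # trailing frames that do not fill a group are dropped.
    trimmed = log_spec[:n_pooled * pool]
    return trimmed.reshape(n_pooled, pool, -1).mean(axis=1)


# Hypothetical input: one second of a 2 kHz tone sampled at 16 kHz (assumed values).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 2000 * t)

features = average_log_spectrum(tone)
peak_bin = int(features.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 512  # bin index -> frequency for frame_len=512
```

The resulting two-dimensional array is the kind of input a convolutional network or a Gaussian mixture model could consume; pooling shortens the time axis and smooths frame-to-frame fluctuations.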

CLC Number: