
Scientia Silvae Sinicae ›› 2023, Vol. 59 ›› Issue (8): 112-122. doi: 10.11707/j.1001-7488.LYKX20220378


Wildlife Image Recognition in Miyun District Based on BS-ResNeXt-50

Jiandong Qi (齐建东)1,2, Zhongtian Ma (马鐘添)1, Dehuai Zhang (张德怀)3, Yun Tian (田赟)4

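  1. College of Information, Beijing Forestry University, Beijing 100083
  2. Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083
  3. Management Office of Wulingshan Mountain Nature Reserve, Beijing 101506
  4. School of Soil and Water Conservation, Beijing Forestry University, Beijing 100083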
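  • Received: 2022-05-31 Online: 2023-08-25 Published: 2023-10-16
  • Supported by:
    National Key Research and Development Program of China, "Adaptation Mechanisms of Typical Plantation Ecosystems to Global Change" (2020YFA0608100)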

Wildlife Image Recognition in Miyun District Based on BS-ResNeXt-50

Jiandong Qi1,2, Zhongtian Ma1, Dehuai Zhang3, Yun Tian4

  1. College of Information, Beijing Forestry University, Beijing 100083
  2. Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083
  3. Management Office of Wulingshan Mountain Nature Reserve, Beijing 101506
  4. School of Soil and Water Conservation, Beijing Forestry University, Beijing 100083
  • Received: 2022-05-31 Online: 2023-08-25 Published: 2023-10-16

Abstract:

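Objective: Building on convolutional neural networks, this study improves an existing network structure to automatically identify the species in wildlife images captured by infrared cameras. Method: A dataset of 2172 wildlife images in 8 classes was built from images collected during 2014-2015 in the Beijing municipal Wulingshan Nature Reserve, Miyun District, Beijing. A random augmentation strategy, which selects augmentation methods at random from 14 candidate schemes, was used to add noise to the image data. SENet and BlurPool were used to build improved networks based on ResNeXt-50: SE-ResNeXt-50, which strengthens feature extraction; BP-ResNeXt-50, which maintains shift invariance; and BS-ResNeXt-50, which combines the two. The effects of different fixed learning rates, a step learning rate, and a cosine annealing learning rate on the accuracy of BS-ResNeXt-50 were tested on the self-built dataset. VGG16, ResNeXt-50, EfficientNet-B0, InceptionV3, DenseNet-121, and BS-ResNeXt-50 were trained on images of the 16 common classes in the public CCT wildlife dataset, and their recognition accuracies for individual species were compared. Result: SE-ResNeXt-50 and BP-ResNeXt-50 reached accuracies of 75.16%±0.14% and 73.74%±0.13%, respectively. BS-ResNeXt-50, which fuses SENet and BlurPool, reached 78.04%±0.11% on the self-built dataset and was the best of the improved schemes. With the cosine annealing learning rate, the accuracy of BS-ResNeXt-50 rose to 81.54%, 3.5 percentage points above the fixed learning rate; the step learning rate reached 79.3%, 2.24 percentage points below cosine annealing. On the CCT dataset, BS-ResNeXt-50 reached an accuracy of 95.07%, 1.95 percentage points above ResNeXt-50 and also above VGG16 (85.5%), EfficientNet-B0 (90.23%), InceptionV3 (91.38%), and DenseNet-121 (93.3%); its prediction accuracy in every individual class was also higher than that of these models. Except for the class with the fewest images, the accuracy of BS-ResNeXt-50 exceeded 90% in every class, and the highest class accuracy was 98.6%. Conclusion: Compared with ResNeXt-50, the improved BS-ResNeXt-50 recognizes wildlife images more accurately and also generalizes well across different wildlife image datasets.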
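The abstract credits SENet with strengthening feature extraction. The sketch below illustrates the general squeeze-and-excitation idea on a tiny feature map stored as nested lists: pool each channel to one number, pass the result through two small fully connected layers, and rescale each channel by the resulting gate. It is a minimal illustration and not the authors' implementation; the function name, the weight matrices `w1` and `w2`, and the omission of biases are all assumptions made for this example.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Reweight the channels of a C x H x W feature map (nested lists).

    w1: (C // r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C // r) weights of the expansion layer (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

In a trained network the gates differ per channel and per image, so informative channels are amplified and less useful ones are suppressed before the next layer.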

Key words: wildlife images, species recognition, deep learning, convolutional neural network

Abstract:

Objective: Wildlife images captured by camera traps in the field have complex backgrounds, and camera-trap surveys yield large numbers of images covering many species, which makes identifying the animals in them challenging. Building on convolutional neural networks, this study improves an existing network structure to recognize wildlife images automatically. Method: A dataset of 2 172 wildlife images in 8 categories was built from images collected during 2014-2015 in the Wulingshan Nature Reserve, Miyun District, Beijing. A random augmentation strategy selected augmentation methods at random from 14 candidate policies to add noise to the images. SENet and BlurPool were used to construct improved networks based on ResNeXt-50: SE-ResNeXt-50 to enhance feature extraction, BP-ResNeXt-50 to maintain shift invariance, and BS-ResNeXt-50 to combine both. The influences of fixed learning rates, a step decay learning rate, and a cosine annealing learning rate on the accuracy of BS-ResNeXt-50 were tested on the self-built dataset. VGG16, ResNeXt-50, EfficientNet-B0, InceptionV3, DenseNet-121, and BS-ResNeXt-50 were trained on 16 common categories of images in the public CCT wildlife dataset, and their recognition accuracies for single species were compared. Result: The accuracies of SE-ResNeXt-50 and BP-ResNeXt-50 reached 75.16%±0.14% and 73.74%±0.13%, respectively. BS-ResNeXt-50, which integrates SENet and BlurPool, achieved an accuracy of 78.04%±0.11% on the self-built dataset and was the best of the improved schemes. With the cosine annealing learning rate, the accuracy of BS-ResNeXt-50 improved to 81.54%, 3.5 percentage points higher than with the fixed learning rate. The step decay learning rate achieved 79.3%, 2.24 percentage points lower than the cosine annealing learning rate. On the CCT dataset, the accuracy of BS-ResNeXt-50 reached 95.07%, 1.95 percentage points higher than that of ResNeXt-50 and also higher than those of VGG16 (85.5%), EfficientNet-B0 (90.23%), InceptionV3 (91.38%), and DenseNet-121 (93.3%). Its prediction accuracy in every single category was also higher than that of the above models. Except for the category with the fewest images, the accuracy of BS-ResNeXt-50 exceeded 90% in every category, and the highest category accuracy was 98.6%. Conclusion: Compared with ResNeXt-50, the improved BS-ResNeXt-50 completes the wildlife image recognition task more accurately and also generalizes well to different wildlife image datasets.
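The abstract compares fixed, step decay, and cosine annealing learning rates but does not give the schedule parameters. As a rough illustration of how the two decaying schedules differ, the sketch below computes both for a given epoch. The function names and every default value (epoch counts, rates, decay factor) are hypothetical and are not taken from the paper.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate at `epoch` for a single cosine cycle without restarts."""
    cos_term = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)


def step_decay_lr(epoch, lr_initial, drop_every, factor=0.1):
    """Piecewise-constant rate: multiply by `factor` every `drop_every` epochs."""
    return lr_initial * factor ** (epoch // drop_every)


if __name__ == "__main__":
    # Hypothetical settings chosen only to show the shape of each schedule.
    for epoch in (0, 25, 50, 75, 100):
        print(epoch,
              round(cosine_annealing_lr(epoch, 100, 0.1), 5),
              round(step_decay_lr(epoch, 0.1, 30), 5))
```

The cosine schedule lowers the rate smoothly from `lr_max` to `lr_min`, whereas step decay holds it constant and then drops it abruptly; a fixed learning rate corresponds to using `lr_initial` at every epoch.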

Key words: wildlife images, species recognition, deep learning, convolutional neural network
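The abstract states that BlurPool is used to maintain shift invariance. The one-dimensional sketch below shows the core idea under simplifying assumptions: low-pass filter the signal before subsampling, so that a one-sample shift of the input changes the output less than plain strided subsampling does. The `[1, 2, 1] / 4` kernel, the edge replication, and the function names are choices made for this example; the network in the paper operates on two-dimensional feature maps, and its exact filter is not given in the abstract.

```python
def blur_pool_1d(signal, stride=2):
    """Blur with a [1, 2, 1] / 4 kernel, then keep every `stride`-th sample."""
    n = len(signal)
    blurred = []
    for i in range(n):
        left = signal[max(i - 1, 0)]        # replicate the edge value
        right = signal[min(i + 1, n - 1)]   # replicate the edge value
        blurred.append((left + 2 * signal[i] + right) / 4)
    return blurred[::stride]


def naive_subsample(signal, stride=2):
    """Plain strided subsampling with no low-pass filter, for comparison."""
    return signal[::stride]


if __name__ == "__main__":
    original = [0, 1, 0, 1, 0, 1, 0, 1]
    shifted = [1, 0, 1, 0, 1, 0, 1, 0]  # same pattern moved by one sample
    print(naive_subsample(original), naive_subsample(shifted))
    print(blur_pool_1d(original), blur_pool_1d(shifted))
```

For this alternating input, naive subsampling flips from all zeros to all ones when the pattern moves by one sample, while the blurred outputs agree at 0.5 everywhere except the replicated edge.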

CLC Number: