Scientia Silvae Sinicae, 2024, Vol. 60, Issue (8): 33-45. doi: 10.11707/j.1001-7488.LYKX20230276
• Technology and application of smart forestry and grassland •
Jiandong Qi, Shangzi Zheng, Ziyi Chen, Zhongtian Ma
Received: 2023-07-01
Online: 2024-08-25
Published: 2024-09-03
Jiandong Qi, Shangzi Zheng, Ziyi Chen, Zhongtian Ma. Wildlife Image Recognition of Infrared Cameras in Beijing Area Based on an Improvement ConvNeXt Model [J]. Scientia Silvae Sinicae, 2024, 60(8): 33-45.
Table 1
Number of images per species in the adjusted self-built dataset
Animal species | Image number |
Hog badger (Arctonyx collaris) | 252 |
Birds, excluding ducks (Aves) | 1 071 |
Wild boar (Sus scrofa) | 112 |
Leopard cat (Prionailurus bengalensis) | 119 |
Deer (Cervus axis) | 1 199 |
Goat (Capra hircus) | 332 |
Feral dog (Canis lupus familiaris) | 105 |
Hare (Lepus sinensis) | 126 |
Squirrel (Sciurus vulgaris) | 300 |
Ducks (Mallard) | 619 |
Total | 4 234 |
Table 2
Number of images per species category in the SS dataset subset and the CCT dataset subset
SS subset: Animal species | SS subset: Image number | CCT subset: Animal species | CCT subset: Image number |
Topi (Damaliscus lunatus) | 571 | Raccoon (Procyon lotor) | 1 101 |
Birds (Aves) | 980 | Birds (Aves) | 982 |
Giraffe (Giraffa camelopardalis) | 1 000 | Dog (Canis dingo) | 419 |
Zebra (Equus burchellii) | 1 000 | Rodents, excluding squirrels (Geomys bursarius) | 464 |
Oryx (Oryx) | 1 000 | Cat (Prionailurus bengalensis) | 543 |
Buffalo (Bubalus bubalus) | 1 000 | Deer (Cervus axis) | 1 256 |
Police dog (Canis lupus familiaris) | 1 000 | Coyote (Canis latrans) | 1 720 |
Elephant (Elephas maximus) | 1 000 | Cattle (Bos taurus) | 332 |
Guineafowl (Numididae) | 1 000 | Wildcat (Prionailurus bengalensis) | 789 |
Hyena (Hyaenidae) | 1 000 | Squirrel (Sciurus carolinensis) | 445 |
Addax (Addax nasomaculatus) | 1 000 | Skunk (Mephitis mephitis) | 180 |
Impala (Aepyceros melampus) | 1 000 | Fox (Vulpes vulpes) | 239 |
Gazelle (Gazella) | 1 000 | Hare (Lepus sinensis) | 1 237 |
Wildebeest (Connochaetes taurinus) | 1 000 | Opossum (Didelphis virginiana) | 1 622 |
Total | 13 551 | Total | 11 329 |
Table 3
Configuration of ConvNeXt models at 5 different parameter scales
Model | Number of input channels per stage | Number of stacked blocks per stage |
ConvNeXt-T | (96, 192, 384, 768) | (3, 3, 9, 3) |
ConvNeXt-S | (96, 192, 384, 768) | (3, 3, 27, 3) |
ConvNeXt-B | (128, 256, 512, 1 024) | (3, 3, 27, 3) |
ConvNeXt-L | (192, 384, 768, 1 536) | (3, 3, 27, 3) |
ConvNeXt-XL | (256, 512, 1 024, 2 048) | (3, 3, 27, 3) |
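The stage settings in Table 3 can be written down as a small lookup structure. The sketch below is only an illustration of the table, not the authors' code: the `ConvNeXtConfig` container, the `CONVNEXT_CONFIGS` dictionary, and both helper functions are hypothetical names introduced here, and only the channel and depth tuples come from the table.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ConvNeXtConfig:
    """Stage-wise settings for one ConvNeXt variant (hypothetical container)."""
    channels: Tuple[int, int, int, int]  # channel width of each of the four stages
    depths: Tuple[int, int, int, int]    # number of blocks stacked in each stage


# Values copied from Table 3; the names used here are illustrative only.
CONVNEXT_CONFIGS: Dict[str, ConvNeXtConfig] = {
    "ConvNeXt-T": ConvNeXtConfig((96, 192, 384, 768), (3, 3, 9, 3)),
    "ConvNeXt-S": ConvNeXtConfig((96, 192, 384, 768), (3, 3, 27, 3)),
    "ConvNeXt-B": ConvNeXtConfig((128, 256, 512, 1024), (3, 3, 27, 3)),
    "ConvNeXt-L": ConvNeXtConfig((192, 384, 768, 1536), (3, 3, 27, 3)),
    "ConvNeXt-XL": ConvNeXtConfig((256, 512, 1024, 2048), (3, 3, 27, 3)),
}


def total_blocks(name: str) -> int:
    """Total number of blocks across the four stages of a variant."""
    return sum(CONVNEXT_CONFIGS[name].depths)


def widths_double(name: str) -> bool:
    """True if each stage has exactly twice the channels of the previous one."""
    c = CONVNEXT_CONFIGS[name].channels
    return all(b == 2 * a for a, b in zip(c, c[1:]))


if __name__ == "__main__":
    for name in CONVNEXT_CONFIGS:
        print(name, total_blocks(name), widths_double(name))
```

Read this way, the table shows that ConvNeXt-T stacks 18 blocks in total while the other four variants stack 36, and that the S, B, L and XL variants share the same depths and differ only in channel width.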
Table 4
Five schemes and ablation study results
Scheme No. | Model | MACs | Params | Accuracy (%) |
Original | ConvNeXt-T | 4.47×10^9 | 27.83×10^6 | 74.13 |
1 | ConvNeXt-T+BP | 1.07×10^9 | 27.83×10^6 | 76.39 |
2 | ConvNeXt-T+SENet | 4.47×10^9 | 28.24×10^6 | 77.34 |
3 | ConvNeXt-T+GRN-scale layer | 4.47×10^9 | 27.88×10^6 | 87.18 |
4 | ConvNeXt-T+GCNet | 4.47×10^9 | 28.25×10^6 | 75.44 |
5 | ConvNeXt-T+BSGG-ConvNeXt | 1.07×10^9 | 28.71×10^6 | 83.63 |
Table 5
Experimental results of ConvNeXt at different parameter scales on the self-built dataset
Model | MACs | Params | Accuracy (%) |
ConvNeXt-T | 4.47×10^9 | 27.83×10^6 | 69.40 |
BSGG-ConvNeXt-T | 1.07×10^9 | 28.71×10^6 | 83.63 |
ConvNeXt-S | 8.7×10^9 | 49.46×10^6 | 73.43 |
BSGG-ConvNeXt-S | 1.26×10^9 | 51.08×10^6 | 83.39 |
ConvNeXt-B | 15.38×10^9 | 87.68×10^6 | 74.02 |
BSGG-ConvNeXt-B | 2.21×10^9 | 90.38×10^6 | 82.70 |
ConvNeXt-L | 34.4×10^9 | 196.25×10^6 | 78.13 |
BSGG-ConvNeXt-L | 4.90×10^9 | 202.42×10^6 | 80.31 |
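The trade-off reported in Table 5 is easier to read as a relative MACs reduction and an absolute accuracy change per model scale. The following is a minimal sketch of that arithmetic using only the Table 5 values; the `TABLE5` dictionary and the `compare` function are hypothetical names introduced for illustration.

```python
from typing import Dict, Tuple

# (MACs, accuracy in %) for the baseline and the BSGG variant, copied from Table 5.
# Keys are the model scale suffixes; the structure itself is illustrative.
TABLE5: Dict[str, Tuple[Tuple[float, float], Tuple[float, float]]] = {
    "T": ((4.47e9, 69.40), (1.07e9, 83.63)),
    "S": ((8.7e9, 73.43), (1.26e9, 83.39)),
    "B": ((15.38e9, 74.02), (2.21e9, 82.70)),
    "L": ((34.4e9, 78.13), (4.90e9, 80.31)),
}


def compare(scale: str) -> Tuple[float, float]:
    """Return (fractional MACs reduction, accuracy gain in percentage points)."""
    (base_macs, base_acc), (bsgg_macs, bsgg_acc) = TABLE5[scale]
    mac_reduction = 1.0 - bsgg_macs / base_macs
    acc_gain = bsgg_acc - base_acc
    return mac_reduction, acc_gain


if __name__ == "__main__":
    for scale in TABLE5:
        reduction, gain = compare(scale)
        print(f"{scale}: MACs reduced by {reduction:.1%}, accuracy {gain:+.2f} points")
```

On these figures the MACs reduction is roughly 76% at the T scale and about 86% at the S, B and L scales, while the accuracy gain shrinks from about 14 points at the T scale to about 2 points at the L scale.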
Table 6
Recognition results of different models on the self-built dataset
Model | MACs | Params | Accuracy (%) |
ResNet-50 | 4.12×10^9 | 23.54×10^6 | 76.39 |
ResNeXt-50 | 4.27×10^9 | 23.00×10^6 | 87.60 |
MobileVIT | 261.28×10^6 | 954.23×10^3 | 88.85 |
DenseNet | 2.88×10^9 | 6.69×10^6 | 87.66 |
RegNet | 503.13×10^6 | 3.91×10^6 | 69.70 |
EfficientNetv2 | 2.87×10^9 | 343.05×10^3 | 56.22 |
SwinTransformer | 4.36×10^9 | 27.53×10^6 | 86.23 |
ConvNeXtv2 | 4.47×10^9 | 27.87×10^6 | 91.93 |
MobileOne | 1.09×10^9 | 4.28×10^6 | 71.53 |
BSGG-ConvNeXt-T | 1.07×10^9 | 28.71×10^6 | 83.63 |
MobileVIT + pre-trained weights | 261.28×10^6 | 954.23×10^3 | 91.70 |
RegNet + pre-trained weights | 503.13×10^6 | 3.91×10^6 | 93.20 |
BSGG-ConvNeXt-T + pre-trained weights | 1.07×10^9 | 28.71×10^6 | 94.07 |
Table 7
Recognition results of different models on subsets of the SS and CCT datasets
Model | Dataset | MACs | Params | Accuracy (%) |
ConvNeXt-T | SS dataset subset | 4.47×10^9 | 27.83×10^6 | 48.23 |
BSGG-ConvNeXt-T | SS dataset subset | 1.07×10^9 | 28.71×10^6 | 50.28 |
ConvNeXt-T | CCT dataset subset | 4.47×10^9 | 27.83×10^6 | 45.75 |
BSGG-ConvNeXt-T | CCT dataset subset | 1.07×10^9 | 28.71×10^6 | 56.15 |