Scientia Silvae Sinicae ›› 2024, Vol. 60 ›› Issue (8): 25-32. doi: 10.11707/j.1001-7488.LYKX20230399

• Frontiers and Key Topics: Smart Forestry and Grassland Technology and Applications •

Wildlife Images Recognition Method Based on Wasserstein Distance and Correlation Alignment Transfer Learning

Changchun Zhang, Dafang Li, Junguo Zhang*

  1. School of Technology, Beijing Forestry University; State Key Laboratory of Efficient Production of Forest Resources; Key Laboratory of National Forestry and Grassland Administration on Forestry Equipment and Automation, Beijing 100083
  • Received: 2023-08-31 Online: 2024-08-25 Published: 2024-09-03
  • Contact: Junguo Zhang
  • Funding:
    Beijing Natural Science Foundation (6244053); National Natural Science Foundation of China (32371874); Fundamental Research Funds for the Central Universities (BLX202129); Forestry Science and Technology Achievement Promotion Program of the National Forestry and Grassland Administration ([2019]04).

Abstract:

Objective: This study addresses the loss of accuracy in wildlife image recognition caused by complex factors such as lighting, background, and shooting scale. Method: Wildlife images captured in the field by infrared-triggered cameras were used as the study material: 1) Two publicly available wildlife datasets, ENA24 and NACTI, were used to construct two disjoint domain datasets, S1 and S2, covering 11 animal categories and 25 591 images in total. 2) To tackle domain shift, a ResNet50 network was used as the feature extraction module of a domain adversarial network, which effectively reduced the shift. 3) A representation learning network based on Wasserstein distance and correlation alignment was introduced, giving a transfer learning network for feature extraction and recognition that further exploits transferable features. Result: Average accuracy was used as the evaluation metric. On the 11 wildlife categories, the eight compared models (ResNet50, DDC, DCORAL, DAN, DANN, CDAN, HAN, and JTN) reached 48.4%, 51.6%, 49.6%, 52.6%, 45.2%, 50.9%, 54.6%, and 53.5%, respectively. With a ResNet50 backbone using improved residual modules and the added representation learning network based on Wasserstein distance and correlation alignment, the average accuracy over the 11 categories rose by 2.7 percentage points relative to the best of the compared methods (HAN, 54.6%). Conclusion: The transfer learning method based on Wasserstein distance and correlation alignment achieved an average accuracy of 57.3% in wildlife recognition. Representation learning based on Wasserstein distance and correlation alignment can effectively improve the accuracy of wildlife recognition models.
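The two alignment terms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how either term is estimated or weighted, so the code below assumes the standard correlation alignment (CORAL) loss, the squared Frobenius distance between source and target feature covariances scaled by 1/(4d²), and replaces any learned Wasserstein estimator with the exact 1-D Wasserstein distance averaged over feature dimensions. All function names and the toy feature values are hypothetical.

```python
# Illustrative sketch only; function names and toy data are hypothetical.
from typing import List, Sequence

Matrix = List[List[float]]


def covariance(features: Sequence[Sequence[float]]) -> Matrix:
    """Unbiased (n - 1) covariance matrix of n feature vectors of length d."""
    n = len(features)
    d = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in features:
        centered = [row[j] - means[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += centered[a] * centered[b] / (n - 1)
    return cov


def coral_loss(source: Sequence[Sequence[float]],
               target: Sequence[Sequence[float]]) -> float:
    """CORAL loss: ||C_s - C_t||_F^2 / (4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    sq = sum((cs[a][b] - ct[a][b]) ** 2 for a in range(d) for b in range(d))
    return sq / (4 * d * d)


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Exact 1-Wasserstein distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def mean_featurewise_wasserstein(source: Sequence[Sequence[float]],
                                 target: Sequence[Sequence[float]]) -> float:
    """Average of per-feature 1-D Wasserstein distances (a simple stand-in
    for a learned estimator; an assumption, not taken from the paper)."""
    d = len(source[0])
    return sum(
        wasserstein_1d([r[j] for r in source], [r[j] for r in target])
        for j in range(d)
    ) / d


# Made-up 2-D features standing in for two camera-trap domains.
source_feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
target_feats = [[1.0, 0.0], [2.0, 2.0], [3.0, 4.0]]
print(coral_loss(source_feats, target_feats))                    # 0.6875
print(mean_featurewise_wasserstein(source_feats, target_feats))  # 1.0
```

In a training loop, terms like these would typically be added to the classification loss with weighting coefficients. The coefficients and the exact estimators used in the paper are not stated in the abstract.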

Key words: wild animal, Wasserstein distance, correlation alignment, transfer learning, image recognition
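As a quick consistency check on the figures reported in the abstract, the short sketch below finds the strongest of the eight compared models and the gap to the proposed method. The accuracy values are copied from the abstract; the variable names are only illustrative.

```python
# Average accuracies (%) of the eight compared models, copied from the abstract.
reported = {
    "ResNet50": 48.4, "DDC": 51.6, "DCORAL": 49.6, "DAN": 52.6,
    "DANN": 45.2, "CDAN": 50.9, "HAN": 54.6, "JTN": 53.5,
}
proposed = 57.3  # average accuracy (%) reported for the proposed method

# Strongest compared model and the gain over it, in percentage points.
best_name = max(reported, key=reported.get)
gain = round(proposed - reported[best_name], 1)
print(best_name, reported[best_name], gain)  # HAN 54.6 2.7
```

The gain of 2.7 matches the stated improvement, which suggests the figure is an absolute difference in percentage points, not a relative increase.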

CLC number: