Scientia Silvae Sinicae ›› 2025, Vol. 61 ›› Issue (6): 25-37. doi: 10.11707/j.1001-7488.LYKX20240518

• Research Paper •

Detection Method of Pinecones in the Forest Based on RT-DETR

Chenxu Wu, Dongyan Zhang*, Lanxiang Zhang, Nuo Chen, Siyu Mao

  1. College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040
  • Received: 2024-09-05 Online: 2025-06-10 Published: 2025-06-26
  • Contact: Dongyan Zhang E-mail: a2022111824@nefu.edu.cn; nefuzdhzdy@nefu.edu.cn
  • Supported by:
    Fundamental Research Funds for the Central Universities, project "Research on Nondestructive Detection Technology for Korean Pine Seeds and Hazelnuts Based on Near-Infrared Spectroscopy" (2572019BF02).

Abstract:

Objective: To address the insufficient accuracy and poor real-time performance of pinecone detection caused by complex forest environments and by small pinecones with indistinct texture features, this study proposes a forest pinecone detection method based on the real-time detection transformer (RT-DETR) and optimizes the RT-DETR model to improve its detection performance. Method: First, to improve detection accuracy, the original backbone network was replaced with the re-parameterized vision transformer (RepViT) to strengthen feature extraction. Second, the high-low frequency feature interactions (HiLo) mechanism, which separates high- and low-frequency components, was introduced to better capture fine texture details. Finally, the re-parameterized cross stage partial bottleneck with 3 convolutions (RepC3) module was redesigned as the decoupled replicated bottleneck cross stage partial with 3 convolutions (DRBC3) module, which combines large-kernel and dilated convolutions to significantly enlarge the receptive field. In addition, both RepViT and DRBC3 use structural re-parameterization, so the model structure is simplified at inference time and detection efficiency improves. Result: The optimized RT-DETR model was tested on a pinecone image dataset collected at Dalai forest station in Jiamusi, Heilongjiang Province, China. The model achieved the best overall balance across the evaluation metrics, with an AP50 of 93.37%, a precision of 93.30%, and a recall of 92.65%. While AP50 improved by 5%, GFLOPs fell by 51%, the number of parameters fell by 41%, and the frame rate rose from 74.3 to 95.5 FPS, a 28% increase. Conclusion: The optimization significantly improves the accuracy, real-time performance, and efficiency of pinecone detection in forest environments, and provides an effective solution for automated pinecone harvesting in practical applications.
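
The abstract states that structural re-parameterization simplifies the model at inference time. As a rough illustration of the general idea only (not the authors' RepViT or DRBC3 code), the sketch below sums a 3x3 branch, a 1x1 branch, and an identity branch, then folds them into a single 3x3 kernel that gives the same output. It assumes one channel, no batch normalization, and zero padding; the names `conv2d_same` and `merge_branches` are hypothetical helpers introduced here.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (odd square kernel)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def merge_branches(kernel3, kernel1):
    """Fold a 1x1 branch and an identity branch into a copy of the 3x3 kernel."""
    merged = [row[:] for row in kernel3]
    # Both extra branches only touch the centre pixel, so they add to the centre tap.
    merged[1][1] += kernel1[0][0] + 1.0
    return merged


# Toy input and weights (illustrative values, not from the paper).
image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 2.0],
         [2.0, 0.0, 1.0, 1.0]]
k3 = [[0.5, -1.0, 0.25],
      [0.0, 2.0, -0.5],
      [1.0, 0.0, 0.75]]
k1 = [[0.5]]

# Training-time view: three parallel branches whose outputs are summed.
b3 = conv2d_same(image, k3)
b1 = conv2d_same(image, k1)
multi_branch = [[b3[i][j] + b1[i][j] + image[i][j] for j in range(4)]
                for i in range(3)]

# Inference-time view: one convolution with the merged kernel.
single_branch = conv2d_same(image, merge_branches(k3, k1))

max_diff = max(abs(multi_branch[i][j] - single_branch[i][j])
               for i in range(3) for j in range(4))
print("largest difference between the two views:", max_diff)
```

Because convolution is linear, the merge is exact up to floating-point rounding, which is why a re-parameterized network can drop the extra branches after training without changing its predictions.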

Key words: RT-DETR, pinecone detection, RepViT, HiLo high-low frequency separation mechanism, DRBC3
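
The abstract attributes the larger receptive field of DRBC3 to combining large-kernel and dilated convolutions. The following sketch, under simplifying assumptions (single channel, odd square kernels, zero padding), shows the standard relationship behind that claim: a kernel of size k with dilation d covers d(k - 1) + 1 pixels per side and is equivalent to a dense kernel of that size with zeros inserted between the taps. The function names are hypothetical and this is not the DRBC3 implementation.

```python
def effective_kernel_size(k, d):
    """Side length covered by a k x k kernel with dilation d."""
    return d * (k - 1) + 1


def dilate_kernel(kernel, d):
    """Insert zeros between taps to get the equivalent dense (large) kernel."""
    k = len(kernel)
    size = effective_kernel_size(k, d)
    dense = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            dense[i * d][j * d] = kernel[i][j]
    return dense


def correlate_at(image, kernel, cy, cx, dilation=1):
    """Response of a (possibly dilated) kernel centred at (cy, cx), zero padding."""
    k = len(kernel)
    half = k // 2
    h, w = len(image), len(image[0])
    acc = 0.0
    for i in range(k):
        for j in range(k):
            y = cy + (i - half) * dilation
            x = cx + (j - half) * dilation
            if 0 <= y < h and 0 <= x < w:
                acc += kernel[i][j] * image[y][x]
    return acc


# Toy 5x5 image and 3x3 kernel (illustrative values only).
image = [[float(y * 5 + x) for x in range(5)] for y in range(5)]
small = [[1.0, 0.0, -1.0],
         [2.0, 0.0, -2.0],
         [1.0, 0.0, -1.0]]

dense = dilate_kernel(small, 2)  # 5x5 kernel with zeros between the taps

# The dilated 3x3 kernel and the dense 5x5 kernel agree at every position.
responses_match = all(
    correlate_at(image, small, y, x, dilation=2) == correlate_at(image, dense, y, x)
    for y in range(5) for x in range(5)
)
print(effective_kernel_size(3, 2), responses_match)
```

This equivalence is what allows a dilated small-kernel branch to be absorbed into a single large kernel at inference time, in the same spirit as the re-parameterization mentioned in the abstract.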
