
Scientia Silvae Sinicae, 2026, Vol. 62, Issue (4): 194-205. doi: 10.11707/j.1001-7488.LYKX20250523

• Research papers •

Detection Method of Unknown Wildlife Species Based on Vision-Language Feature Matching

Zihe Yang1, Ye Tian1,*, Jiantao Wang2, Zhiyong Pei3, Jing Sun4, Junguo Zhang1,5,*

  1. School of Technology, Beijing Forestry University; State Key Laboratory of Efficient Production of Forest Resources; Key Laboratory of National Forestry and Grassland Administration on Forestry Equipment and Automation, Beijing 100083
  2. Administration Bureau of Inner Mongolia Wulanba National Nature Reserve, Chifeng 025450
  3. College of Energy and Transportation Engineering, Inner Mongolia Agricultural University, Hohhot 010018
  4. Xing’an League Wulanhe Local Nature Reserve Administration, Ulanhot 137726
  5. Shaanxi Institute of Zoology, Xi’an 710032
  • Received: 2025-08-25 Online: 2026-04-15 Published: 2026-04-11
  • Contact: Ye Tian, Junguo Zhang E-mail: tytoemail@sina.com; zhangjunguo@bjfu.edu.cn

Abstract:

Objective: To address the low recognition rate of unknown categories in infrared camera-trap images of wildlife in open environments, this study proposes an unknown category detection method that requires no explicit environmental descriptions or habitat metadata and relies only on known species labels. The method is designed for the limited-information scenario that is common in real monitoring datasets.

Method: An envisioning unknown animal (EUA) method based on vision-language feature matching was proposed. It integrates the ecological reasoning capability of a large language model (LLM) with the cross-modal alignment of vision-language models to build a monitoring framework for open environments. First, an ecologically informed prompt was designed to guide the LLM to infer the regional ecological context solely from the set of known species and to generate a list of ecologically plausible potential species. Second, the text descriptions of these potential species were combined with the known categories to construct an expanded vision-language semantic space. Finally, an outlier detection score (ODS) mechanism was introduced to detect unknown categories robustly by calculating the deviation in an image's matching distribution between known categories and potential species.

Result: Experiments on two public datasets, Dataset3 (D3) and North American Camera Trap Images (NACTI), demonstrated that EUA significantly outperformed existing methods. In the most challenging scenario, with 5 unknown categories, the average false positive rate at 95% true positive rate (FPR@95TPR) of EUA was 57.86%, which was 16.19 percentage points lower than that of the second-best method. The area under the receiver operating characteristic curve (AUC) reached 84.31%, a 4.64 percentage point improvement. Ablation experiments confirmed that the ecologically guided potential species generation and the ODS scoring mechanism were the core drivers of this performance gain. Visualization analysis further showed that EUA effectively separated the distributions of known and unknown samples, validating the design.

Conclusion: This study shifts the paradigm from “passive classification” to “proactive prediction”, providing an effective solution to unknown category detection in real-world monitoring scenarios that lack environmental priors. The EUA method not only delivers a substantial performance gain but also explores a feasible path for embedding ecological knowledge into AI reasoning processes, offering a new direction for building the next generation of ecologically aware intelligent wildlife monitoring systems.
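
The abstract does not give the ODS formula, so the following is only a minimal sketch of how a score of this kind and the FPR@95TPR metric could be computed. It assumes that image and text embeddings are already available as plain lists of floats, that the score is the highest softmax probability among known labels minus a weighted highest probability among potential labels, and that known samples are the positive class. The names `outlier_score` and `fpr_at_95_tpr` and the parameters `temperature` and `beta` are hypothetical and are not taken from the paper.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def outlier_score(image_emb, known_embs, potential_embs,
                  temperature=0.01, beta=0.25):
    """Hypothetical ODS-style score: higher means 'more likely a known species'.

    The image is matched against known and potential species labels in one
    joint softmax; probability mass captured by a potential species lowers
    the score. temperature and beta are illustrative values (assumptions).
    """
    img = normalize(image_emb)
    labels = [normalize(t) for t in known_embs + potential_embs]
    logits = [sum(a * b for a, b in zip(img, t)) / temperature for t in labels]
    probs = softmax(logits)
    k = len(known_embs)
    return max(probs[:k]) - beta * max(probs[k:])


def fpr_at_95_tpr(known_scores, unknown_scores):
    """Fraction of unknown samples accepted as known at the threshold
    that keeps at least 95% of known samples (known = positive class)."""
    ranked = sorted(known_scores, reverse=True)
    keep = math.ceil(0.95 * len(ranked))
    threshold = ranked[keep - 1]
    accepted = sum(1 for s in unknown_scores if s >= threshold)
    return accepted / len(unknown_scores)


# Toy 4-dimensional embeddings: two known labels, two potential labels.
known = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
potential = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

score_known_image = outlier_score([1.0, 0.0, 0.0, 0.0], known, potential)
score_unknown_image = outlier_score([0.0, 0.0, 1.0, 0.0], known, potential)
toy_fpr = fpr_at_95_tpr([0.9, 0.8, 0.95, 0.85], [-0.2, 0.1, 0.85, 0.3])
```

In this toy setup the image aligned with a known label receives a score near 1, while the image aligned with a potential species receives a negative score, which is the separation between known and unknown samples that the abstract describes. A real pipeline would obtain the embeddings from a vision-language model and the potential labels from the LLM prompt.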

Key words: wildlife monitoring, unknown category detection, large language models, vision-language models, ecological reasoning
