
Scientia Silvae Sinicae ›› 2023, Vol. 59 ›› Issue (1): 119-127. doi: 10.11707/j.1001-7488.LYKX20210886

• Research papers

Voiceprint Recognition of Male Nomascus hainanus Based on Convolutional Neural Network

Huimin Feng, Kun Jin*

  1. Ecology and Nature Conservation Institute, Chinese Academy of Forestry; Research Institute of Natural Protected Area, Chinese Academy of Forestry; Key Laboratory of Biodiversity Conservation of National Forestry and Grassland Administration, Beijing 100091
  • Received: 2021-11-30 Online: 2023-01-25 Published: 2023-02-24
  • Contact: Kun Jin

Abstract:

Objective: Nomascus hainanus (the Hainan gibbon) is a critically endangered species endemic to China. It inhabits dense forest in Hainan Tropical Rainforest National Park, and singing is an important part of its behavior. This study aims to identify individual male N. hainanus by their songs, in order to support future intelligent sensing and monitoring of the N. hainanus population and the construction of intelligent protected areas in Hainan Tropical Rainforest National Park.
Method: Many studies have shown that the vocalizations of some species differ between individuals and can therefore serve as an acoustic fingerprint for identifying individuals. Based on the spectral characteristics of male N. hainanus songs and the basic principles of voiceprint recognition, this paper proposes a voiceprint recognition method based on a Convolutional Neural Network (CNN). Original recordings of N. hainanus songs were collected by both active and passive acoustic monitoring and then preprocessed. Spectrograms of the combinations of frequency-modulated (FM) notes in the song phrases of seven male N. hainanus were used as input. CNN and Residual CNN models were built to extract and classify the voiceprint features of the seven males, which belong to five populations, and thereby recognize individuals.
Result: Five-fold cross-validation showed that the CNN model reached a recognition accuracy of 91.2%, with a standard deviation of 4.24% and an inference time of 40 ms. The Residual CNN model reached a recognition accuracy of 95.04%, with a standard deviation of 2.97% and an inference time of 120 ms. Compared with the CNN, the Residual CNN had higher recognition accuracy and more stable classification, but it took longer to compute.
Conclusion: The validation results show that the CNN and Residual CNN models reliably classify and recognize individual male N. hainanus from their song spectrograms, so the method can be applied to voiceprint recognition of the Hainan gibbon. Compared with the CNN, the Residual CNN model is more stable and improves recognition accuracy by 3.84 percentage points, to 95.04%. From an application perspective, however, the CNN model has a lower training cost and faster inference than the Residual CNN, and its accuracy and prediction stability can meet application requirements. The CNN-based voiceprint recognition method overcomes the computational and data set limitations of many existing methods and offers a better approach for future voiceprint recognition research on other species.
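
The accuracy and standard deviation figures above come from five-fold cross-validation. As a rough illustration of that evaluation protocol only, the following Python sketch splits labelled samples into five stratified folds and reports the mean and standard deviation of the per-fold accuracies. It is not the authors' code: the function names (`stratified_folds`, `summarize`, `cross_validate`), the `train_and_score` callback, the stratification by individual, and the use of the population standard deviation are all assumptions made for this example, since the abstract does not specify these details.

```python
import random
import statistics


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, spreading each individual's
    samples as evenly as possible across folds (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        # Deal this individual's samples round-robin into the folds.
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def summarize(fold_accuracies):
    """Return (mean, standard deviation) of the per-fold accuracies."""
    mean = statistics.mean(fold_accuracies)
    # Assumption: population standard deviation. The abstract does not
    # say which estimator was used.
    std = statistics.pstdev(fold_accuracies)
    return mean, std


def cross_validate(samples, labels, train_and_score, k=5, seed=0):
    """Run k-fold cross-validation.

    train_and_score(samples, labels, train_idx, test_idx) is a
    caller-supplied function (for example, one that trains a CNN on the
    training spectrograms) returning accuracy on the held-out fold.
    """
    folds = stratified_folds(labels, k, seed)
    accuracies = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        accuracies.append(train_and_score(samples, labels, train_idx, test_idx))
    return summarize(accuracies)
```

In use, `train_and_score` would wrap whichever model is being evaluated (CNN or Residual CNN), so the same loop yields comparable mean and spread figures for both.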

Key words: N. hainanus, Hainan Tropical Rainforest National Park, spectrogram, Convolutional Neural Network, voiceprint recognition
