
CDPNet: conformer-based dual path joint modeling network for bird sound recognition

2024-03-22


Author(s): Guo, HM (Guo, Huimin); Jian, HF (Jian, Haifang); Wang, YY (Wang, Yiyu); Wang, HC (Wang, Hongchang); Cheng, QH (Cheng, Qinghua); Zheng, SK (Zheng, Shuaikang); Li, YH (Li, Yuehao)

Source: APPLIED INTELLIGENCE  DOI: 10.1007/s10489-024-05362-9  Early Access Date: MAR 2024

Abstract: Bird species monitoring is important for preserving biological diversity because it provides fundamental information for biodiversity assessment and protection. Automatic acoustic recognition is considered an essential technology for automated monitoring of bird species. Current deep learning-based bird sound recognition methods do not fully model long-term correlations along both the time and frequency axes of the spectrogram, nor have they thoroughly studied how features at different scales affect the final recognition result. To address these problems, we propose a Conformer-based dual path joint modeling network (CDPNet) for bird sound recognition. To the best of our knowledge, this is the first attempt to adopt Conformer in the bird sound recognition task. Specifically, CDPNet consists mainly of a dual-path time-frequency joint modeling module (DPTFM) and a multi-scale feature fusion module (MSFFM). The former simultaneously captures local time-frequency features, long-term time dependence, and long-term frequency dependence to better model bird sound characteristics. The latter improves recognition accuracy by fusing features at different scales. The proposed algorithm is implemented on an edge computing platform, the NVIDIA Jetson Nano, to build a real-time bird sound recognition and monitoring system. Ablation experiments verify the benefit of the DPTFM and the MSFFM. In training and testing on the Semibirdaudio dataset, which contains 27,155 sound clips, and on the public Birdsdata dataset, CDPNet outperforms other state-of-the-art models in terms of F1-score, precision, recall, and accuracy.
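
The dual-path idea in the abstract, modeling dependencies along the time axis and along the frequency axis of the same spectrogram features, can be illustrated with a minimal NumPy sketch. This is not the authors' DPTFM: a parameter-free self-attention function stands in for the Conformer blocks, and the function names, the (time, freq, channels) layout, and the residual sum that combines the two paths are all assumptions made for illustration.

```python
import numpy as np


def self_attention(seq):
    """Parameter-free scaled dot-product self-attention.

    seq has shape (..., length, channels); attention runs over `length`.
    Hypothetical stand-in for a Conformer block, not the paper's layer.
    """
    d = seq.shape[-1]
    scores = seq @ np.swapaxes(seq, -1, -2) / np.sqrt(d)  # (..., L, L)
    scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
    return weights @ seq                                  # (..., L, channels)


def dual_path(features):
    """Apply sequence modeling along time and along frequency.

    features: array of shape (time, freq, channels). Returns the same shape.
    Combining the paths by a residual sum is an assumption of this sketch.
    """
    # Time path: each frequency bin becomes a sequence over time frames.
    by_freq = np.swapaxes(features, 0, 1)                  # (F, T, C)
    time_out = np.swapaxes(self_attention(by_freq), 0, 1)  # back to (T, F, C)
    # Frequency path: each time frame is a sequence over frequency bins.
    freq_out = self_attention(features)                    # (T, F, C)
    return features + time_out + freq_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec_features = rng.standard_normal((6, 4, 3))  # toy (T, F, C) input
    print(dual_path(spec_features).shape)
```

The only point of the sketch is the axis handling: the same sequence model is reused on two views of the tensor, one where time is the sequence axis and one where frequency is.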

Accession Number: WOS:001171681100002

ISSN: 0924-669X

eISSN: 1573-7497



