Multimodal Deep Learning Based Wildlife Intrusion Perception Using YOLOv12 and YAMNet

Authors

Vamshi Krishna Velpula

Student, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)

Arun Kumar Ankeshwarapu

Student, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)

Madhu Kumar Bolle

Student, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)

Dr. B. Venkat Raman

Assistant Professor, HOD, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)

Article Information

DOI: 10.51244/IJRSI.2026.1304000040

Subject Category: Computer Science

Volume/Issue: 13/4 | Page No: 436-445

Publication Timeline

Submitted: 2026-03-20

Accepted: 2026-03-26

Published: 2026-04-27

Abstract

Crop damage caused by wildlife intrusion is a major challenge for farmers near forest boundaries. Traditional monitoring methods are labor-intensive and ineffective under poor visibility. This paper proposes a multimodal wildlife intrusion detection system that combines visual object detection with environmental sound classification. The system uses the YOLOv12 model for real-time animal detection in surveillance video and YAMNet for identifying animal sounds. By integrating visual and auditory sensing, the proposed framework improves detection reliability in low-light or occluded conditions, where a camera alone may miss an animal. Experimental evaluation demonstrates improved detection accuracy compared with single-modality approaches. The system can be deployed on edge devices such as Raspberry Pi or Jetson Nano, enabling real-time monitoring of agricultural fields.
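
As a rough illustration of how visual and auditory evidence might be integrated, the following Python sketch shows one possible late-fusion rule. The abstract does not state how the two modalities are combined, so everything here is an assumption rather than the authors' method: the noisy-OR rule, the hypothetical `Observation` and `FusedResult` types, the hypothetical `fuse` function, the 0.5 alert threshold, and the premise that the detector and the sound classifier outputs have already been mapped to a shared label vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One modality's top prediction (hypothetical container, not from the paper)."""
    label: str         # animal class, assumed already mapped to a shared vocabulary
    confidence: float  # model confidence in [0, 1]


@dataclass
class FusedResult:
    label: Optional[str]
    score: float
    alert: bool


def fuse(visual: Optional[Observation],
         audio: Optional[Observation],
         threshold: float = 0.5) -> FusedResult:
    """Combine per-modality predictions with a noisy-OR rule (an assumed choice).

    - Neither modality fired: no alert.
    - Both agree on the label: score = 1 - (1 - v) * (1 - a).
    - Only one fired, or the labels disagree: fall back to the more confident one.
    """
    present = [o for o in (visual, audio) if o is not None]
    if not present:
        return FusedResult(None, 0.0, False)

    best = max(present, key=lambda o: o.confidence)
    if len(present) == 2 and visual.label == audio.label:
        # Agreement: each modality independently "explains away" part of the doubt.
        score = 1.0 - (1.0 - visual.confidence) * (1.0 - audio.confidence)
    else:
        score = best.confidence
    return FusedResult(best.label, score, score >= threshold)


if __name__ == "__main__":
    # Weak visual evidence (e.g. a dark frame) plus weak audio evidence of the same animal.
    print(fuse(Observation("elephant", 0.3), Observation("elephant", 0.4)))
    # Camera sees nothing; audio alone carries the decision.
    print(fuse(None, Observation("wild boar", 0.7)))
```

Under these assumptions, two individually weak but agreeing predictions (0.3 visual, 0.4 audio) fuse to a score of about 0.58 and cross the threshold, while either one alone would not. This is only one way to realize the reliability gain the abstract describes for low-light or occluded scenes.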

Keywords

Wildlife Intrusion Detection, Deep Learning
