Multimodal Deep Learning Based Wildlife Intrusion Perception Using YOLOv12 and YAMNet
Authors
Student, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)
Student, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)
Student, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)
Assistant Professor, HOD, Department of Computer Science and Engineering, Rajiv Gandhi University of Knowledge Technologies, Basar (India)
Article Information
DOI: 10.51244/IJRSI.2026.1304000040
Subject Category: Computer Science
Volume/Issue: 13/4 | Page No: 436-445
Publication Timeline
Submitted: 2026-03-20
Accepted: 2026-03-26
Published: 2026-04-27
Abstract
Crop damage caused by wildlife intrusion is a major challenge for farmers near forest boundaries. Traditional monitoring methods are labor-intensive and ineffective in poor visibility. This paper proposes a multimodal wildlife intrusion detection system that combines visual object detection with environmental sound classification.
The system uses the YOLOv12 model for real-time animal detection in surveillance video and YAMNet for identifying animal sounds. By integrating visual and auditory sensing, the proposed framework improves detection reliability in low-light or occluded conditions, where a camera alone may miss an animal. Experimental evaluation demonstrates improved detection accuracy compared to single-modality approaches. The system can be deployed on edge devices such as Raspberry Pi or Jetson Nano, enabling real-time monitoring of agricultural fields.
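The abstract does not state how the visual and audio outputs are combined, so the following is only a minimal sketch of one plausible late-fusion rule, not the authors' method. It assumes each model has already been reduced to a per-animal confidence in [0, 1] (in practice, YAMNet's AudioSet class scores would first need to be mapped onto the detector's animal classes). The function name `fuse_scores`, the noisy-OR combination, and the 0.6 alert threshold are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional, Tuple


def fuse_scores(
    visual: Dict[str, float],
    audio: Dict[str, float],
    threshold: float = 0.6,  # illustrative value, not taken from the paper
) -> Optional[Tuple[str, float]]:
    """Combine per-class confidences from two modalities with a noisy-OR.

    `visual` and `audio` map an animal label to a confidence in [0, 1].
    A label missing from one modality is treated as confidence 0.0 there.
    Returns (label, fused_score) for the strongest class if it reaches
    `threshold`, otherwise None (no alert).
    """
    labels = set(visual) | set(audio)
    if not labels:
        return None

    fused = {}
    for label in labels:
        p_visual = visual.get(label, 0.0)
        p_audio = audio.get(label, 0.0)
        # Noisy-OR: probability that at least one modality is correct,
        # under the (assumed) simplification that they err independently.
        fused[label] = 1.0 - (1.0 - p_visual) * (1.0 - p_audio)

    best = max(fused, key=fused.get)
    score = fused[best]
    return (best, score) if score >= threshold else None


# Hypothetical low-light scene: the camera is unsure, the microphone is unsure,
# but together they cross the alert threshold.
camera_only = fuse_scores({"elephant": 0.4}, {})
microphone_only = fuse_scores({}, {"elephant": 0.5})
both = fuse_scores({"elephant": 0.4}, {"elephant": 0.5})
```

Under these assumptions, neither weak detection triggers an alert on its own, while the fused score for the same class does. This is the kind of behavior the abstract attributes to combining modalities; a weighted average or a learned fusion layer would be equally reasonable alternatives.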
Keywords
Wildlife Intrusion Detection, Deep Learning