Stereo Matching Frameworks for Depth-Aware Object Detection: A Comprehensive Review
Authors
Ken Prameswari Caesarella Aryaputri
Universiti Teknikal Malaysia Melaka, Durian Tunggal (Malaysia)
Universiti Teknikal Malaysia Melaka, Durian Tunggal (Malaysia)
Universiti Teknikal Malaysia Melaka, Durian Tunggal (Malaysia)
Universiti Teknikal Malaysia Melaka, Durian Tunggal (Malaysia)
Universiti Malaysia Pahang Al-Sultan Abdullah (Malaysia)
IT Support Department (Malaysia)
Article Information
DOI: 10.47772/IJRISS.2025.91200127
Subject Category: Computer Science and Smart Tourism
Volume/Issue: 9/12 | Page No: 1704-1715
Publication Timeline
Submitted: 2025-12-13
Accepted: 2025-12-20
Published: 2026-01-03
Abstract
Stereo matching is a fundamental technique for estimating depth from stereo image pairs, and it remains essential for object detection tasks that require accurate three-dimensional perception. This review examines classical, semi-global, and deep learning stereo frameworks, emphasizing their operational principles, strengths, and limitations. The study highlights the importance of disparity reliability for real-world applications in autonomous driving, robotics, medical imaging, agriculture, and remote sensing. Key challenges are identified, including texture ambiguity, occlusion, illumination variation, repetitive patterns, and computational burden, all of which influence the performance of stereo-based detection systems. Insights from recent literature show that advances in adaptive aggregation, transformer-based models, temporal fusion, and multi-sensor integration have improved depth stability and detection accuracy across complex environments. This review provides a consolidated understanding of stereo matching developments and outlines opportunities for designing robust, efficient, and application-aware stereo frameworks for next-generation object detection.
Keywords
Stereo Matching; Disparity Estimation; Depth Perception