Maximizing Recognition Reliability: TrOCR Outperforms Paddle OCR in Challenging Container Automation Environments
Authors
Centre for Telecommunication Research and Innovation (CeTRI), Fakulti Teknologi dan Kejuruteraan Elektronik dan Komputer, Universiti Teknikal Malaysia Melaka (Malaysia)
Centre for Telecommunication Research and Innovation (CeTRI), Fakulti Teknologi dan Kejuruteraan Elektronik dan Komputer, Universiti Teknikal Malaysia Melaka (Malaysia)
Faculty of Management, Multimedia University, Cyberjaya, Selangor (Malaysia)
Article Information
DOI: 10.47772/IJRISS.2025.91100629
Subject Category: Computer Science
Volume/Issue: 9/11 | Page No: 8062-8070
Publication Timeline
Submitted: 2025-12-11
Accepted: 2025-12-18
Published: 2025-12-29
Abstract
Container automation systems have become increasingly important as global trade grows and logistics operations demand greater efficiency. Prior research has lacked a systematic comparison of advanced OCR models (PaddleOCR and TrOCR) integrated with a reliable text detector (YOLO) to determine the best balance of speed and accuracy under real-world port conditions. This study developed an automated pipeline that combines the YOLOv10 object detector for text region localization with fine-tuned PaddleOCR and TrOCR models for recognition. The models were trained on 8,899 augmented images and evaluated on a test set of 173 real-world images from an operational port terminal gate deployment. YOLOv10 achieved strong detection performance, with a mean Average Precision (mAP) of 94.7% and an average Intersection over Union (IoU) of 0.87. TrOCR delivered higher recognition accuracy on both text types, achieving exact-match rates of 98.73% for ISO codes and 71.17% for container numbers, compared with 97.42% and 70.14% for PaddleOCR. However, PaddleOCR was considerably faster, reaching up to 18.35 FPS for ISO codes versus 7.93 FPS for TrOCR. The integrated YOLOv10 and TrOCR pipeline is therefore recommended where recognition reliability matters more than throughput, offering a scalable, AI-powered approach to automated port logistics that prioritizes accuracy in challenging real-world scenarios.
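The two headline metrics in the abstract, IoU for detection and exact match for recognition, can be illustrated with a minimal Python sketch. The helper names `iou` and `exact_match_rate`, the `(x_min, y_min, x_max, y_max)` box layout, and the whitespace stripping before comparison are assumptions made for illustration; they are not taken from the study's evaluation code.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def exact_match_rate(preds: Sequence[str], truths: Sequence[str]) -> float:
    """Fraction of predictions identical to the ground-truth string."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not truths:
        return 0.0
    # Hypothetical normalization: ignore leading and trailing whitespace only.
    hits = sum(p.strip() == t.strip() for p, t in zip(preds, truths))
    return hits / len(truths)


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(exact_match_rate(["22G1", "45G1"], ["22G1", "45R1"]))
```

Exact match is a strict criterion: a single wrong character makes the whole string count as an error, so longer strings such as container numbers have more opportunities to fail than short ISO codes.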
Keywords
Deep Learning, Container Text Detection System, Container Text Recognition.