Maximizing Recognition Reliability: TrOCR Outperforms Paddle OCR in Challenging Container Automation Environments

Authors

Jia Qing Cheok

Centre for Telecommunication Research and Innovation (CeTRI), Fakulti Teknologi dan Kejuruteraan Elektronik dan Komputer, Universiti Teknikal Malaysia Melaka (Malaysia)

Kim Chuan Lim

Centre for Telecommunication Research and Innovation (CeTRI), Fakulti Teknologi dan Kejuruteraan Elektronik dan Komputer, Universiti Teknikal Malaysia Melaka (Malaysia)

Chong En Si

Faculty of Management, Multimedia University, Cyberjaya, Selangor (Malaysia)

Article Information

DOI: 10.47772/IJRISS.2025.91100629

Subject Category: Computer Science

Volume/Issue: 9/11 | Page No: 8062-8070

Publication Timeline

Submitted: 2025-12-11

Accepted: 2025-12-18

Published: 2025-12-29

Abstract

Container automation systems have become increasingly important in response to the rapid growth of global trade and the need for efficient logistics. Previous research lacked a systematic comparison of advanced OCR models (PaddleOCR and TrOCR) integrated with reliable text detection (YOLO) to determine the optimal balance between speed and accuracy under real-world port conditions. This study developed an automated pipeline that combines the YOLOv10 object detector for text region localization with fine-tuned PaddleOCR and TrOCR models for recognition. The models were trained on 8,899 augmented images and evaluated on a test set of 173 real-world images from an actual port terminal gate deployment. YOLOv10 achieved strong detection performance, with a mean Average Precision (mAP) of 94.7% and an average Intersection over Union (IoU) of 0.87. TrOCR consistently demonstrated higher recognition accuracy, achieving 98.73% exact match for ISO codes and 71.17% for container numbers, compared with 97.42% and 70.14% for PaddleOCR. However, PaddleOCR was significantly faster (up to 18.35 FPS for ISO codes) than TrOCR (7.93 FPS). The integrated YOLOv10 with TrOCR pipeline is therefore recommended for reliable, high-precision text recognition, advancing automated port logistics through a scalable, AI-powered solution that prioritizes accuracy in challenging real-world scenarios.
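
For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how Intersection over Union and exact-match accuracy are commonly computed. It is not the authors' evaluation code. The function names `iou` and `exact_match_rate`, the corner-coordinate box format, and the use of strict string equality (no whitespace or case normalization) are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def exact_match_rate(preds: Sequence[str], truths: Sequence[str]) -> float:
    """Fraction of predictions identical to the ground-truth string."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not truths:
        return 0.0
    # Assumption: a prediction counts only if every character matches.
    hits = sum(p == t for p, t in zip(preds, truths))
    return hits / len(truths)


if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(exact_match_rate(["22G1", "45G1"], ["22G1", "45R1"]))
```

Under exact match, a single wrong character makes the whole string count as an error. This is one plausible reason why longer strings such as container numbers tend to score lower than short ISO codes, although the abstract itself does not state the cause.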

Keywords

Deep Learning, Container Text Detection System, Container Text Recognition.
