SEA-TALK: An AI-Powered Voice Translator and Southeast Asian Dialects Recognition
Authors
College of Computing Studies, Universidad De Manila (Philippines)
Article Information
DOI: 10.51584/IJRIAS.2025.100900045
Subject Category: Artificial Intelligence
Volume/Issue: 10/9 | Page No: 448-459
Publication Timeline
Submitted: 2025-10-01
Accepted: 2025-10-07
Published: 2025-10-12
Abstract
This study develops and evaluates SEA-Talk, a mobile AI-powered voice-to-voice translator designed to reduce language barriers across Southeast Asia, where differences in dialect, accent, and degree of formality often hinder communication. SEA-Talk aims to enable real-time multilingual communication for Filipino migrant workers, tourists, and students alike by combining the necessary tools: audio digitization, speech recognition, machine translation, and speech synthesis. The system was developed under an agile methodology, allowing the design to be refined through user feedback. Built on a layered architecture, it comprises an in-built translation engine, text-to-speech and speech-to-text modules, and offline capability via downloadable language packs. Additional features, including context-aware translation, formality detection, a correction facility, and a feedback loop for user suggestions, ensure adaptability to the linguistic and cultural diversity of Southeast Asia.
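The layered architecture described above can be illustrated with a minimal sketch. This is a hypothetical illustration, not SEA-Talk's actual implementation: the three stages are stand-in stubs (a real app would call actual STT, MT, and TTS engines), and the `LanguagePack` phrase table is an assumed stand-in for the downloadable offline language packs.

```python
# Hypothetical sketch of a layered voice-to-voice pipeline:
# speech-to-text -> machine translation -> text-to-speech.
from dataclasses import dataclass

@dataclass
class LanguagePack:
    """A downloadable offline pack: a tiny phrase table per language pair."""
    source: str
    target: str
    phrases: dict

def speech_to_text(audio: bytes) -> str:
    # Stub: a real engine would decode a waveform; here the "audio"
    # is simply a UTF-8 encoded utterance.
    return audio.decode("utf-8")

def translate(text: str, pack: LanguagePack) -> str:
    # Offline lookup that falls back to the original text when the
    # phrase is missing from the pack.
    return pack.phrases.get(text.lower(), text)

def text_to_speech(text: str) -> bytes:
    # Stub: a real TTS module would synthesize audio.
    return text.encode("utf-8")

def voice_to_voice(audio: bytes, pack: LanguagePack) -> bytes:
    """The three layered stages chained end to end."""
    return text_to_speech(translate(speech_to_text(audio), pack))

pack = LanguagePack("fil", "en", {"kumusta ka": "how are you"})
out = voice_to_voice("kumusta ka".encode("utf-8"), pack)
print(out.decode("utf-8"))  # how are you
```

The chained-stage design mirrors the cascade approach common in speech-to-speech translation systems, where each module can be swapped or updated independently.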
A structured survey evaluated the system with 100 respondents randomly selected from three establishments, who rated it on seven quality attributes: functionality, performance, usability, reliability, security, maintainability, and compatibility. Results showed high marks across the board, with mean scores ranging from 4.07 to 4.22 (Agree); the highest scores went to functionality and compatibility, indicating that the system delivers its essential features while remaining adaptable across devices. Usability, performance, security, and maintainability were also rated highly, while reliability would benefit from further improvement. SEA-Talk therefore achieves its intended goal of providing a dependable, comprehensive translation platform for the Southeast Asian context, and the ratings underscore the tool's value in cross-cultural and cross-linguistic communication. Recommendations include improving offline functionality, expanding language coverage, strengthening security, and sustaining development for wider adoption.
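The reported figures are per-attribute means over all respondents' 5-point Likert ratings. A minimal sketch of that aggregation, with an assumed toy sample in place of the actual survey data:

```python
# Hypothetical sketch of the survey aggregation: each respondent rates
# seven quality attributes on a 5-point Likert scale, and the reported
# figure per attribute is the mean across respondents.
from statistics import mean

ATTRIBUTES = ["functionality", "performance", "usability", "reliability",
              "security", "maintainability", "compatibility"]

def attribute_means(responses):
    """responses: list of dicts mapping attribute -> rating in 1..5."""
    return {a: round(mean(r[a] for r in responses), 2) for a in ATTRIBUTES}

# Toy sample of three respondents (the study used 100).
sample = [
    {a: 4 for a in ATTRIBUTES},
    {a: 5 for a in ATTRIBUTES},
    {a: 4 for a in ATTRIBUTES},
]
print(attribute_means(sample))  # each attribute averages 4.33
```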
Keywords
SEA-Talk, speech recognition, machine translation, real-time translation, Southeast Asian languages, mobile application