SEA-TALK: An AI-Powered Voice Translator and Southeast Asian Dialects Recognition

Authors

Sales, Gerome P.

College of Computing Studies, Universidad De Manila (Philippines)

Rapadas, Carl Angelo L.

College of Computing Studies, Universidad De Manila (Philippines)

Salazar, Dexter Josh S.

College of Computing Studies, Universidad De Manila (Philippines)

Segundo, Tristan

College of Computing Studies, Universidad De Manila (Philippines)

Fernandez, Ronald

College of Computing Studies, Universidad De Manila (Philippines)

Article Information

DOI: 10.51584/IJRIAS.2025.100900045

Subject Category: Artificial Intelligence

Volume/Issue: 10/9 | Page No: 448-459

Publication Timeline

Submitted: 2025-10-01

Accepted: 2025-10-07

Published: 2025-10-12

Abstract

This study develops and evaluates SEA-Talk, a mobile AI-powered voice-to-voice translator designed to reduce language barriers across Southeast Asia. Communication in the region often breaks down because of differing dialects, accents, and degrees of formality. SEA-Talk aims to support real-time multilingual communication for Filipino migrant workers, tourists, and students by combining the necessary tools: audio digitization, speech recognition, machine translation, and speech synthesis. The system was developed using an agile process, which allowed the design to be refined through user feedback. Built on a layered architecture, it comprises a built-in translation engine, speech-to-text and text-to-speech modules, and offline capability through downloadable language packages. Additional features, including context-aware translation, formality detection, a correction facility, and a feedback loop for user suggestions, help the system adapt to the linguistic and cultural diversity of Southeast Asia.
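The layered flow described above (speech-to-text, then translation, then text-to-speech, with a downloadable language pack as the offline path) might be sketched as follows. This is only an illustrative sketch: every class, function, and parameter name is hypothetical, the three engines are trivial stand-ins rather than real models, and the offline pack is reduced to a small phrase dictionary. None of it is taken from the SEA-Talk implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

LangPair = Tuple[str, str]  # (source language code, target language code)


@dataclass
class TranslationResult:
    source_text: str
    translated_text: str
    audio: bytes
    used_offline_pack: bool


@dataclass
class VoicePipeline:
    """Hypothetical three-layer pipeline: STT -> translation -> TTS."""
    speech_to_text: Callable[[bytes, str], str]
    translate_online: Callable[[str, str, str], str]
    text_to_speech: Callable[[str, str], bytes]
    # Assumption: an offline pack is modeled as a phrase dictionary per language pair.
    offline_packs: Dict[LangPair, Dict[str, str]] = field(default_factory=dict)

    def run(self, audio: bytes, src: str, dst: str, online: bool = True) -> TranslationResult:
        text = self.speech_to_text(audio, src)
        if online:
            translated, offline = self.translate_online(text, src, dst), False
        else:
            pack = self.offline_packs.get((src, dst))
            if pack is None:
                raise LookupError(f"no offline pack installed for {src}->{dst}")
            # Unknown phrases pass through unchanged in this toy pack.
            translated, offline = pack.get(text.lower(), text), True
        speech = self.text_to_speech(translated, dst)
        return TranslationResult(text, translated, speech, offline)


# Stand-in engines so the sketch runs without any speech or translation models.
def fake_stt(audio: bytes, lang: str) -> str:
    return audio.decode("utf-8").strip()


def fake_online_mt(text: str, src: str, dst: str) -> str:
    return f"[{src}->{dst}] {text}"


def fake_tts(text: str, lang: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(
    fake_stt,
    fake_online_mt,
    fake_tts,
    offline_packs={("tl", "en"): {"salamat": "thank you"}},
)
result = pipeline.run(b"Salamat", "tl", "en", online=False)
print(result.translated_text)
```

Keeping each layer behind a plain callable means a recognizer, translation engine, or synthesizer could be swapped without touching the others, which is one common motivation for a layered design.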
A structured survey was used to evaluate the system. It comprised 100 respondents, randomly selected from three different establishments, who assessed the system on seven quality attributes: functionality, performance, usability, reliability, security, maintainability, and compatibility. Ratings were high across the board, with mean scores ranging from 4.07 to 4.22 (Agree). The highest scores went to functionality and compatibility, indicating that the system delivers its essential features while remaining adaptable across devices. Usability, performance, security, and maintainability were also rated high, while reliability and sustainability need further improvement. In conclusion, SEA-Talk achieves its intended goal of providing a dependable, comprehensive translation platform for the Southeast Asian context. The ratings support the tool's value for cross-cultural and cross-linguistic communication. Recommendations include improving the offline facility, expanding language coverage, strengthening security, and sustaining development for wider acceptance.
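As a rough illustration of how per-attribute mean scores like those reported above can be computed and mapped to a verbal rating, the sketch below assumes a 5-point Likert scale and a hypothetical set of interpretation bands in which 3.50 to 4.49 reads as "Agree". The abstract does not state the scale or bands actually used, and the sample ratings are invented purely to exercise the code; they are not survey data.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Assumed lower bounds for each verbal rating on a 5-point scale (hypothetical bands).
BANDS: List[Tuple[float, str]] = [
    (4.50, "Strongly Agree"),
    (3.50, "Agree"),
    (2.50, "Neutral"),
    (1.50, "Disagree"),
    (1.00, "Strongly Disagree"),
]


def interpret(score: float) -> str:
    """Map a mean score in [1, 5] to its verbal rating under the assumed bands."""
    if not 1.0 <= score <= 5.0:
        raise ValueError("score must lie between 1 and 5")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: every valid score matches a band")


def summarize(responses: Dict[str, List[int]]) -> Dict[str, Tuple[float, str]]:
    """Return (mean rounded to 2 places, verbal rating) for each quality attribute."""
    summary = {}
    for attribute, ratings in responses.items():
        average = mean(ratings)
        summary[attribute] = (round(average, 2), interpret(average))
    return summary


# Invented toy ratings, NOT the study's responses.
toy_responses = {
    "functionality": [5, 4, 4, 4, 4],
    "reliability": [4, 4, 4, 5, 3],
}
summary = summarize(toy_responses)
print(summary)

# Both ends of the reported range fall in the "Agree" band under these assumed bands.
print(interpret(4.07), interpret(4.22))
```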

Keywords

SEA-Talk, speech recognition, machine translation, real-time translation, Southeast Asian languages, mobile application
