ChatGPT in Undergraduate English Language Majors: Benefits and Challenges for Writing and Speaking Proficiency

Authors

Nguyen Song Thao Anh

Student of the High-Quality English Language Program, Cohort 48, School of Foreign Languages, Can Tho University (Vietnam)

Nguyen Thi Kim Ngoc

Student of the High-Quality English Language Program, Cohort 48, School of Foreign Languages, Can Tho University (Vietnam)

Tran Khanh Bang

Student of the High-Quality English Language Program, Cohort 48, School of Foreign Languages, Can Tho University (Vietnam)

Huynh Thi Kim Ngan

Student of the High-Quality English Language Program, Cohort 48, School of Foreign Languages, Can Tho University (Vietnam)

Phan Nguyen Yen Quyen

Student of the High-Quality English Language Program, Cohort 48, School of Foreign Languages, Can Tho University (Vietnam)

Nguyen Thi Phuong Hong

School of Foreign Languages, Can Tho University (Vietnam)

Article Information

DOI: 10.47772/IJRISS.2025.910000364

Subject Category: Social science

Volume/Issue: 9/10 | Page No: 4408-4422

Publication Timeline

Submitted: 2025-10-20

Accepted: 2025-10-27

Published: 2025-11-12

Abstract

Large language models such as ChatGPT are reshaping undergraduate English language education. This systematic review synthesizes benefits and challenges for developing writing and speaking proficiency among English language majors. Following PRISMA 2020/PRISMA-S and SPAR-4-SLR, we searched Scopus, Web of Science, ERIC, EBSCO, and SSRN, screening 708 records and retaining 31 empirical and review papers for analysis. Findings converge on writing gains when ChatGPT is embedded in scaffolded, human-in-the-loop workflows that emphasize pre-writing, drafting, and revision. Effects are strongest for accuracy, coherence, and argument quality, and when teacher or peer moderation, prompt scaffolds, and transparent rubrics are present. For speaking, learners benefit through low-stakes practice, rehearsal, and anxiety reduction; however, robust measurement lags behind, with scarce CEFR-aligned rubrics, limited voice-mode instrumentation, and few validated acoustic indicators. Integrity and equity remain central concerns: text-only AI detectors are brittle and sometimes unfair to non-native writers. Institutions should pivot to process-anchored assessment, combining prompt logs, version histories, and brief viva voce checks, to evidence authorship while preserving learning value. A forward research agenda is proposed, including a minimum reporting toolkit for speaking measurement (CEFR-aligned rubrics, ASR-derived features, and ICC/κ reliability statistics) and multi-site randomized trials.
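
For readers unfamiliar with the κ statistic named in the proposed reporting toolkit (Cohen, 1960; McHugh, 2012), the following Python sketch is purely illustrative and not part of the article itself: it computes Cohen's κ for two hypothetical raters' include/exclude decisions of the kind made during PRISMA screening. The rater labels and example data are assumptions for demonstration only.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items (Cohen, 1960).

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance from each
    rater's marginal label frequencies.
    """
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of marginal proportions.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum((freq_a[lab] / n) * (freq_b[lab] / n) for lab in labels)
    if p_e == 1:
        return 1.0  # degenerate case: both raters always use one label
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for ten records (include/exclude).
a = ["inc", "exc", "inc", "inc", "exc", "exc", "inc", "exc", "inc", "exc"]
b = ["inc", "exc", "inc", "exc", "exc", "exc", "inc", "exc", "inc", "inc"]
print(f"kappa = {cohens_kappa(a, b):.2f}")

With these hypothetical decisions the script prints kappa = 0.60, which McHugh (2012) would class as moderate agreement; observed agreement is 0.80, but half of that is expected by chance given the balanced marginals.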

Keywords

assessment integrity, ChatGPT

References

1. Aromataris, E., Lockwood, C., Porritt, K., Pilla, B., & Jordan, Z. (Eds.). (2024). JBI manual for evidence synthesis. JBI. https://jbi-global-wiki.refined.site/space/MANUAL

2. Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Wiley.

3. Campbell, M., McKenzie, J. E., Sowden, A., Katikireddi, S. V., Brennan, S. E., Ellis, S., Hartmann-Boyce, J., Ryan, R., Shepperd, S., Thomas, J., & Thomson, H. (2020). Synthesis without meta-analysis (SWiM) in systematic reviews: Reporting guideline. BMJ, 368, l6890. https://doi.org/10.1136/bmj.l6890

4. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104

5. Cummings, R. E., Monroe, S. M., & Watkins, M. (2024). Generative AI in first-year writing: An early analysis of affordances, limitations, and a framework for the future. Computers and Composition, 71, 102827. https://doi.org/10.1016/j.compcom.2024.102827

6. Deng, R., Jiang, M., Yu, X., Lu, Y., & Liu, S. (2025). Does ChatGPT enhance student learning? A systematic review and meta-analysis of experimental studies. Computers & Education, 227, 105224. https://doi.org/10.1016/j.compedu.2024.105224

7. Fang, S., & Han, Z. (2025). On the nascency of ChatGPT in foreign language teaching and learning. Annual Review of Applied Linguistics, 45, 148–178. https://doi.org/10.1017/S026719052510010X

8. Han, Z. (2024). ChatGPT in and for second language acquisition: A call for systematic research. Studies in Second Language Acquisition, 46(2), 301–306. https://doi.org/10.1017/S0272263124000111

9. Jiang, Y., Hao, J., Fauss, M., & Li, C. (2024). Detecting ChatGPT-generated essays in a large-scale writing assessment: Is there a bias against non-native English speakers? Computers & Education, 217, 105070. https://doi.org/10.1016/j.compedu.2024.105070

10. Jiang, Y., Zhang, M., Hao, J., Deane, P., & Li, C. (2024). Using keystroke behavior patterns to detect nonauthentic texts in writing assessments: Evaluating the fairness of predictive models. Journal of Educational Measurement, 61(4), 571–594. https://doi.org/10.1111/jedm.12431

11. Li, J., Huang, J., Wu, W., & Whipple, P. B. (2024). Evaluating the role of ChatGPT in enhancing EFL writing assessments in classroom settings: A preliminary investigation. Humanities & Social Sciences Communications, 11, 1268. https://doi.org/10.1057/s41599-024-03755-2

12. Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., & Zou, J. (2023). GPT detectors are biased against non-native English writers. Patterns, 4(7), 100779. https://doi.org/10.1016/j.patter.2023.100779

13. Lo, C. K., Yu, P. L. H., Xu, S., Ng, D. T. K., & Jong, M. S. Y. (2024). Exploring the application of ChatGPT in ESL/EFL education and related research issues: A systematic review of empirical studies. Smart Learning Environments, 11, 50. https://doi.org/10.1186/s40561-024-00342-5

14. Lu, Q., Yao, Y., Xiao, L., & Yuan, M. (2024). Can ChatGPT effectively complement teacher assessment of undergraduate students’ academic writing? Assessment & Evaluation in Higher Education. Advance online publication. https://doi.org/10.1080/02602938.2024.2301722

15. McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276–282. https://www.biochemia-medica.com/en/journal/22/3/10.11613/BM.2012.031/fullArticle

16. Ouzzani, M., Hammady, H., Fedorowicz, Z., & Elmagarmid, A. (2016). Rayyan—A web and mobile app for systematic reviews. Systematic Reviews, 5, 210. https://doi.org/10.1186/s13643-016-0384-4

17. Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., … & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71. https://doi.org/10.1136/bmj.n71

18. Paul, J., Lim, W. M., & O’Cass, A. (2021). Scientific procedures and rationales for systematic literature reviews (SPAR-4-SLR). International Journal of Consumer Studies, 45(6), 1513–1527. https://doi.org/10.1111/ijcs.12695

19. Paul, J., & Rosado-Serrano, A. (2019). Gradual internationalization vs born-global/international new venture models: A review and research agenda. International Marketing Review, 36(6), 830–858. https://doi.org/10.1108/IMR-10-2018-0280

20. Rethlefsen, M. L., Kirtley, S., Waffenschmidt, S., Ayala, A. P., Moher, D., Page, M. J., & Koffel, J. B. (2021). PRISMA-S: An extension to the PRISMA statement for reporting literature searches in systematic reviews. Systematic Reviews, 10, 39. https://doi.org/10.1186/s13643-020-01542-z

21. SCImago. (n.d.). SJR—SCImago Journal & Country Rank [Portal]. Retrieved October 12, 2025, from https://www.scimagojr.com

22. Sterne, J. A. C., Hernán, M. A., Reeves, B. C., Savović, J., Berkman, N. D., Viswanathan, M., … & Higgins, J. P. T. (2016). ROBINS-I: A tool for assessing risk of bias in non-randomised studies of interventions. BMJ, 355, i4919. https://doi.org/10.1136/bmj.i4919

23. Sterne, J. A. C., Savović, J., Page, M. J., Elbers, R. G., Blencowe, N. S., Boutron, I., … & Higgins, J. P. T. (2019). RoB 2: A revised tool for assessing risk of bias in randomised trials. BMJ, 366, l4898. https://doi.org/10.1136/bmj.l4898

24. Tarchi, C., Zappoli, A., Casado Ledesma, L., & Wennås Brante, E. (2025). The use of ChatGPT in source-based writing tasks. International Journal of Artificial Intelligence in Education, 35, 858–878. https://doi.org/10.1007/s40593-024-00413-1

25. Teng, M. F. (2024). “ChatGPT is the companion, not enemies”: EFL learners’ perceptions and experiences in using ChatGPT for feedback in writing. Computers & Education: Artificial Intelligence, 7, 100270. https://doi.org/10.1016/j.caeai.2024.100270

26. Tram, N. H. M., Nguyen, T. T., & Tran, C. D. (2024). ChatGPT as a tool for self-learning English among EFL learners: A multi-methods study. System, 127, 103528. https://doi.org/10.1016/j.system.2024.103528

27. Tsai, C. Y., Lin, Y. T., & Brown, I. K. (2024). Impacts of ChatGPT-assisted writing for EFL English majors: Feasibility and challenges. Education and Information Technologies, 29(17), 22427–22445. https://doi.org/10.1007/s10639-024-12722-y

28. Uchida, S. (2024). Evaluating the accuracy of ChatGPT in assessing writing and speaking: A verification study using ICNALE GRA. Learner Corpus Studies in Asia and the World, 6, 1–12. https://doi.org/10.24546/0100487710

29. Üstünbaş, Ü. (2024). Hey, GPT, can we have a chat? A case study on EFL learners’ AI speaking practice. International Journal of Modern Education Studies, 8(1), 91–107. https://doi.org/10.51383/ijonmes.2024.318

30. Yang, L., & Li, R. (2024). ChatGPT for L2 learning: Current status and implications. System, 124, 103351. https://doi.org/10.1016/j.system.2024.103351

31. Zare, J., Al-Issa, A., & Ranjbaran Madiseh, F. (2025). Interacting with ChatGPT in essay writing: A study of L2 learners’ task motivation. ReCALL, 37(3), 385–402. https://doi.org/10.1017/S0958344025000035
