Extractive Text Summarization for Malayalam News Articles
Authors
Department of Computer Science and Engineering, College of Engineering Trivandrum, Trivandrum (India)
Article Information
DOI: 10.51244/IJRSI.2026.1303000223
Subject Category: Computer Science
Volume/Issue: 13/3 | Page No: 2591-2600
Publication Timeline
Submitted: 2026-03-23
Accepted: 2026-03-28
Published: 2026-04-18
Abstract
The rapid growth of textual data across digital environments has made automatic text processing an essential component of Natural Language Processing (NLP). Extractive approaches evaluate, identify, and select the most relevant sentences of a document, and are considered efficient, interpretable, and systematic alternatives to abstractive methods. Previous methods, which rely on statistical or rule-based techniques, have struggled to capture meaningful semantic relationships and contextual relevance. To address these limitations, this study proposes a headline-guided extractive model for Malayalam news articles that combines multilingual transformer embeddings with linguistic cues to improve relevance and information retention. The system selects sentences based on semantic similarity and syntactic importance so that the generated summaries are coherent and concise. It also reduces redundancy, which enhances its applicability to real-world tasks.
Keywords
Text summarization, Malayalam news, headline-guided
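The headline-guided selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes three assumptions: the `embed` function is a toy bag-of-words stand-in for a multilingual transformer encoder, redundancy is reduced with Maximal Marginal Relevance (MMR), which the abstract does not name, and the linguistic (syntactic) cues are left out entirely. The function names, the `lam` trade-off parameter, and the English example sentences are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in (assumption) for a multilingual transformer encoder:
    a bag-of-words count vector over whitespace-separated tokens."""
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize(headline, sentences, k=2, lam=0.5, embed_fn=embed):
    """Pick k sentences that are similar to the headline but not to each other.

    lam weights headline relevance against redundancy (MMR, an assumption):
    lam=1.0 ranks by relevance only, smaller values penalise overlap more.
    """
    head_vec = embed_fn(headline)
    vecs = [embed_fn(s) for s in sentences]
    relevance = [cosine(head_vec, v) for v in vecs]

    chosen = []
    candidates = list(range(len(sentences)))
    while candidates and len(chosen) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any already chosen sentence.
            redundancy = max((cosine(vecs[i], vecs[j]) for j in chosen), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=mmr)
        chosen.append(best)
        candidates.remove(best)

    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


# Hypothetical example data (English for readability).
headline = "Heavy rain floods Kochi city"
sentences = [
    "Heavy rain flooded several roads in Kochi city on Monday.",
    "Heavy rain flooded many roads in Kochi city on Monday morning.",
    "Schools in the district will remain closed on Tuesday.",
    "The city corporation opened relief camps for flood victims in Kochi.",
]
summary = summarize(headline, sentences, k=2, lam=0.5)
print("\n".join(summary))
```

With `lam=0.5` the near-duplicate second sentence is skipped in favour of the relief-camp sentence, whereas `lam=1.0` would select both flood sentences. In a real system, `embed_fn` would presumably be replaced by a call to a multilingual sentence encoder that returns dense vectors, with `cosine` adapted accordingly.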