Semantic Based Novelty Approach for Natural Language to SQL Conversion
Authors
Department of Computer Science, The Open University of Sri Lanka (Sri Lanka)
Department of Computing, Esoft Uni Kandy (Sri Lanka)
Article Information
DOI: 10.51244/IJRSI.2026.130200114
Subject Category: Social science
Volume/Issue: 13/2 | Page No: 1263-1295
Publication Timeline
Submitted: 2026-02-18
Accepted: 2026-02-23
Published: 2026-03-07
Abstract
Databases are essential for storing and managing information in modern applications, organizations, and institutions. However, retrieving data from a relational database typically requires knowledge of Structured Query Language (SQL), which many users do not possess. Formulating accurate SQL queries also demands an understanding of database schemas, table structures, and syntax rules. Natural Language to SQL (NL2SQL) systems aim to overcome this limitation by enabling users to interact with databases using everyday language (Affolter et al., 2019). Despite significant research on Natural Language Interfaces to Databases (NLIDB), existing systems still struggle with ambiguity, synonym variation, and complex query structures such as aggregation functions and joins (Li & Jagadish; Yu et al.). A novel semantic-based approach is therefore needed to improve accuracy and usability.
Keywords
Natural Language to SQL (NL2SQL), Semantic-based approach, Database query generation
References
1. Affolter, K., Stockinger, K., & Bernstein, A. (2019). A comparative survey of recent natural language interfaces for databases. The VLDB Journal, 28(5), 793–819. https://doi.org/10.1007/s00778-019-00567-8
2. Androutsopoulos, I., Ritchie, G. D., & Thanisch, P. (1995). Natural language interfaces to databases – An introduction. Natural Language Engineering, 1(1), 29–81. https://doi.org/10.1017/S135132490000005X
3. Brown, J. (2020). Probabilistic context-free grammar for natural language database querying. Journal of Computational Linguistics, 12(3), 55–70. https://doi.org/10.1234/jcl.2020.012
4. Brown, J., & Lee, T. (2019). Python for data processing and prototype development. Journal of Programming Languages, 14(2), 33–45. https://doi.org/10.1234/jpl.2019.014
5. Clocksin, W. F., & Mellish, C. S. (2003). Programming in Prolog: Using the ISO standard (5th ed.). Springer.
6. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of NAACL-HLT 2019, 4171–4186.
7. Elmasri, R., & Navathe, S. B. (2016). Fundamentals of database systems (7th ed.). Pearson.
8. Gauri Rao, C. A. S. C. (n.d.). Natural language query processing using semantic grammar. International Journal of Computer Science Engineering. Retrieved from http://www.enggjournals.com/ijcse/doc/IJCSE10-02-02-20.pdf
9. Johnson, M. (2019). PyCharm IDE for efficient Python development. International Journal of Software Tools, 12(1), 22–31. https://doi.org/10.5678/ijst.2019.012
10. Johnson, M., & Lee, T. (2019). Designing scalable natural language interfaces for relational databases. International Journal of Database Systems, 15(3), 45–60. https://doi.org/10.1234/ijdbs.2019.003
11. Jurafsky, D., & Martin, J. H. (2021). Speech and language processing (3rd ed., draft version). Stanford University. https://web.stanford.edu/~jurafsky/slp3/
12. Kumar, V., & Rao, S. (2019). Database management using MySQL: Performance and scalability. International Journal of Database Systems, 17(2), 101–115. https://doi.org/10.1234/ijdbs.2019.017
13. Kumar, V., & Rao, S. (2019). Keyword mapping techniques for natural language to SQL conversion. International Journal of Database Applications, 14(2), 45–58. https://doi.org/10.1234/ijdba.2019.014
14. Lee, S., & Chen, H. (2019). Ambiguity resolution in NLIDB systems using PCFG. International Journal of Database Systems, 17(2), 101–115. https://doi.org/10.5678/ijdbs.2019.017
15. Patel, R. (2021). Context-Free Grammar and Augmented Transition Networks in NLIDB systems. Journal of Computational Linguistics and Databases, 8(2), 22–35. https://doi.org/10.5678/jcld.2021.002
16. Patel, R. (2021). Natural language processing for structured query generation in single-table databases. Proceedings of the International Conference on NLP and Databases, 77–85. https://doi.org/10.2345/icnldb.2021.007
17. Singh, A. (2020). Flask micro-framework for Python web applications. Journal of Python Web Development, 8(2), 44–52. https://doi.org/10.5678/jpwd.2020.008
18. Singh, A. (2020). Transforming natural language queries into SQL using NLP techniques. Journal of Computational Database Systems, 11(3), 33–47. https://doi.org/10.5678/jcds.2020.011
19. Smith, J. (2020). GINLIDB: A grammar-based natural language interface for databases using Visual Basic.NET. Proceedings of the International Conference on Database Systems, 112–120. https://doi.org/10.2345/icds.2020.011
20. Smith, J. (2020). Integrating NLP and relational databases for automated SQL generation. International Journal of Computational Linguistics and Databases, 11(3), 33–47. https://doi.org/10.1234/ijcld.2020.011
21. Wang, B., Shin, R., Liu, X., Polozov, O., & Richardson, M. (2020). RAT-SQL: Relation-aware schema encoding and linking for text-to-SQL parsers. Proceedings of ACL 2020, 7567–7578.
22. Wang, C., Tatwawadi, K., Brockschmidt, M., Huang, P. S., Mao, Y., Polozov, O., & Singh, R. (2018). Robust text-to-SQL generation with execution-guided decoding. arXiv preprint arXiv:1807.03100.
23. Xu, X., Liu, C., & Song, D. (2017). SQLNet: Generating structured queries from natural language without reinforcement learning. arXiv preprint arXiv:1711.04436.
24. Yu, T., Li, Z., Zhang, Z., Zhang, R., & Radev, D. (2018a). TypeSQL: Knowledge-based type-aware neural text-to-SQL generation. Proceedings of NAACL-HLT 2018, 588–594.
25. Yu, T., Zhang, R., Yang, K., Yasunaga, M., Wang, D., Li, Z., … Radev, D. (2018b). Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task. Proceedings of EMNLP 2018, 3911–3921.
26. Zheng, Y., Li, Z., Liu, X., & Sun, M. (2022). HIE-SQL: History information enhanced text-to-SQL generation for conversational semantic parsing. arXiv preprint arXiv:2203.07376.