Evaluating the Impact of Prompt Engineering on Factual Accuracy and Hallucination in Large Language Models

Diya Jain; Pallak Anand; Dr. Deepti Sharma

doi:10.51244/IJRSI.2026.1304000068

Authors

Diya Jain

Masters of Computer Applications, Jagan Institute of Management Studies, Rohini, Delhi – 110085 (India)

Pallak Anand

Masters of Computer Applications, Jagan Institute of Management Studies, Rohini, Delhi – 110085 (India)

Dr. Deepti Sharma

Masters of Computer Applications, Jagan Institute of Management Studies, Rohini, Delhi – 110085 (India)

Article Information

DOI: 10.51244/IJRSI.2026.1304000068

Subject Category: Artificial Intelligence

Volume/Issue: 13/4 | Page No: 674-687

Publication Timeline

Submitted: 2026-04-06

Accepted: 2026-04-12

Published: 2026-04-30

Abstract

The propensity of large language models (LLMs) to generate factually unsupported yet linguistically convincing text—commonly referred to as hallucination—poses a fundamental obstacle to their adoption in accuracy-critical settings. This paper investigates whether prompt engineering techniques can meaningfully reduce hallucination and strengthen user-perceived factual reliability. A sequential mixed-methods design was employed: a systematic review of fourteen peer-reviewed sources spanning 2017–2026, combined with an original empirical survey of 96 participants [15] who evaluated AI-generated responses across three prompting conditions—basic (A), structured (B), and detailed/context-rich (C). Perceived accuracy rates were calculated per question and condition, and a weighted completeness metric was derived to quantify informational depth across conditions. Results indicate that 56.3% of respondents maintain only partial trust in AI-generated facts and that users systematically prefer brief responses irrespective of their informational completeness—a behavioural pattern termed the brevity-trust bias. Step-by-step instruction was the most endorsed prompting strategy (55.2%), independently corroborating chain-of-thought prompting from the scholarly literature. Objective analysis further shows that basic prompts yielded the lowest weighted completeness scores across all five questions despite dominating user preference. The study concludes with a five-component integrated mitigation framework combining user-side prompting, retrieval-augmented generation (RAG), reinforcement learning from human feedback (RLHF), automated fact-checking, and structured user education.
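The two quantities named in the abstract, the perceived accuracy rate per question and condition and the weighted completeness metric, can be illustrated with a minimal sketch. The abstract does not define either formula, so everything below is an assumption: the perceived accuracy rate is taken to be the percentage of respondents who judged a response accurate, and weighted completeness is taken to be the weight-normalised share of expected content elements that a response covers. The function names, element names, weights, and toy ratings are hypothetical and are not drawn from the survey dataset [15].

```python
from collections import defaultdict


def perceived_accuracy(ratings):
    """Percent of respondents judging a response accurate, per (question, condition).

    `ratings` is an iterable of (question, condition, judged_accurate) tuples.
    This definition is an assumption; the paper's exact formula is not given here.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for question, condition, judged_accurate in ratings:
        key = (question, condition)
        totals[key] += 1
        if judged_accurate:
            hits[key] += 1
    return {key: 100.0 * hits[key] / totals[key] for key in totals}


def weighted_completeness(covered, weights):
    """Share of total weight carried by the content elements a response covers.

    `weights` maps each expected element to its importance (hypothetical scheme);
    `covered` is the set of elements actually present in the response.
    """
    total = sum(weights.values())
    return sum(w for name, w in weights.items() if name in covered) / total


# Toy data only: two respondents rating question Q1 under conditions A and C.
ratings = [
    ("Q1", "A", True), ("Q1", "A", False),
    ("Q1", "C", True), ("Q1", "C", True),
]
rates = perceived_accuracy(ratings)

# Hypothetical rubric: a definition counts twice as much as an example or caveat.
weights = {"definition": 2, "example": 1, "caveat": 1}
score = weighted_completeness({"definition", "caveat"}, weights)

print(rates[("Q1", "A")], rates[("Q1", "C")], score)  # 50.0 100.0 0.75
```

Keeping the two measures separate mirrors the contrast the abstract draws: a condition can score well on perceived accuracy or user preference while scoring poorly on completeness.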

Keywords

Hallucination; Prompt Engineering; Factual Accuracy

References

1. A. Vaswani et al., "Attention Is All You Need," NeurIPS, vol. 30, pp. 5998–6008, 2017. https://arxiv.org/abs/1706.03762

2. T. B. Brown et al., "Language Models are Few-Shot Learners," NeurIPS, vol. 33, pp. 1877–1901, 2020. https://arxiv.org/abs/2005.14165

3. J. Wei et al., "Chain-of-Thought Prompting Elicits Reasoning in LLMs," NeurIPS, vol. 35, 2022. https://arxiv.org/abs/2201.11903

4. L. Ouyang et al., "Training LMs to Follow Instructions with Human Feedback," NeurIPS, vol. 35, pp. 27730–27744, 2022. https://arxiv.org/abs/2203.02155

5. Z. Ji et al., "Survey of Hallucination in NLG," ACM Comput. Surv., vol. 55, no. 12, 2023. https://doi.org/10.1145/3571730

6. S. Lin, J. Hilton, and O. Evans, "TruthfulQA," Proc. 60th ACL, pp. 3214–3252, 2022. https://arxiv.org/abs/2109.07958

7. P. Lewis et al., "Retrieval-Augmented Generation for Knowledge-Intensive NLP," NeurIPS, vol. 33, pp. 9459–9474, 2020. https://arxiv.org/abs/2005.11401

8. A. Alansari and H. Luqman, "LLM Hallucination: A Comprehensive Survey," arXiv:2510.06265, 2026. https://arxiv.org/abs/2510.06265

9. L. Zhang et al., "A Survey on Hallucination in LLMs," ACM Trans. Inf. Syst., 2024. https://doi.org/10.1145/3703155

10. S. Srivastava et al., "Survey and Analysis of Hallucinations in LLMs," Front. Artif. Intell., vol. 8, 2025. https://doi.org/10.3389/frai.2025.1622292

11. S. Sahoo et al., "Systematic Survey of Prompt Engineering in LLMs," arXiv:2402.07927, 2024. https://arxiv.org/abs/2402.07927

12. OpenAI, "GPT-4 Technical Report," arXiv:2303.08774, 2023. https://arxiv.org/abs/2303.08774

13. H. Touvron et al., "Llama 2: Open Foundation and Fine-Tuned Chat Models," arXiv:2307.09288, 2023. https://arxiv.org/abs/2307.09288

14. Y. Deng et al., "Detecting Factual Hallucinations with Metamorphic Testing," PACMSE, vol. 2, 2024. https://doi.org/10.1145/3715784

15. D. Jain and P. Anand, "Impact of Prompt Engineering on AI Accuracy – Survey Response Dataset," Primary Survey Data, N=96, JIMS Delhi, Feb. 2026. [Raw data file available with authors].

16. D. Sharma, B. A. Saxena, and D. Aggarwal, "Smart Education: An Emerging Teaching Pedagogy for Interactive and Adaptive Learning Methods," Journal of Learning and Educational Policy, vol. 44, pp. 1–9, 2024.

17. D. Sharma, B. A. Saxena, D. Aggarwal, and A. B. Saxena, "Exploring the Role of AI for Enhancement of Social Media Marketing," Journal of Media, Culture and Communication, vol. 4, no. 5, pp. 1–11, 2024.

18. D. Sharma, B. A. Saxena, and D. Aggarwal, "Mitigating Cybersecurity Risks in IoT: A Layered Approach to Threat Detection and Prevention," in Proc. 2025 4th Int. Conf. Sentiment Analysis and Deep Learning (ICSADL), IEEE, 2025.

19. D. Sharma, B. A. Saxena, and D. Aggarwal, "A Comprehensive Analysis on the Application of Natural Language Processing (NLP) in Higher Education," in Proc. 2024 8th Int. Conf. I-SMAC (IoT in Social, Mobile, Analytics and Cloud), IEEE, 2024.

20. D. Sharma, B. A. Saxena, and D. Aggarwal, "Green AI: Balancing Model Complexity and Energy Footprint in Deep Learning," in Proc. 2025 3rd Int. Conf. Sustainable Computing and Data Communication Systems (ICSCDS), IEEE, 2025.
