1. Auth Service keeps the system secure. It verifies each user’s identity with JWT tokens and enforces
role-based access control before granting access; a minimal sketch of this check follows the list.
2. Document Ingestion handles the uploaded PDFs. It extracts the text, splits it into smaller chunks,
generates metadata, and stores everything in MongoDB, from the raw documents to chat history (see the
second sketch below).
3. The Query Processor is the system’s decision-maker. When a user asks a question, it searches the vector
database for relevant information, prepares the context, and coordinates with the AI model to generate a
meaningful, accurate answer (see the third sketch below).
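To make the first component concrete, here is a minimal sketch of the token check, assuming the PyJWT
library; the secret key, signing algorithm, and role-to-permission map are illustrative placeholders rather
than our production configuration.

    import jwt  # PyJWT

    SECRET_KEY = "replace-with-a-real-secret"   # placeholder: real deployments load this from config
    ROLE_PERMISSIONS = {                        # illustrative role map, not the production one
        "admin": {"upload", "query"},
        "employee": {"query"},
    }

    def authorize(token: str, action: str) -> bool:
        """Verify the JWT signature, then check the role claim against the requested action."""
        try:
            claims = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        except jwt.InvalidTokenError:
            return False   # expired, tampered, or malformed token
        return action in ROLE_PERMISSIONS.get(claims.get("role", ""), set())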
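The ingestion step can be sketched in the same spirit, assuming the pypdf and pymongo libraries; the
fixed-size chunking, connection URI, database name, and collection name are simplifications for illustration.

    from pypdf import PdfReader
    from pymongo import MongoClient

    def ingest_pdf(path: str, chunk_size: int = 1000) -> list[str]:
        """Extract text from a PDF, chunk it, and store the chunks in MongoDB."""
        reader = PdfReader(path)
        text = "\n".join(page.extract_text() or "" for page in reader.pages)
        # Naive fixed-size chunking; a real pipeline would respect sentence boundaries.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        db = MongoClient("mongodb://localhost:27017")["assistant"]   # placeholder URI and db name
        db.documents.insert_many(
            [{"source": path, "chunk_id": i, "text": c} for i, c in enumerate(chunks)]
        )
        return chunks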
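Finally, the retrieval half of the Query Processor might look like the following, assuming the chromadb
client; the collection name, number of results, and prompt template are assumptions for illustration.

    import chromadb

    def build_context(question: str, top_k: int = 4) -> str:
        """Fetch the chunks nearest to the question and wrap them in a grounded prompt."""
        collection = chromadb.Client().get_or_create_collection("docs")  # assumed collection name
        # Chroma embeds the query text and returns the most similar stored chunks.
        results = collection.query(query_texts=[question], n_results=top_k)
        context = "\n\n".join(results["documents"][0])
        return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"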
Data Layer: Each database is responsible for specific tasks in our system.
1. PostgreSQL manages the structured data, such as user accounts and system logs.
2. MongoDB handles the system’s flexible data: chat records, document logs, and similar collections.
3. ChromaDB stores the document embeddings, the numerical representations of text that let the system match
meaning rather than exact words (an indexing sketch follows this list).
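As an illustration of the last point, the sketch below indexes ingested chunks in ChromaDB; with its default
configuration Chroma computes the embeddings itself, and the ids and metadata shown are hypothetical.

    import chromadb

    chunks = ["First extracted chunk...", "Second extracted chunk..."]  # e.g. from the ingestion sketch

    collection = chromadb.Client().get_or_create_collection("docs")
    collection.add(
        ids=[f"report-chunk-{i}" for i in range(len(chunks))],
        documents=chunks,                                    # Chroma embeds these texts itself
        metadatas=[{"source": "report.pdf"} for _ in chunks],
    )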
LLM Engine: The LLM Engine that powers our system is Ollama, which runs the language model that interprets
queries and generates results. Once the Query Processor finds the right pieces of information, Ollama
combines them with the user’s question and generates a response based entirely on the documents (company
data) uploaded to the system by the admins. This ensures that the results are accurate and relevant, not
diluted by unnecessary generic information. A sketch of this generation step follows.
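Here is a minimal sketch of that step, assuming Ollama’s local HTTP API on its default port; the model name
is illustrative and build_context is the hypothetical helper from the retrieval sketch above.

    import requests

    def answer(question: str) -> str:
        """Send the grounded prompt to the local Ollama server and return its reply."""
        prompt = build_context(question)   # retrieved chunks plus the user's question
        resp = requests.post(
            "http://localhost:11434/api/generate",           # Ollama's default local endpoint
            json={"model": "llama3", "prompt": prompt, "stream": False},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["response"]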
In simple terms: You upload your PDFs, the system processes and stores them intelligently, and later, when
you ask something, it digs into your own data to give you the right answer — fast, reliable, and backed by
facts.
CONCLUSIONS
This research offers a thorough explanation of the system “AI-Driven Internal Data Intelligence
Assistant”, the application of RAG, and the in-depth functionality of the project. The paper explains how our
project resolves the conflict between adopting artificial intelligence to promote productivity and risking
the company's data security. Using the RAG pipeline, we demonstrate how AI and open-source technologies can
benefit users without placing them under the control of public tool providers.
The proposed solution passes the necessary checks to create a closed-loop environment for data processing.
The integration of Ollama as a local LLM inference engine is one of the most crucial foundations of our work:
it sits at the centre of the system architecture and ensures that no company data leaves the organisational
firewall. The assistant provides employees with a natural-language interface that delivers accurate, relevant
results and values transparency.
This paper confirms that secure corporate tools are needed and can be built with the design explained in this
article. The project has been carefully developed in accordance with industry standards. Future work will
include expanding scalability, refining role-based access control, optimising the document retrieval
mechanism, supporting image- and video-based information uploads, and adding multi-hop queries. This project
stands as evidence that AI capabilities and data security can coexist.