In the modern digital ecosystem, individuals continuously generate large volumes of fragmented personal knowledge through documents, notes, academic materials, messages, and project files. Over time, this information becomes difficult to retrieve and loses contextual relevance. This publication presents a Retrieval-Augmented Generation (RAG)-based AI system inspired by the concept of AI as a Personal Knowledge Archaeologist, where artificial intelligence retrieves, interprets, and reconstructs forgotten personal knowledge. The system integrates semantic embeddings, similarity-based retrieval, and controlled generative reasoning using the Gemini API. The proposed approach demonstrates how personal data can be transformed into an active, contextual knowledge system rather than static storage.
In the modern digital ecosystem, individuals continuously generate vast amounts of unstructured and semi-structured information through emails, documents, notes, learning platforms, and online interactions. Although this information contains valuable personal and contextual knowledge, it often remains fragmented, poorly organized, and difficult to retrieve when needed. Traditional keyword-based search systems fail to capture semantic meaning, leading to information overload rather than actionable insight.
Recent advances in Large Language Models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation. However, standalone LLMs are inherently limited by static training data, lack of personalization, and a tendency to hallucinate when responding to domain-specific or private knowledge queries. These limitations make them unsuitable for reliable personal knowledge management without external grounding.
Retrieval-Augmented Generation (RAG) has emerged as a practical solution to these challenges by combining semantic retrieval mechanisms with generative reasoning. By retrieving relevant context from an external knowledge source and conditioning the generation process on that context, RAG systems significantly improve factual accuracy, transparency, and controllability. This approach enables LLMs to reason over user-specific or domain-specific information without retraining the model.
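The conditioning step described here can be sketched as a simple prompt-construction function. The template wording and the function name below are illustrative assumptions, not part of any published API:

```python
def build_grounded_prompt(query: str, context: str) -> str:
    """Assemble a context-only prompt that conditions generation
    on retrieved text (illustrative template)."""
    return (
        "Answer the question using ONLY the context below.\n"
        "If the context does not contain the answer, say the question "
        "is out of scope.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Example: the retrieved passage is injected ahead of the user's question.
prompt = build_grounded_prompt(
    "What is cosine similarity?",
    "Cosine similarity measures the angle between two vectors.",
)
```

Because the instruction restricts the model to the supplied context, answers remain traceable to a specific retrieved passage rather than to the model's parametric memory.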
This work presents a lightweight RAG-based personal knowledge assistant designed around the concept of AI as a Knowledge Archaeologist. Similar to how an archaeologist excavates meaningful artifacts from fragmented remains, the system retrieves semantically relevant knowledge fragments from a curated text repository and reconstructs coherent, context-aware responses. The system is intentionally designed to be minimal, interpretable, and beginner-friendly, while maintaining professional rigor in retrieval and grounding mechanisms.
The proposed implementation leverages Gemini embedding and generation models, cosine similarity-based retrieval, and strict context-only prompting to suppress hallucinated responses. By avoiding heavy infrastructure dependencies such as vector databases, the system serves as an accessible reference architecture for students, researchers, and practitioners seeking to understand the foundational principles of RAG systems.
An archaeologist studies human history by excavating and interpreting physical artifacts to understand cultural, social, and technological evolution. Similarly, this AI system excavates textual artifacts such as notes, research drafts, and learning material.
| **Archaeology** | **RAG-Based AI System** |
|---|---|
| Physical artifacts | Textual knowledge documents |
| Excavation | Semantic retrieval |
| Contextual interpretation | Generative reasoning |
| Preservation of history | Long-term knowledge memory |
This analogy forms the conceptual backbone of the system.
┌───────────────────────────────┐
│          User Query           │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│     Query Embedding Layer     │
│   Model: text-embedding-004   │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│        Knowledge Base         │
│         knowledge.txt         │
│     (Paragraph Documents)     │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│   Document Embedding Layer    │
│   Model: text-embedding-004   │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│      Retrieval Mechanism      │
│   Cosine Similarity Search    │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│  Similarity Threshold Check   │
│           ( ≥ 0.3 )           │
└───────────────┬───────────────┘
        ┌───────┴───────┐
        │               │
┌───────▼────────┐ ┌────▼──────────────────┐
│  Out-of-Scope  │ │   Relevant Context    │
│  / No Answer   │ │       Selected        │
└────────────────┘ └───────────┬───────────┘
                               │
┌──────────────────────────────▼────────────────┐
│     Context-Aware Prompt Construction         │
└──────────────────────────────┬────────────────┘
                               │
┌──────────────────────────────▼────────────────┐
│              Generation Layer                 │
│          Model: gemini-2.5-flash              │
└──────────────────────────────┬────────────────┘
                               │
┌──────────────────────────────▼────────────────┐
│           Grounded Answer Output              │
└───────────────────────────────────────────────┘
The system is developed in Python with an emphasis on modularity, transparency, and minimal external dependencies. Each stage of the RAG pipeline is implemented as an independent function, enabling easy debugging and extensibility.
**Implementation Highlights**
- Secure API key management using environment variables
- Manual implementation of cosine similarity for full control over retrieval behaviour
- Direct integration with Gemini embedding and generation endpoints
- Command-line interface enabling real-time interactive querying
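The manual cosine-similarity routine listed among the highlights might look like the following pure-Python sketch; the function name and the zero-vector handling are our assumptions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as maximally dissimilar
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0 and orthogonal vectors score 0.0, which is what makes a fixed cutoff such as 0.3 meaningful regardless of vector magnitude.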
**Operational Workflow**
1. A user inputs a natural language query via the CLI
2. The query is converted into a semantic vector representation
3. Stored document embeddings are compared using cosine similarity
4. The highest-scoring document above the similarity threshold is selected
5. A grounded response is generated using the retrieved context
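The retrieval steps above can be sketched end to end. The `retrieve` helper, the toy word-count embedder, and the vocabulary are illustrative stand-ins for the real text-embedding-004 calls; only the 0.3 threshold comes from the system description:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], embed, threshold: float = 0.3):
    """Embed the query, score every document, and return the best match,
    or None when nothing clears the similarity threshold."""
    q_vec = embed(query)
    scored = [(cosine(q_vec, embed(doc)), doc) for doc in documents]
    best_score, best_doc = max(scored)
    return best_doc if best_score >= threshold else None

# Toy word-count embedder standing in for the real embedding endpoint.
VOCAB = ["rag", "retrieval", "archaeology", "cooking"]
def toy_embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

docs = [
    "rag combines retrieval and generation",
    "archaeology studies physical artifacts",
]
```

With this wiring, an in-scope query returns the matching paragraph, while an unrelated query falls below the threshold and yields None, which the CLI can surface as an out-of-scope message.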
This system demonstrates strong applicability across domains:
- **Students:** Rediscover concepts from earlier semesters and connect them with current studies.
- **Researchers:** Revisit unfinished ideas and prior drafts for renewed insight.
- **Professionals:** Extract lessons and patterns from past projects.
- **Lifelong Learners:** Maintain a continuously evolving personal knowledge base.
The major challenges in personal RAG systems include:
- Privacy protection of personal data
- Context prioritization over excessive retrieval
- Relevance filtering to avoid cognitive overload
This implementation addresses these concerns through local document control, similarity thresholds, and constrained generation.
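The similarity-threshold filtering mentioned here can be isolated as its own helper; the `filter_relevant` name, the `top_k` cap, and the sample scores are hypothetical illustrations:

```python
def filter_relevant(scored_docs, threshold=0.3, top_k=3):
    """Keep at most top_k documents that clear the similarity threshold,
    ranked highest first, so the prompt stays small and focused."""
    kept = [(score, doc) for score, doc in scored_docs if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]

scored = [(0.9, "notes on RAG"), (0.1, "recipes"), (0.5, "thesis draft")]
```

Capping both the minimum score and the number of passages is one simple way to prioritize context and avoid flooding the generation step with marginally related text.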
The system successfully:
- Prevents off-topic responses
- Maintains grounding in personal knowledge
- Demonstrates explainable retrieval behaviour
- Enables context-aware reasoning beyond traditional assistants
Even with a small knowledge base, the system exhibits scalable design principles applicable to larger datasets.

This project demonstrates that Retrieval-Augmented Generation can function as a cognitive archaeology system, capable of rediscovering and contextualizing personal knowledge over time. By combining semantic embeddings with controlled generative reasoning, AI can evolve from a reactive assistant into a long-term intellectual partner.
AI as a Personal Knowledge Archaeologist represents a shift from short-term interaction models to memory-centric, human-aligned intelligence systems, redefining how individuals interact with their own intellectual history.
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., … Riedel, S. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems (NeurIPS).
Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., … Yih, W. (2020). Dense Passage Retrieval for Open-Domain Question Answering. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Google DeepMind. Gemini API Documentation: Embeddings and Generative Models. https://ai.google.dev
Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing.
Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency.
Loper, E., & Bird, S. (2002). NLTK: The Natural Language Toolkit. Proceedings of the ACL Workshop on Effective Tools and Methodologies for Teaching NLP.
Russell, S., & Norvig, P. (2021). Artificial Intelligence: A Modern Approach (4th ed.). Pearson Education.