RAG Assistant: Publication Documentation
Overview
The RAG Assistant is a Retrieval-Augmented Generation (RAG) system designed to answer questions using a local language model and a vector store. Developed as part of the Agentic AI Developer Certification (AIDC-Module 1), this project addresses the need for an offline, customizable question-answering tool, overcoming limitations like OpenAI API quotas. It's built with Python and open-source libraries, making it accessible for AI enthusiasts and developers.
Purpose: Provide accurate, context-based answers from user-supplied documents.
Target Audience: AI developers, students, and certification candidates.
License: MIT (open-source, permissive use).
ReadyTensor Publication Link (to be updated post-submission)
GitHub Inspiration (for LangChain usage)
Note: Replace with an actual diagram of the RAG pipeline (e.g., retrieval → generation flow).
Technical Details
Components
The RAG Assistant leverages the following technologies:
LangChain: Orchestrates the RAG pipeline (retrieval, generation).
Hugging Face Transformers: Powers the local LLM (e.g., facebook/bart-base).
FAISS: Enables efficient vector storage and similarity search.
HuggingFaceEmbeddings: Uses sentence-transformers/all-MiniLM-L6-v2 for text embeddings.
Workflow
Ingestion: Splits and embeds documents into a FAISS vector store.
Retrieval: Fetches the most relevant document chunk using MMR (Maximal Marginal Relevance).
Generation: Generates answers using the LLM, guided by a structured prompt.
Output: Displays the parsed response via CLI.
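The retrieval step above uses Maximal Marginal Relevance (MMR), which trades off relevance to the query against redundancy with chunks already selected. A minimal pure-Python illustration of the re-ranking idea (independent of LangChain; the vectors and the λ weight below are made up for demonstration):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def mmr(query_vec, doc_vecs, k=1, fetch_k=5, lam=0.5):
    """Greedy MMR: from the fetch_k candidates most similar to the query,
    pick k, each time balancing query relevance against similarity to
    already-picked documents. Returns the chosen indices in pick order."""
    candidates = sorted(range(len(doc_vecs)),
                        key=lambda i: cosine(query_vec, doc_vecs[i]),
                        reverse=True)[:fetch_k]
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Two near-duplicate chunks plus one distinct chunk: MMR picks the
# most relevant one, then skips its near-duplicate in favor of variety.
docs = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0]]
print(mmr([1.0, 0.5], docs, k=2, fetch_k=3))  # → [1, 2]
```

With `k=1, fetch_k=5` (the values in the snippet below), MMR fetches five candidates but returns only the single best chunk.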
Code Snippet: Core Pipeline
```python
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_huggingface import HuggingFacePipeline

def setup_rag_chain():
    # `load_vector_store`, `llm`, and `parse_answer` are defined elsewhere in src/.
    vector_store = load_vector_store()
    retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 1, "fetch_k": 5})
    prompt_template = """
You are a factual assistant. Answer ONLY using the exact information provided in the context below.
Do not add information or generate text outside the context. If the context does not contain the answer,
respond with 'I don't know.'

Context:
{context}

Question: {question}

Answer:
"""
    prompt = PromptTemplate.from_template(prompt_template)

    def format_docs(x):
        print(f"Debug docs type in format_docs: {type(x)}, content: {x}")
        if isinstance(x, list):
            return "\n\n".join(d.page_content for d in x)
        return x

    chain = (
        {"context": RunnableLambda(lambda x: retriever.invoke(x["question"])) | RunnableLambda(format_docs),
         # Extract the question string so the prompt is not rendered with the raw input dict:
         "question": RunnableLambda(lambda x: x["question"])}
        | prompt
        | llm
        | parse_answer
    )
    return chain
```
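The chain above pipes the raw model output through `parse_answer`, which is defined elsewhere in `src/`. A hypothetical sketch of what such a parser might do (the project's actual implementation may differ): causal models often echo the prompt, so keep only the text after the final "Answer:" marker.

```python
def parse_answer(output) -> str:
    """Extract the final answer from raw LLM output.

    Keeps only the text after the last 'Answer:' marker (if present)
    and trims surrounding whitespace.
    """
    text = output if isinstance(output, str) else getattr(output, "content", str(output))
    if "Answer:" in text:
        text = text.rsplit("Answer:", 1)[1]
    return text.strip()

print(parse_answer("Context: ...\nQuestion: ...\nAnswer: The capital of France is Paris."))
# → The capital of France is Paris.
```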
LLM Configuration
Model: facebook/bart-base (switched from distilgpt2 for better context adherence).
Parameters: max_new_tokens=50, temperature=0.2, repetition_penalty=2.0.
Installation: pip install transformers sentencepiece.
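The parameters above can be collected into a single kwargs dict; the wiring sketched in the comments is an assumption about how the pipeline is constructed, not code taken from the project's source:

```python
# Generation settings listed above; a low temperature and a strong
# repetition penalty keep bart-base close to the retrieved context.
generation_kwargs = {
    "max_new_tokens": 50,       # cap on generated answer length
    "temperature": 0.2,         # near-deterministic sampling
    "repetition_penalty": 2.0,  # discourage repeated phrases
}

# Hypothetical wiring (requires `transformers` and `langchain-huggingface`):
# from transformers import pipeline
# from langchain_huggingface import HuggingFacePipeline
# gen = pipeline("text2text-generation", model="facebook/bart-base", **generation_kwargs)
# llm = HuggingFacePipeline(pipeline=gen)
```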
Usage Instructions
Setup
Install dependencies
pip install langchain-huggingface transformers torch sentencepiece
Place your document (e.g., sample.txt) in data/documents/.
Configure config/.env with an OPENAI_API_KEY (optional for local use).
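Loading config/.env is typically done with python-dotenv; a stdlib-only sketch of the same idea (KEY=VALUE lines read into os.environ, assuming the simple one-pair-per-line format):

```python
import os

def load_env(path="config/.env"):
    """Minimal .env loader: one KEY=VALUE per line, '#' starts a comment.
    Existing environment variables are not overridden."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```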
Running the Assistant
Ingest Data
python app.py ingest
Expected Output: "Ingested documents into vector store."
Query Interface
Enter questions like "What is the capital of France?".
Type 'exit' to quit.
Expected Answer: "The capital of France is Paris."
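The query loop can be sketched as below; `answer_fn` stands in for the RAG chain, and the input/output hooks are injectable so the loop can be exercised without a model (the names here are illustrative, not the project's):

```python
def query_loop(answer_fn, read_input=input, write=print):
    """Read questions until the user types 'exit'; print each answer.

    `read_input` and `write` default to the real CLI but can be
    replaced for testing.
    """
    while True:
        question = read_input("Question: ").strip()
        if question.lower() == "exit":
            write("Goodbye!")
            break
        if question:
            write(answer_fn(question))

# Demo with canned input instead of the real keyboard:
scripted = iter(["Name a landmark in Paris", "exit"])
query_loop(lambda q: "Eiffel Tower.", read_input=lambda _prompt: next(scripted))
```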
Note: Replace with a screenshot of the CLI in action.
Sample Questions and Answers
"What is the capital of France?" → "The capital of France is Paris."
"Name a landmark in Paris" → "Eiffel Tower."
"When was the Eiffel Tower completed?" → "1889."
Note: Answers depend on the context in sample.txt.
Challenges and Solutions
Issue: Retrieved entire sample.txt.
Solution: Implemented chunking with RecursiveCharacterTextSplitter and MMR retrieval.
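RecursiveCharacterTextSplitter splits on progressively smaller separators (paragraphs, then sentences, then words) until chunks fit a size budget, keeping some overlap so facts spanning a boundary survive. A simplified fixed-size variant conveys the idea (the sizes below are illustrative, not the project's settings):

```python
def split_text(text, chunk_size=200, chunk_overlap=50):
    """Naive character chunker with overlap. LangChain's
    RecursiveCharacterTextSplitter additionally prefers natural
    boundaries before falling back to a hard character cut like this."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

print(split_text("abcdef", chunk_size=4, chunk_overlap=2))  # → ['abcd', 'cdef', 'ef']
```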
Issue: Nonsensical answers with distilgpt2.
Solution: Switched to facebook/bart-base with a stricter prompt.
Issue: TypeError in chain.
Solution: Used RunnableLambda for proper function piping.
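The TypeError arose because plain Python functions do not support the `|` composition operator that LCEL runnables implement; wrapping a function in RunnableLambda gives it that operator. A toy re-implementation of the idea (not LangChain's actual classes):

```python
class Lambda:
    """Minimal stand-in for RunnableLambda: wraps a function and
    supports `|` so wrapped steps compose into a pipeline."""
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Compose: run self first, then feed the result to `other`.
        return Lambda(lambda value: other.invoke(self.invoke(value)))

retrieve = Lambda(lambda q: ["chunk about " + q])
fmt = Lambda(lambda docs: "\n\n".join(docs))
chain = retrieve | fmt
print(chain.invoke("Paris"))  # → chunk about Paris
```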
Future Enhancements
Add memory to track conversation context.
Integrate a GUI using Tkinter or Flask.
Support multiple LLMs for comparison.
Submission Details
Title: RAG Assistant
Description: A local RAG system for question-answering.
Tags: AIML, python, RAG, GenAI, AgenticAI
Certification Module: AIDC-Module 1
Resources: Upload this .md file, app.py, src/, and sample.txt.
Acknowledgments
Thanks to the xAI community and ReadyTensor support for guidance.
ReadyTensor Support | xAI