A lightweight, local, knowledge-grounded chatbot that answers questions based on your own documents, powered by LangChain, FAISS, and Google Gemini.
This project implements a local knowledge-based chatbot that intelligently retrieves information from text documents stored in the `knowledge_base/` directory.

It uses:

- **LangChain** for document loading, vector stores, and prompt handling
- **Google Gemini LLM** (via `ChatGoogleGenerativeAI`) for answer generation
- **FAISS** for vector search and similarity lookups
- A **Python CLI interface** for a simple, real-time chat experience
The chatbot workflow:

1. Load documents from the knowledge base
2. Convert text into embeddings (see the sketch below)
3. Store those embeddings inside a FAISS vector index
4. Retrieve relevant document chunks during Q&A
5. Use the Gemini LLM to generate accurate, context-aware answers
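As a quick illustration of the embedding step, a single string can be turned into a dense vector like this (a minimal sketch, assuming the `GOOGLE_API_KEY` environment variable is set and using the same embedding model as the full example further down):

```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Embed one string into a dense vector (reads GOOGLE_API_KEY from the environment)
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("What is retrieval-augmented generation?")
print(len(vector))  # dimensionality of the embedding vector
```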
This project demonstrates:

- Document ingestion
- Embedding creation
- FAISS vector search
- Prompt engineering
- Retrieval-augmented generation (RAG)
- Real-time command-line chatbot interaction
```
knowledge_base/  ->  Load Documents           ->  Create Embeddings
       ↓                      ↓                          ↓
User Question    ->  Retrieve Similar Chunks  ->  Gemini LLM Answer
```
1. Add `.txt` files to the `knowledge_base/` folder.
2. The program loads all text files using LangChain.
3. Text is chunked and converted to vector embeddings using the Gemini Embedding API.
4. FAISS builds a searchable vector index.
5. When the user enters a question, the system:
   - performs a similarity search in FAISS (see the sketch below),
   - retrieves the most relevant document chunks,
   - feeds them into a prompt template, and
   - lets the Gemini LLM produce a final answer grounded in the document context.
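The retrieval step boils down to a FAISS similarity search. A minimal sketch, assuming a `faiss_index` built as in the full example below:

```python
# Fetch the k most similar chunks for a question (faiss_index built as shown below)
relevant_docs = faiss_index.similarity_search("How do I configure the chatbot?", k=3)
for doc in relevant_docs:
    print(doc.page_content[:200])  # preview each retrieved chunk
```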
Here is a minimal example showing how FAISS and Gemini are used together:
```python
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load the knowledge base and split it into chunks (sizes are illustrative)
loader = DirectoryLoader("knowledge_base/", glob="*.txt", loader_cls=TextLoader)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
document_objects = splitter.split_documents(loader.load())

# Embed the chunks and build the FAISS index
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
faiss_index = FAISS.from_documents(document_objects, embeddings)

llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.2,
    max_tokens=200,
    timeout=None,
    max_retries=2,
)

def ask_question(question, faiss_index):
    # "stuff" chain: retrieved chunks are stuffed into a single prompt
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff", retriever=faiss_index.as_retriever()
    )
    return qa_chain.invoke(question)  # returns a dict with a "result" key
```
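Both the embeddings client and the chat model read the Gemini API key from the environment. A minimal way to provide it for a local run (replace the placeholder with your own key):

```python
import os

# langchain-google-genai looks for the key in GOOGLE_API_KEY
os.environ["GOOGLE_API_KEY"] = "your-api-key-here"
```

The command-line loop below ties everything together (`load_documents_from_directory` and `load_documents_to_faiss` are the project's own loading helpers):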
```python
import os

def start_chat():
    print("Welcome to the Chatbot! Type 'exit' to end the chat.\n")

    # Resolve the knowledge base directory relative to this file
    BASE_DIR = os.path.dirname(os.path.abspath(__file__))
    KB_DIR = os.path.join(BASE_DIR, "knowledge_base")

    # Load documents and build the FAISS index once, up front
    documents = load_documents_from_directory(KB_DIR)
    faiss_index = load_documents_to_faiss(documents)

    while True:
        question = input("You: ")

        # Exit condition
        if question.lower() == "exit":
            print("Goodbye!")
            break

        # Ask the model and unpack its answer
        answer = ask_question(question, faiss_index)
        result = answer.get("result", "").strip()

        # Empty result -> tell the user nothing was found
        if not result:
            print("AI: I couldn't find anything about that in the knowledge base.\n")
        else:
            print(f"AI: {result}\n")
```
```
├── knowledge_base/
│   ├── 1.txt
│   ├── 2.txt
│   └── 3.txt
├── chatbot.py
├── main.py
├── faiss_index.py
└── README.md
```
```
+--------------------+
|  knowledge_base/   |
|  (text documents)  |
+----------+---------+
           |
           v
+--------------------+
| Gemini Embeddings  |
+--------------------+
           |
           v
+--------------------+
|    FAISS Index     |
+--------------------+
           |
           v
+----------------------------+
|  Similarity Search (RAG)   |
+----------------------------+
           |
           v
+------------------------------+
| Gemini LLM Answer Generator  |
+------------------------------+
```