This report details the development of a Retrieval-Augmented Generation (RAG) Assistant utilizing LangChain, OpenAI, FAISS, and Streamlit. The assistant allows users to query a custom corpus of technical documents and receive contextually relevant, LLM-generated answers with source attribution. The system demonstrates the practical application of modern retrieval and generation techniques for domain-specific question answering.
Large Language Models (LLMs) such as those provided by OpenAI have revolutionized natural language processing, but their knowledge is static and limited to their training data. Retrieval-Augmented Generation (RAG) systems address this limitation by integrating LLMs with external knowledge sources, enabling dynamic, up-to-date, and context-aware responses. This project implements a RAG assistant focused on technical documentation for LangChain and LangGraph, providing accurate, source-attributed answers through an interactive web interface.
The RAG assistant is implemented in Python and leverages several key technologies:

- LangChain for document loading, chunking, and orchestration of the retrieval and generation pipeline
- OpenAI for text embeddings and LLM-generated answers
- FAISS for fast vector similarity search over the embedded chunks
- Streamlit for the interactive web interface
The workflow is as follows (a minimal sketch of this pipeline appears after the list):

1. Loading: .txt files in the docs/ directory are loaded.
2. Chunking: each document is split into overlapping chunks (500 characters with 50-character overlap) using a recursive character splitter to ensure context continuity.
3. Embedding and indexing: the chunks are embedded with OpenAI embeddings and stored in a FAISS index.
4. Retrieval: at query time, the chunks most similar to the user's question are retrieved from the index.
5. Generation: the LLM produces an answer grounded in the retrieved chunks, with source attribution.
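The sketch below shows how such a pipeline might be wired together with LangChain. It is illustrative rather than the project's actual code: the report does not specify the model, retriever settings, or module layout, so the model name gpt-3.5-turbo, the k=4 retriever setting, and the build_chain() helper are assumptions, and exact import paths vary across LangChain versions.

```python
# Minimal RAG pipeline sketch: load -> chunk -> embed -> index -> retrieve -> generate.
# Requires OPENAI_API_KEY in the environment and the faiss-cpu package installed.
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


def build_chain():
    # 1. Load every .txt file from the corpus directory.
    docs = DirectoryLoader("docs/", glob="*.txt", loader_cls=TextLoader).load()

    # 2. Split into 500-character chunks with 50-character overlap,
    #    as described in the workflow above.
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)

    # 3. Embed the chunks with OpenAI embeddings and index them in FAISS.
    store = FAISS.from_documents(chunks, OpenAIEmbeddings())

    # 4-5. Retrieve the most similar chunks and generate a grounded answer,
    #      returning the retrieved documents for source attribution.
    return RetrievalQA.from_chain_type(
        llm=ChatOpenAI(model="gpt-3.5-turbo"),                 # model name is an assumption
        retriever=store.as_retriever(search_kwargs={"k": 4}),  # k=4 is an assumption
        return_source_documents=True,
    )


if __name__ == "__main__":
    qa = build_chain()
    result = qa.invoke({"query": "What is a vector store in LangChain?"})
    print(result["result"])
    for doc in result["source_documents"]:
        print("Source:", doc.metadata.get("source"))
```

Setting return_source_documents=True is what makes source attribution possible: the chain returns the retrieved chunks alongside the generated answer, so the interface can display which files the answer was drawn from.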
To evaluate the assistant, a corpus of .txt files was placed in the docs/ directory; example files include langchain_concept_vectorstores.txt, langchain_concept_retrieval.txt, and others. Technical questions were then posed through the interface, and the answers were reviewed for relevance, accuracy, and correct source attribution. Below is an example of an actual input (user question) and the corresponding output (assistant answer and sources):
Input (User Question):
What is a vector store in LangChain and how is it used?
Output (Assistant Answer):
A vector store in LangChain is a specialized data store that enables indexing and retrieving information based on vector representations. It is used to store and query complex relationships between data points, such as social networks, supply-chain management, fraud detection, and recommendation services. It provides a standard interface for working with vector stores, allowing users to easily switch between different vectorstore implementations. The key methods for working with vector stores are add_documents, delete, and similarity_search.
Sources:
The assistant consistently provided relevant and accurate answers to technical questions, with clear source attribution. The use of FAISS enabled fast retrieval even as the document corpus grew. The Streamlit interface made the system accessible and easy to use for developers.
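To illustrate how such a Streamlit front end might look, the snippet below assumes the earlier pipeline sketch is saved as a module named rag_pipeline exposing build_chain(); both names are hypothetical. Caching with st.cache_resource keeps the FAISS index in memory across reruns, so the corpus is not re-indexed on every question.

```python
import streamlit as st

# Hypothetical module name: assumes the earlier pipeline sketch is saved
# as rag_pipeline.py and exposes build_chain().
from rag_pipeline import build_chain


@st.cache_resource  # build the chain (and FAISS index) once, not on every rerun
def get_chain():
    return build_chain()


st.title("RAG Assistant")
question = st.text_input("Ask a question about the LangChain/LangGraph docs:")

if question:
    result = get_chain().invoke({"query": question})
    st.markdown(result["result"])           # the generated answer
    st.subheader("Sources")
    for doc in result["source_documents"]:  # source attribution
        st.write(doc.metadata.get("source", "unknown"))
```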
The code repository for this RAG Assistant: RAG Assistant
The RAG assistant effectively augments LLMs with a custom knowledge base, enabling accurate, source-attributed answers to domain-specific questions. The modular design, leveraging LangChain, OpenAI, and FAISS, allows for straightforward extension to other domains or document types. This project demonstrates the practical value of RAG systems for technical support and documentation search, and serves as a template for similar applications in other fields.