The CPS AI Assistant is a Retrieval-Augmented Generation (RAG) system designed to enhance student interaction and reduce the workload on academic advisors. By combining a vector database, a locally hosted embedding model, and an intuitive UI, the system efficiently retrieves and presents college-specific information. In the results reported here, the assistant reached 85% query accuracy and a 30% reduction in response time, along with increased user engagement.
Introduction
Academic advisors spend much of their time answering repetitive queries from students. The CPS AI Assistant addresses this by answering such queries automatically. Built with RAG and a vector database, the system retrieves relevant passages from university resources and uses them to generate accurate responses.
Methodology
Data Collection and Cleaning
The system uses Crawl4AI to scrape public data from the College of Professional Studies website. Data is preprocessed by:
Cleaning HTML tags and redundant text
Chunking content into meaningful segments
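The report does not show its preprocessing code, so the following is only a minimal sketch of the two steps above, using the Python standard library. The set of skipped tags, the function names clean_html and chunk_text, and the word-based chunk size and overlap are all assumptions made for illustration, not the project's actual settings.

```python
from html.parser import HTMLParser
import re


class _TextExtractor(HTMLParser):
    """Collects visible text and skips elements that rarely carry content."""

    # Assumed list of tags to drop; the real project may filter differently.
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def clean_html(html: str) -> str:
    """Strip tags and collapse whitespace, keeping only visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word windows that share `overlap` words with the previous one."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side. Splitting on headings or paragraphs instead of fixed word counts may produce more meaningful segments.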
Embedding Generation and Storage
Ollama generates vector embeddings for each data chunk.
The embeddings are stored in Supabase, a hosted Postgres platform that supports vector similarity search through the pgvector extension.
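As an illustration of this step, the sketch below pairs each chunk with its embedding and produces row dictionaries ready for insertion into a table. The embedding call is passed in as a function so the example runs without a model server. In the system described here that function would wrap a call to the local Ollama model, and the rows would be inserted into a Supabase table. The column names (url, chunk_index, content, embedding) and the toy_embed stand-in are hypothetical.

```python
from typing import Callable, Sequence


def build_rows(
    chunks: Sequence[str],
    source_url: str,
    embed_fn: Callable[[str], list[float]],
) -> list[dict]:
    """Attach an embedding to every chunk and return table-ready rows."""
    rows = []
    expected_dim = None
    for index, chunk in enumerate(chunks):
        vector = embed_fn(chunk)
        # A vector column has a fixed dimension, so reject mixed sizes early.
        if expected_dim is None:
            expected_dim = len(vector)
        elif len(vector) != expected_dim:
            raise ValueError(f"chunk {index} has dimension {len(vector)}, expected {expected_dim}")
        rows.append(
            {
                "url": source_url,          # hypothetical column names
                "chunk_index": index,
                "content": chunk,
                "embedding": vector,
            }
        )
    return rows


def toy_embed(text: str) -> list[float]:
    """Stand-in for a real embedding model; NOT semantically meaningful."""
    return [float(len(text)), float(text.count(" "))]


rows = build_rows(["a b", "cde"], "https://example.edu/page", toy_embed)
```

Keeping the source URL and chunk index next to each vector makes it possible to cite the originating page in a response and to re-embed a single page when it changes.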
Query Processing and Response Generation
User Query Input: The user submits a query via the Streamlit UI.
Embedding Retrieval: The query is embedded using Ollama and matched against stored vectors in Supabase.
Chunk Ranking: The retrieved chunks are ranked by semantic relevance.
Response Generation: The top-ranked chunks are passed as context to a language model served through the Groq API, which generates the natural-language response.
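The ranking and prompt-assembly steps above can be sketched in plain Python. This is an in-memory illustration only: the report does not say whether similarity is computed in the database or in application code, and cosine similarity is assumed here as the relevance measure. The function names, the row shape, and the prompt wording are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, rows, top_k=3):
    """Return the top_k rows whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_vec, row["embedding"]), row) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:top_k]]


def build_prompt(question, ranked_rows):
    """Assemble the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(
        f"[{i + 1}] {row['content']}" for i, row in enumerate(ranked_rows)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The resulting prompt string is what would be sent to the language model. Numbering the chunks lets the model, or the UI, refer back to a specific source passage.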
Deployment and User Interface
The assistant is deployed with Streamlit, offering an intuitive interface for user interactions.
Users can search for general queries, program-specific details, or co-op-related information.
Results
The CPS AI Assistant demonstrates:
85% Query Accuracy: Achieved by refining the data chunking and embedding techniques.
30% Reduction in Response Time: Achieved by optimizing search and retrieval with Supabase.
Increased Engagement: Adoption among students and staff improved with the intuitive UI.
Visual Demonstrations
Co-op Search Mode
Program-Specific Search
General Search
Challenges and Solutions
Data Integrity
Challenge: The scraped data contained noise, including navigation menus and redundant content.
Solution: Custom cleaning functions were developed to filter unwanted data before embedding generation.
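The custom cleaning functions themselves are not shown in this report. One common way to remove navigation menus and other repeated content is to drop any line that appears on a large share of the scraped pages. The sketch below illustrates that idea. The function name strip_repeated_lines and the 50% threshold are assumptions, not the project's actual implementation.

```python
from collections import Counter


def strip_repeated_lines(pages: dict[str, str], max_share: float = 0.5) -> dict[str, str]:
    """Remove lines that occur on more than `max_share` of the pages.

    `pages` maps a URL to the plain text extracted from that page.
    """
    # Count on how many pages each distinct line occurs (once per page).
    line_counts = Counter()
    for text in pages.values():
        unique_lines = {line.strip() for line in text.splitlines() if line.strip()}
        line_counts.update(unique_lines)

    # Never drop a line seen on only one page, even for tiny crawls.
    limit = max(1.0, max_share * len(pages))

    cleaned = {}
    for url, text in pages.items():
        kept = [
            line.strip()
            for line in text.splitlines()
            if line.strip() and line_counts[line.strip()] <= limit
        ]
        cleaned[url] = "\n".join(kept)
    return cleaned
```

A frequency filter like this needs no knowledge of the site's HTML structure, but it can also remove legitimate text that recurs across pages, so the threshold would need checking against real data.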
Embedding Model Performance
Challenge: Large datasets had to be embedded with limited computational resources.
Solution: Optimized batch processing was implemented to manage large-scale embeddings efficiently.
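A minimal sketch of batch processing for embeddings follows. It sends chunks to the embedding function in fixed-size groups so that only one batch is in flight at a time. The helper names, the default batch size of 32, and the embed_batch_fn parameter (a hypothetical wrapper around the embedding model) are assumptions, and the report does not state which optimizations were actually applied.

```python
from typing import Callable, Iterator, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    chunks: Sequence[str],
    embed_batch_fn: Callable[[Sequence[str]], list[list[float]]],
    batch_size: int = 32,
) -> list[list[float]]:
    """Embed all chunks, one batch at a time, preserving input order."""
    vectors: list[list[float]] = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_batch_fn(batch))
    return vectors
```

Writing each batch to storage as soon as it is embedded, rather than collecting all vectors first, would lower peak memory further and allow an interrupted run to resume.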
API Rate Limiting
Challenge: Frequent Groq API calls led to rate limits.
Solution: Caching was implemented for repeated queries to minimize API usage.
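The caching mechanism is not detailed in this report. The sketch below shows one simple form it could take: an exact-match cache keyed on the normalized query text, with least-recently-used eviction. The class name QueryCache, the normalization rule, and the generate_fn parameter (a hypothetical function that performs retrieval and calls the language model) are assumptions.

```python
from collections import OrderedDict
from typing import Callable


class QueryCache:
    """Caches generated answers so repeated queries skip the model API call."""

    def __init__(self, generate_fn: Callable[[str], str], max_entries: int = 256):
        self._generate_fn = generate_fn
        self._max_entries = max_entries
        self._entries: OrderedDict[str, str] = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def normalize(query: str) -> str:
        """Lowercase and collapse whitespace so trivial variants share a key."""
        return " ".join(query.lower().split())

    def answer(self, query: str) -> str:
        key = self.normalize(query)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        result = self._generate_fn(query)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return result
```

An exact-match cache only helps when students type the same question. Paraphrased queries still reach the API, and cached answers would need to be invalidated whenever the underlying website data is re-scraped.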
Conclusion and Future Work
The CPS AI Assistant effectively streamlines student support services, reducing advisor workload while improving query response efficiency. Future developments will:
Incorporate multilingual support for international students.
Expand data coverage to include co-op deadlines, events, and faculty profiles.
Introduce a chatbot with memory for personalized conversations.