Interview Status Chatbot: A RAG-Powered HR Assistant for Job Application Tracking

Add a heading (1).jpg

Abstract

The Interview Status Chatbot is a Retrieval Augmented Generation (RAG)-powered HR assistant designed to streamline job application tracking for both applicants and organizations. Built using Python, Pinecone, and the Groq LLM API, the chatbot retrieves real-time applicant data from a CSV file and provides personalized updates on application status, interview scheduling, and logistics. It features a user-friendly interface built with Chainlit, allowing applicants to simply enter their name and the position they applied for to receive structured, human-like responses. The project demonstrates how modern AI techniques like RAG can be effectively applied to automate HR processes, improve communication transparency, and enhance the candidate experience—all while ensuring data privacy and scalability.

Introduction

Interview Status Chatbot is a professional HR assistant chatbot designed to streamline the job application experience by allowing applicants to check their status through a simple natural language interface. The chatbot provides real-time updates on application status, interview schedules, and locations — eliminating the need for manual communication from HR teams.

This solution is built using the Retrieval-Augmented Generation (RAG) paradigm and demonstrates the practical application of LLMs in human resources workflows. It empowers both organizations and applicants by automating status retrieval while ensuring personalized, privacy-conscious responses.

Methodology

Interview Status Chatbot leverages a Retrieval-Augmented Generation (RAG) architecture combining Pinecone for vector search, Groq’s LLM API for natural language generation, and Chainlit for an interactive web interface. Applicant data is embedded and stored as vectors, enabling real-time semantic search and personalized HR-style responses without compromising privacy.

Architecture

The chatbot is built using a Retrieval-Augmented Generation (RAG) pipeline, which combines semantic search with generative AI to retrieve accurate applicant data and respond with professional HR messages.

Workflow Overview

1. Data Ingestion

Applicant data is sourced from a CSV file.

2. Embedding & Indexing

Each applicant entry is converted into vectors using sentence-transformers/all-MiniLM-L6-v2 from Hugging Face and stored in a Pinecone index.

3. User Query Handling

When a user submits a prompt (e.g., "My name is Abdullah and I applied for the AI Engineer position"), the query is embedded and matched against the index.

4. Retrieval & Response Generation

The closest match is retrieved from Pinecone.
Context is passed to the Groq LLM to generate a personalized response.

5. User Interface

The response is displayed to the user through a Chainlit web interface running locally.

Diagram

ChatGPT Image Jun 9, 2025, 05_00_19 PM.png

Tools & Technologies

Tool / Library / Technology	Purpose
Python	Core programming language
Chainlit	Web UI for LLM-powered chat interfaces
Pinecone	Vector database for storing and querying embeddings
Groq Cloud	High-speed LLM inference for response generation using `llama-3.3-70b-versatile`
dontenv	Securely load API keys from environment variables
pandas	Load and preprocess tabular data
Hugging Face	Generate text embeddings for semantic search using `sentence-transformers/all-MiniLM-L6-v2`
LangChain	Orchestrates RAG pipeline components

Experiments

Test Setup

The project was tested using a sample dataset containing 200 above rows of applicant information sourced from a publicly shared Google Sheet. The test prompts followed the format:

"My name is [Name] and I have applied for [Position]"

The system was evaluated through its web interface built on Chainlit, with Groq Cloud handling response generation and Pinecone serving as the vector database for embedding retrieval.

Test Cases & Results

To evaluate the model’s robustness and accuracy, a diverse set of representative prompts was used. These included both correct and incorrect name-role pairings, as well as edge cases to test the model’s tolerance to input noise and variation.

Valid Inputs

Prompt: My name is Abdullah and I have applied for AI Engineer position.

Expected Outcome: Matched — The name and role pair is valid and present in the dataset.

Actual Result:
Screenshot from 2025-06-09 17-19-34.png

Invalid Inputs

Prompt: My name is Bilal Shah and I have applied for Software Engineer position.

Expected Outcome: Mismatched — Although "Bilal Shah" exists, he applied for the Data Scientist role, not Software Engineer.

Actual Result:
Screenshot from 2025-06-09 17-20-28.png

Prompt: My name is Sarah and I applied for Content Writer role.

Expected Outcome: Not Found — "Sarah" does not exist in the dataset.

Actual Result:
Screenshot from 2025-06-09 17-21-20.png

Edge Cases

Typo Handling

Prompts such as “My name is Abudllah and I applied for AI Engineer position.” were tested to evaluate resilience against minor spelling errors.

Case Insensitivity

Inputs with varying capitalization like “my name is abdullah…” or “MY NAME IS ABDULLAH…” were also tested to ensure consistent behavior.

These test cases helped verify that the system accurately retrieves valid records while rejecting incorrect or unseen combinations. By incorporating both standard and noisy inputs, we ensured the model performs reliably under realistic user behavior.

Evaluation Criteria

The chatbot’s performance was assessed based on the following:

Relevance: Ability to accurately retrieve the correct applicant information corresponding to the input query.
Fluency: Quality and professionalism of the generated response from the Groq model.
Speed: The response time observed during query processing, highlighting the efficiency of Groq Cloud and Pinecone integration.

Results

The system’s performance was evaluated based on the following key metrics

Accuracy

The chatbot successfully retrieved correct applicant details for the valid query that was “Abdullah – AI Engineer”. Invalid queries like the one for “Sarah” and "Bilal Shah" resulted in appropriate responses indicating no matching records found, demonstrating effective handling of unknown applicants.

Response Quality

Generated responses were fluent, professional, and personalized, clearly providing interview status, dates, and locations when available. The use of Groq Cloud’s LLM ensured that answers were contextually relevant and human-like.

Robustness

The system demonstrated resilience against input variations, correctly interpreting prompts with minor typos and differences in casing without significant loss in accuracy.

Performance

Average response time per query was under 15 seconds, showing fast retrieval and generation through Pinecone embeddings and Groq Cloud API.

Conclusion

The Interview Status Chatbot showcases the practical application of Retrieval-Augmented Generation (RAG) for automating HR communication in a personalized, efficient, and scalable manner. By integrating Groq Cloud's LLM capabilities with Pinecone for semantic search and Chainlit for user interaction, the system delivers accurate, context-aware responses to job applicants.

The project validates how small-scale RAG systems can be rapidly deployed using structured data sources like CSV files, enabling businesses to improve candidate experience without heavy infrastructure. Future enhancements will include real-time data updates on every query.