
Hiring processes often require reviewing hundreds of resumes against complex job descriptions, which makes the process both time-consuming and subjective. Recruiters spend an average of 6–7 seconds scanning each resume, and at scale, this leads to inefficiency and potential loss of top talent.
Traditional Applicant Tracking Systems (ATS) rely heavily on keyword matching, missing the deeper semantic meaning in resumes and job descriptions. As a result, qualified candidates are overlooked, bias can creep into decisions, and the workload for HR teams continues to grow.
This project introduces an AI-powered hiring assistant that leverages RAG pipelines to semantically match resumes and job descriptions. It helps recruiters quickly identify the most relevant candidates, reduce human bias, and make the hiring process more transparent and data-driven.
Stakeholders: HR managers, recruiters, job candidates, and organizations scaling their workforce.
Existing HR technology tools, such as Applicant Tracking Systems (ATS) and LinkedIn Recruiter, still rely largely on keyword or rule-based filtering. These systems fail to understand the true semantic meaning of text and struggle with niche or domain-specific roles like AR/VR developers or MetaHuman specialists.
Our approach bridges this gap by combining FAISS vector stores, sentence-transformer embeddings, and LLM reasoning within a transparent RAG pipeline. This allows for a deeper, more accurate understanding of resumes and job descriptions while maintaining reproducibility and explainability in the evaluation process.
The system provides a fully operational FastAPI backend exposing endpoints for ingestion and querying. It uses a transparent RAG pipeline — breaking resumes into chunks, embedding them into vector space, indexing with FAISS, retrieving relevant data, and generating responses with an LLM.
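The chunk → embed → index → retrieve flow described above can be sketched end to end. This is a toy stand-in only: a bag-of-words counter replaces the sentence-transformer embeddings, brute-force cosine search replaces FAISS, and all function names are illustrative rather than the project's actual API.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows (stand-in for the real chunker)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for sentence-transformers."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Brute-force nearest-neighbour search standing in for FAISS."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

# Index a resume, then retrieve the chunk most relevant to a query.
resume = "Senior Unreal Engine developer. Shipped MetaHuman pipelines. Python and C++."
chunks = chunk(resume, size=40, overlap=10)
hits = retrieve("MetaHuman experience", chunks, top_k=1)
```

In the real pipeline the retrieved chunks are then passed to the LLM as context for the final answer.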
It supports both local and cloud embedding models (e.g., sentence-transformers or Gemini), persistent FAISS indexes, and bulk data ingestion for resumes and job descriptions. A clean API interface is provided through /ingest, /ingest_resumes, and /ask routes, with documentation available in Swagger UI (/docs) and ReDoc (/redoc).
```bash
git clone <your-repo-url>
cd readytensor_rag_starter
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
Copy .env.example to .env and set values:
```bash
cp .env.example .env
```
Key variables:
- `EMBEDDING_PROVIDER` — `"local"` or `"gemini"`
- `LLM_PROVIDER` — `"gemini"` (or local/testing)
- `EMBEDDING_MODEL` — e.g., `sentence-transformers/all-MiniLM-L6-v2`
- `MODEL_NAME` — e.g., `gemini-2.5-flash-lite`
- `GEMINI_API_KEY` — if using Gemini
- `CHUNK_SIZE`, `CHUNK_OVERLAP`, `TOP_K`, `MAX_TOKENS`, `TEMPERATURE`

Bulk ingestion:

```bash
python -m rag.ingest --source data/resumes_and_jd
```
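The bulk-ingestion command iterates over every document in the source directory. A stdlib-only approximation of that loop is shown below; the `bulk_ingest` name and the accepted-extension set are assumptions for illustration, not the project's actual code.

```python
import tempfile
from pathlib import Path

ACCEPTED = {".pdf", ".txt", ".csv"}  # assumed extension set, not the project's actual list

def bulk_ingest(source: str, ingest_file) -> int:
    """Scan a directory and hand each accepted document to a per-file
    ingest callback (which would chunk, embed, and index it).
    Returns the number of files handed off."""
    count = 0
    for path in sorted(Path(source).glob("*")):
        if path.suffix.lower() in ACCEPTED:
            ingest_file(path)
            count += 1
    return count

# Demo on a throwaway directory standing in for data/resumes_and_jd/
demo = Path(tempfile.mkdtemp())
for name in ("jake.pdf", "jd.txt", "notes.md"):
    (demo / name).touch()
ingested = []
n = bulk_ingest(str(demo), ingested.append)
```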
Single file ingestion:
```bash
curl -X POST "http://localhost:8000/ingest" \
  -H "Content-Type: application/json" \
  -d '{"file_path":"data/resumes_and_jd/jake.pdf"}'
```
This will build/persist the FAISS index and metadata for semantic search.
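What "persist" means here can be illustrated with a stdlib-only sketch: the project stores a FAISS index plus metadata under `storage/`, and below a JSON file of chunk records stands in for the metadata side. The `save_store`/`load_store` names are hypothetical, not the project's actual helpers.

```python
import json
import tempfile
from pathlib import Path

def save_store(records: list[dict], path: Path) -> None:
    """Persist chunk metadata (text, source, chunk id) as JSON; the real
    pipeline writes a binary FAISS index alongside a file like this."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2))

def load_store(path: Path) -> list[dict]:
    """Reload persisted metadata so queries survive a restart."""
    return json.loads(path.read_text())

# Round-trip a record through a throwaway stand-in for storage/
records = [{"chunk_id": 0, "source": "jake.pdf", "text": "Unreal Engine developer"}]
store = Path(tempfile.mkdtemp()) / "storage" / "metadata.json"
save_store(records, store)
restored = load_store(store)
```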
```bash
uvicorn app.main:app --reload
```
Swagger UI: http://localhost:8000/docs

Example query:

```bash
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question":"Rank candidates for the MetaHuman Specialist role","top_k":4}'
```
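The same call can be made from Python without extra dependencies. This is an illustrative client sketch: the helper names are made up, and the shape of the JSON response depends on the running server.

```python
import json
import urllib.request

def build_ask_payload(question: str, top_k: int = 4) -> dict:
    """Mirror the JSON body the /ask endpoint expects."""
    return {"question": question, "top_k": top_k}

def ask(question: str, top_k: int = 4, base_url: str = "http://localhost:8000") -> dict:
    """POST a question to /ask and return the parsed JSON response."""
    data = json.dumps(build_ask_payload(question, top_k)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/ask",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires the server to be running
        return json.loads(resp.read().decode("utf-8"))
```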
```
readytensor_rag_starter/
├── app/
│   └── main.py            # FastAPI app
├── rag/
│   ├── ingest.py          # Resume/job ingestion
│   ├── pipeline.py        # RAG pipeline logic
│   ├── prompt.py          # Prompts for candidate evaluation
│   └── utils.py           # Helpers (PDF parsing, embeddings, vector store)
├── data/
│   └── resumes_and_jd/    # PDFs, job descriptions, candidates.csv
├── storage/               # Generated FAISS index + metadata (runtime)
├── .env.example
├── .gitignore
├── requirements.txt
└── README.md
```
| Method | Endpoint | Description |
|---|---|---|
| POST | /ask | Ask a query or rank candidates |
| POST | /ingest | Ingest a single resume or job description |
| POST | /ingest_resumes | Bulk ingest all resumes |
- Place resumes and job descriptions in `data/resumes_and_jd/`
- Run `python -m rag.ingest --source data/resumes_and_jd` before querying
- Start the API with `uvicorn app.main:app --reload`
- Use `/ask` to evaluate or rank candidates

This experimental release focuses on functionality and transparency. The system does not yet include explicit bias-mitigation mechanisms, relying instead on the neutrality of the underlying embeddings and language model.
Performance depends on the chosen embedding provider — local models like sentence-transformers offer flexibility and privacy, while Gemini-based embeddings may yield richer results. The project currently operates as an API-only backend without a recruiter-facing dashboard, and large-scale benchmarking beyond 10,000 resumes remains a future goal.
Initial testing with around twenty resumes and one job description showed indexing speeds of 2–3 seconds per document on a CPU, with query latency averaging 1–2 seconds using local embeddings. The system requires about 2GB of RAM for small-scale experiments, while GPU acceleration is recommended for larger workloads.
FAISS can scale to millions of vectors, making this design suitable for enterprise-level recruitment data once full benchmarks and optimizations are performed.
Version v0.1.0 is experimental and updated manually as new features are added. The system is designed for clarity and reproducibility — all secrets are stored in .env, dependency versions are pinned, and FAISS indexing is deterministic given identical inputs.
For questions, improvements, or bug reports, users are encouraged to open issues on GitHub.
- Maintainer: Razwi M K
- GitHub: @razwi1
- Email: razwimkofficial@gmail.com
```bash
make install   # pip install -r requirements.txt
make ingest    # bulk ingest resumes/job descriptions
make dev       # start uvicorn server
make test      # run tests
make format    # black + isort formatting
```
- Secrets live in `.env` (never commit it); a `.env.example` template is provided.
- Dependency versions are pinned in `requirements.txt` for reproducibility.
- MIT License.