Hiring processes often involve reviewing hundreds of resumes against complex job descriptions.
This manual process is time-consuming, error-prone, and subjective.

**Stakeholders:** HR managers, recruiters, job candidates, and organizations scaling their workforce.

**Impact:** On average, recruiters spend 6–7 seconds scanning a resume. Scaling this to thousands of applicants results in inefficiency and potential loss of talent. Existing Applicant Tracking Systems (ATS) rely heavily on keyword matching, which fails to capture semantic meaning and contextual fit.
This project introduces an AI-powered assistant that leverages RAG pipelines to match resumes and job descriptions semantically, providing ranked candidate recommendations.
Our solution addresses these gaps by combining FAISS vector stores, sentence-transformer embeddings, and LLM reasoning to provide semantic candidate ranking.
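To illustrate the semantic-ranking idea in miniature (this is a conceptual sketch, not the repo's actual pipeline — the toy vectors stand in for sentence-transformer embeddings, and FAISS is replaced by a plain cosine-similarity loop):

```python
import numpy as np

def rank_by_similarity(jd_vec, resume_vecs, names):
    """Rank candidates by cosine similarity to a job-description vector."""
    jd = jd_vec / np.linalg.norm(jd_vec)
    scores = []
    for name, vec in zip(names, resume_vecs):
        v = vec / np.linalg.norm(vec)
        scores.append((name, float(jd @ v)))  # cosine similarity
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy 3-d "embeddings" standing in for real model output.
jd = np.array([1.0, 0.2, 0.0])
resumes = [np.array([0.9, 0.1, 0.1]),   # close match
           np.array([0.0, 1.0, 0.0])]   # poor match
ranking = rank_by_similarity(jd, resumes, ["alice", "bob"])
```

In the real pipeline, the FAISS index plays the role of this loop at scale, and the ranked chunks are passed to the LLM for reasoning over candidate fit.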
Features:

- REST API with `/ingest`, `/ingest_resumes`, and `/ask` routes
- Embeddings via `sentence-transformers` or Gemini
- Configuration through a `.env` file
- Swagger UI (`/docs`) and ReDoc (`/redoc`) support

Installation:

```bash
git clone <your-repo-url>
cd readytensor_rag_starter
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
Copy `.env.example` to `.env` and set values:

```bash
cp .env.example .env
```
Key variables:

- `EMBEDDING_PROVIDER` — "local" or "gemini"
- `LLM_PROVIDER` — "gemini" (or local/testing)
- `EMBEDDING_MODEL` — e.g., `sentence-transformers/all-MiniLM-L6-v2`
- `MODEL_NAME` — e.g., `gemini-2.5-flash-lite`
- `GEMINI_API_KEY` — if using Gemini
- `CHUNK_SIZE`, `CHUNK_OVERLAP`, `TOP_K`, `MAX_TOKENS`, `TEMPERATURE`
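One way such variables might be read at startup is sketched below (variable names match the list above; the defaults shown here are assumptions, not the repo's actual values):

```python
import os

def load_settings(env=os.environ):
    """Read RAG settings from environment variables with fallback defaults."""
    return {
        "embedding_provider": env.get("EMBEDDING_PROVIDER", "local"),
        "llm_provider": env.get("LLM_PROVIDER", "gemini"),
        "embedding_model": env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
        "model_name": env.get("MODEL_NAME", "gemini-2.5-flash-lite"),
        # Numeric settings arrive as strings and must be converted.
        "chunk_size": int(env.get("CHUNK_SIZE", "800")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "100")),
        "top_k": int(env.get("TOP_K", "4")),
    }

# Passing a dict lets you test overrides without touching the real environment.
settings = load_settings({"CHUNK_SIZE": "500"})
```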
Bulk ingestion:

```bash
python -m rag.ingest --source data/resumes_and_jd
```

Single file ingestion:

```bash
curl -X POST "http://localhost:8000/ingest" \
  -H "Content-Type: application/json" \
  -d '{"file_path":"data/resumes_and_jd/jake.pdf"}'
```
This will build/persist the FAISS index and metadata for semantic search.
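Before embedding, ingestion splits each document into overlapping chunks governed by `CHUNK_SIZE` and `CHUNK_OVERLAP`. A simplified character-based chunker (the repo's actual splitter may differ, e.g. it may split on tokens or sentences) looks like:

```python
def chunk_text(text, chunk_size=800, chunk_overlap=100):
    """Split text into overlapping character windows for embedding."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    step = chunk_size - chunk_overlap  # advance leaves `chunk_overlap` shared chars
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

parts = chunk_text("a" * 2000, chunk_size=800, chunk_overlap=100)
```

The overlap keeps sentences that straddle a boundary retrievable from at least one chunk.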
```bash
uvicorn app.main:app --reload
```
Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc
```bash
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question":"Rank candidates for the MetaHuman Specialist role","top_k":4}'
```
```
readytensor_rag_starter/
├── app/
│   └── main.py          # FastAPI app
├── rag/
│   ├── ingest.py        # Resume/job ingestion
│   ├── pipeline.py      # RAG pipeline logic
│   ├── prompt.py        # Prompts for candidate evaluation
│   └── utils.py         # Helpers (PDF parsing, embeddings, vector store)
├── data/
│   └── resumes_and_jd/  # PDFs, job descriptions, candidates.csv
├── storage/             # generated FAISS index + metadata (runtime)
├── .env_example
├── .gitignore
├── requirements.txt
└── README.md
```
| Method | Endpoint | Description |
|---|---|---|
| POST | `/ask` | Ask a query or rank candidates |
| POST | `/ingest` | Ingest a single resume or job description |
| POST | `/ingest_resumes` | Bulk ingest all resumes |
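The endpoints can also be called from Python. A minimal stdlib sketch mirroring the `/ask` curl example (URL and payload fields follow the examples in this README; nothing here is repo code):

```python
import json
from urllib import request

def build_ask_request(question, top_k=4, base_url="http://localhost:8000"):
    """Build a POST request for the /ask endpoint."""
    payload = json.dumps({"question": question, "top_k": top_k}).encode()
    return request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_ask_request("Rank candidates for the MetaHuman Specialist role")
# Send with request.urlopen(req) while the uvicorn server is running.
```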
Quick usage:

1. Place resumes and job descriptions in `data/resumes_and_jd/`
2. Run `python -m rag.ingest --source data/resumes_and_jd` before querying
3. Start the server: `uvicorn app.main:app --reload`
4. Call `/ask` to evaluate or rank candidates

Performance notes:

- Tested locally with ~20 candidate resumes and 1 job description.
- Indexing: ~2–3s per document on CPU.
- Query latency: ~1–2s with local embeddings.
- Resource needs: ~2GB RAM minimum; GPU recommended for large datasets.
- Scalability: FAISS supports millions of vectors, but not yet benchmarked for >10k resumes in this repo.
- Current Version: v0.1.0 (experimental).
- Updates: Manual, planned as new features are added.
- Support: Open an issue on GitHub Issues.
- Maintainer: Razwi M K
- GitHub: @razwi1
- Email: razwimkofficial@gmail.com
```bash
make install   # pip install -r requirements.txt
make ingest    # bulk ingest resumes/job descriptions
make dev       # start uvicorn server
make test      # run tests
make format    # black + isort formatting
```
Secrets live in `.env` (never commit it); a `.env_example` is provided as a template. Dependencies are pinned in `requirements.txt` for reproducibility.

License: MIT