HomePublicationsCertificationsCompetitionsContributors
Start publication
HomePublicationsCertificationsCompetitionsContributors

Table of contents

Code

Datasets

Files

AboutDocsPrivacyCopyrightContactSupport
© Ready Tensor, Inc.
Back to publications
Jun 13, 2025●19 reads●MIT License
CertifiedCertifiedunder the RAG-Based AI Assistant module in the Agentic AI Essentials program.

AskImmigration: Navigate U.S. Immigration with an AI Assistant

  • AAIDC2025
  • AI
  • Chroma
  • Firestore
  • Groq
  • Immigration
  • LangChain
  • Python

Table of contents

RAG
  • dunkygeoffrey39
    Geoffrey Duncan Opiyo
  • d
    Deo Mugabe
  • o
    @okumujustine01
  • arinda.hillary
    Hillary Arinda

Authors: Geoffrey Duncan Opiyo, Hillary Arinda, Justine Okumu, Deo Mugabe


AskImmigrate-Chat-1.png


AI-Powered Guide for Your U.S. Legal Immigration Journey

Introduction

U.S. immigration information is dense, scattered across lengthy PDFs, policy manuals, and form instructions that change over time. Applicants, families, workers, and students must translate legal jargon into actionable steps—often without reliable, up-to-date guidance.

👥 Target Audience

  • Who: Immigrants, attorneys, advocates, paralegals, and the general public.
  • Scale: In 2023, the United States was home to 47.8 million immigrants, and roughly three out of four held lawful status (naturalized citizens, green-card holders, or temporary visa holders). This is a massive population making continual status, compliance, and benefit decisions.
  • Needs: Guidance on filing, eligibility, and understanding USCIS procedures. People don’t want another static FAQ; they need precise, document-grounded answers surfaced instantly, with clear wording and traceable source context.
  • Use Cases:
    • "What is the fee for I-130?"
    • "Can I travel while my I-485 is pending?"

Solution

AskImmigration delivers retrieval-augmented, citation-backed responses by extracting from vetted official PDFs and structured form data—keeping answers focused, current, and explainable.


🚀 GitHub Repository ↗

Methodology

1. Ingestion
Source PDFs and JSON files are parsed and split into semantically coherent text chunks (size tuned for LLM context efficiency and retrieval granularity).

2. Embedding Generation
Each chunk is converted to a dense vector using a HuggingFace MiniLM embedding model (dimension = 768).

3. Indexing
Resulting embeddings (with associated metadata: source doc id, chunk id, offsets) are stored in a Chroma vector database for approximate nearest neighbor (ANN) similarity search.

4. Prompt Construction
At query time, a LangChain pipeline assembles:

  • System prompt: domain / task instructions and guardrails.
  • User prompt: the raw user question plus selectively injected retrieved context chunks (citation ordering preserved).

5. Query and Retrieval-Augmented Generation
A semantic search (k-nearest vectors) retrieves top candidate chunks. Retrieved context + prompts are passed to a Groq-hosted LLM for answer synthesis. Post-processing enforces format (e.g., citations, concise style).

Performance

StageMetric / ThroughputNotes
Ingestion~100 documents/minuteMeasured on moderate commodity hardware during bulk load.
Embedding Dimension768 floats per chunkMiniLM vector size.
Query Latency<500 ms per lookup + responseOn a GPU-enabled machine (retrieval + LLM call).

For full setup and examples, see our README.


System Overview

AskImmigrate-3.jpeg

  • Chat Interface: A simple Python CLI or web form.
  • Prompt Builder: Merges role, tone, and rules into one prompt.
  • Vector Store: Chroma holds all document embeddings.
  • AI Engine: LangChain passes your question and context to Groq.
  • Logging: All questions and answers go into Firestore.

Key Features

✅ Document-grounded answers
Pulls directly from well curated PDFs and JSON files from trusted public sources, so every response is backed by real, cited sources.

✅ Safety rules
Built-in guardrails keep the assistant on topic and prevent it from sharing unsafe or off-scope information.

✅ Multi-turn chat
Remembers your previous questions and answers, letting you follow up without losing context.

✅ Configurable prompts
Adjust the assistant’s role, tone, and boundaries in a simple YAML file—no code changes needed.

✅ Full audit trail
Saves every question and answer in Firestore for easy review and compliance tracking.


Tech Stack

PartTech
Frontend/CLIReact/Python
Prompt ConfigYAML
Vector SearchChroma
EmbeddingsHugging Face Sentence Transformers
LLM EngineLangChain + Groq
Data StorageFirestore
Utilitiescli.py, load_data.py

Why AskImmigration

  • 🧭 AskImmigration helps you navigate the U.S. immigration process with clear, direct answers — no legal jargon or confusion.

  • 📄 It uses official, up-to-date data from USCIS forms and government policies to ensure accuracy and reliability.

  • Whether you're applying for a visa, adjusting your status, or planning for citizenship, AskImmigration is here to support you every step of the way.


Potential Feature Enhancements

  • 🌐 Multilingual Support

    Allow users to interact in multiple languages (e.g. Spanish, Mandarin, Arabic) to make the tool accessible to a broader audience.

  • ✍️ Form Assistant

    Help users fill out common USCIS forms by guiding them section-by-section with plain-language explanations and example answers.

  • 📄 Document Uploader

    Let users upload their USCIS notices or forms. The assistant could analyze them and provide insights or next steps based on the content.


How to Run

Installation

Clone the repository:

git clone https://github.com/okumujustine/AskImmigrate.git cd AskImmigrate

Install dependencies:

uv pip install -r requirements.txt

Environment Setup

Create a .env file in the project root and add your Groq key:

GROQ_API_KEY=your-groq-api-key

Ensure JSON and PDF source files are accessible on disk.

Usage

CLI Interface

Ingest documents and JSON:

python embed_documents.py

Launch the terminal chat with a question:

python cli.py --question "What is the F1 visa?"

List all previous chat sessions:

python cli.py --list_sessions

Continue a past session (replace <session_id> with an ID from the list):

python cli.py --session_id <session_id> --question "Next question text"

React Application

Run the back-end server:

uvicorn app.api:app --reload --port 9000

Navigate to the frontend directory:

cd frontend

Install frontend dependencies:

npm install

Start the development server:

npm run dev

Open your browser at:

http://localhost:5173

to chat with AskImmigrate in the web UI.


📂 License

🔐 License: This project is licensed under the MIT License


🧠 Conclusion

AskImmigration transforms a process that often feels overwhelming into one that's fast, clear, and empowering. It takes dense, scattered immigration texts and delivers grounded answers you can understand and act on immediately.

You get clarity instead of jargon, sources instead of speculation, and instant access instead of endless searching—so you can stay focused on your path, not the paperwork.

This assistant doesn’t replace legal counsel, but it prepares you to ask sharper questions, catch issues early, and move forward with greater confidence.

As the platform grows—with multilingual support, guided forms, smarter updates, and stronger evaluation—it will stay true to its mission: reduce friction, lower anxiety, and raise trust in every immigration step.

Ask clearly. Understand instantly. Decide with confidence.

Table of contents

Your publication could be next!

Join us today and publish for free

Sign Up for free!

Table of contents

Code

  • AskImmigrate

Code

  • AskImmigrate