
The volume of academic literature is growing rapidly, making it difficult for researchers to stay current. The Academia Analyzer Agent addresses this problem with an automated, multi-agent system that ingests raw research documents (PDF or plain text) and synthesizes them into structured executive summaries.
This project serves as the Module 2 MVP deliverable, demonstrating mastery over Multi-Agent Orchestration and Tool Integration.
Core Objectives:
The system utilizes a sequential pipeline where three specialized agents collaborate to process the document.
| Agent | Role & Responsibility | Key Tool Integration |
|---|---|---|
| 1. DocumentIngestorAgent | Ingestion & Indexing. Reads local PDF/Text files, chunks the content, and builds the vector search index. | PDF Reader Tool (pypdf, TextLoader) & RAG Retriever (FAISS). |
| 2. ThesisExtractorAgent | Data Extraction. Identifies the paper's core hypothesis and technical keywords using NLP. | Keyword Extractor Tool (nltk). |
| 3. InsightSynthesizerAgent | Final Synthesis. Generates the Executive Summary and Key Findings using the RAG context. | LLM API (OpenRouter) & Pydantic Parser (Structured Output). |
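The hand-off between the three agents can be sketched as below. This is a minimal, dependency-free illustration rather than the repository's code. The class names come from the table, but every method name, the `InsightSummary` fields, and the stubbed logic are assumptions: paragraph splitting stands in for PDF loading and FAISS indexing, word-frequency counting stands in for nltk, string slicing stands in for the OpenRouter LLM call, and a dataclass stands in for the Pydantic model.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass
class InsightSummary:
    # Hypothetical fields; the project's real Pydantic model may differ.
    executive_summary: str
    key_findings: list[str]
    keywords: list[str]


class DocumentIngestorAgent:
    def run(self, text: str) -> list[str]:
        # Stand-in for PDF loading, chunking and indexing:
        # treat each non-empty paragraph as one chunk.
        return [p.strip() for p in text.split("\n\n") if p.strip()]


class ThesisExtractorAgent:
    def run(self, chunks: list[str]) -> list[str]:
        # Stand-in for nltk keyword extraction: the most frequent long words.
        words = [w.strip(".,;:()").lower() for c in chunks for w in c.split()]
        counts = Counter(w for w in words if len(w) > 6)
        return [w for w, _ in counts.most_common(3)]


class InsightSynthesizerAgent:
    def run(self, chunks: list[str], keywords: list[str]) -> InsightSummary:
        # Stand-in for the LLM call: the first chunk becomes the summary and
        # the first sentence of each chunk becomes a finding.
        return InsightSummary(
            executive_summary=chunks[0],
            key_findings=[c.split(". ")[0].rstrip(".") for c in chunks],
            keywords=keywords,
        )


def analyze(text: str) -> str:
    """Run the three agents in sequence and return the summary as JSON."""
    chunks = DocumentIngestorAgent().run(text)
    if not chunks:
        raise ValueError("document contains no text")
    keywords = ThesisExtractorAgent().run(chunks)
    summary = InsightSynthesizerAgent().run(chunks, keywords)
    return json.dumps(asdict(summary), indent=2)
```

Each agent only consumes the output of the previous one, so a real implementation could replace any stage (for example, a FAISS-backed retriever in the ingestor) without touching the other two.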
InsightSummary JSON object.

To ensure high technical quality, the system employs a rigorous methodology for document processing and output generation.
The RAG index is configured to handle complex academic language:
| Setting | Value | Rationale |
|---|---|---|
| Text Chunk Size | 1500 tokens | Larger chunk size chosen to capture full academic arguments and methodology sections without breaking context. |
| Text Chunk Overlap | 250 tokens | Ensures continuity between pages and paragraphs. |
| Embedding Model | HuggingFaceEmbeddings(all-MiniLM-L6-v2) | Fast, efficient local embedding model suitable for scientific text. |
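The effect of the chunk size and overlap settings can be illustrated with a small sliding-window splitter. This is a sketch under stated assumptions, not the project's splitter: the function name `chunk_tokens` is hypothetical, and "tokens" here are simply list items (for example, whitespace-separated words), which only approximates how a real tokenizer or library text splitter counts length.

```python
def chunk_tokens(
    tokens: list[str], chunk_size: int = 1500, overlap: int = 250
) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the input,
        # so no trailing chunk is made up entirely of overlap.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With the defaults above, a 4000-token document yields three chunks starting at tokens 0, 1250, and 2500, and the last 250 tokens of each chunk reappear as the first 250 tokens of the next.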
The complete, working code is available in the linked GitHub repository.
Code Repository: https://github.com/maddiravi/academia-analyzer-agent
The system is delivered via a Streamlit web application (app.py), allowing users to upload research papers through a drag-and-drop interface and view the analysis in real time.
git clone https://github.com/maddiravi/academia-analyzer-agent
cd academia-analyzer-agent
pip install -r requirements.txt
Create a .env file with your OPENROUTER_API_KEY.
python -m streamlit run app.py



Future work will focus on:
To ensure the system is safe for deployment, we implemented multiple layers of protection:
- File uploads are restricted by extension (.pdf, .txt only) to prevent processing malicious file types.
- The safety.py tool strips potential script injection patterns (e.g., HTML tags) from the AI's output before it is rendered on the frontend.
- API keys are loaded from .env files and are never hardcoded. A pre-flight HealthCheck ensures the API connection is secure and valid before any user data is processed.

BY RAVI KANTH MADDI
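The three safeguards described in the safety section can be sketched with the standard library alone. This is an illustrative approximation, not the contents of safety.py or the HealthCheck: all function names are hypothetical, the tag pattern is a simple assumption about what "script injection patterns" covers, and the key check only confirms that OPENROUTER_API_KEY is present, whereas the real HealthCheck presumably also contacts the API.

```python
import html
import os
import re
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf", ".txt"}
TAG_PATTERN = re.compile(r"<[^>]*>")  # anything shaped like an HTML tag


def is_allowed_upload(filename: str) -> bool:
    # Only the final extension counts, so "paper.pdf.exe" is rejected.
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def sanitize_output(text: str) -> str:
    # Drop tag-like spans, then escape remaining special characters so the
    # result is rendered as text rather than interpreted as markup.
    return html.escape(TAG_PATTERN.sub("", text))


def require_api_key(env=os.environ) -> str:
    # Fail fast before any user data is processed if the key is missing.
    key = env.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return key
```

Note that stripping tags removes the markup but leaves the text between the tags in place, which is why the escape step is applied afterwards.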