
This project presents an AI-powered text summarization tool, LLaMA Text Summarizer, which leverages the LLaMA 2 large language model via the Ollama framework. The system provides a simple and efficient way to summarize long-form text inputs entirely offline. It combines a responsive Streamlit-based frontend with a FastAPI backend to handle user inputs and model communication, offering a smooth user experience for real-time summarization tasks.
The application is structured into two main components, a Streamlit frontend and a FastAPI backend, built on the following stack:
Frontend: Streamlit (Python)
Backend: FastAPI
Model Execution: Ollama (for running LLaMA 2 locally)
Language Model: LLaMA 2
HTTP Communication: requests Python library
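The request flow above can be sketched end to end: the backend forwards the user's text to Ollama's local HTTP API and extracts the generated summary. This is a minimal illustration using only the standard library (the project itself uses the requests library); the function names here are illustrative, not the project's actual code, while the endpoint and payload fields follow Ollama's documented /api/generate API.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local API endpoint

def build_payload(text: str, model: str = "llama2") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": f"Summarize the following text:\n\n{text}",
        "stream": False,  # ask for one complete response instead of streamed chunks
    }

def extract_summary(response_body: dict) -> str:
    """Ollama returns the generated text under the 'response' key."""
    return response_body["response"].strip()

def summarize(text: str) -> str:
    """POST the prompt to the local Ollama server and return the summary."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_summary(json.load(resp))
```

Because the model runs locally, this call never leaves the machine, which is what makes the offline, privacy-preserving operation described above possible.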
Clone the repository and set up a Python virtual environment.
Install dependencies via pip or requirements.txt.
Install Ollama from https://ollama.com and pull the LLaMA 2 model locally.
Launch the FastAPI server and Streamlit UI separately.
python -m venv venv
# macOS/Linux:
source venv/bin/activate
# Windows (PowerShell):
.\venv\Scripts\Activate.ps1
# Windows (CMD):
.\venv\Scripts\activate.bat
pip install fastapi uvicorn streamlit requests python-multipart
ollama pull llama2
Start the FastAPI Backend
uvicorn backend.main:app --reload
Start the Streamlit Frontend (in a new terminal)
streamlit run frontend/app.py
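Once both processes are running, the Streamlit frontend talks to the backend over HTTP. A minimal sketch of that client call, assuming a /summarize route on uvicorn's default port 8000 and a {"text": ...} request / {"summary": ...} response schema — these names are assumptions for illustration; check backend/main.py and frontend/app.py for the actual route and field names:

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8000/summarize"  # assumed route; adjust to the real path

def build_request(text: str) -> dict:
    # Assumed request schema: the backend reads the input from a "text" field.
    return {"text": text}

def request_summary(text: str) -> str:
    """POST the user's text to the FastAPI backend and return the summary string."""
    req = urllib.request.Request(
        BACKEND_URL,
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["summary"]  # assumed response schema
```

In the actual frontend this call would be triggered from a Streamlit widget (e.g. a text area and a button) and the returned summary rendered on the page.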
The LLaMA Text Summarizer summarizes lengthy textual content in real time while running entirely offline. Key advantages include:
Privacy-preserving inference (no external API calls)
Low-latency summarization using local LLaMA 2 models
Cross-platform compatibility through Python-based setup
User-friendly experience with a clean UI/UX
This tool is suitable for individuals or organizations needing quick, offline summarization of documents, articles, and long-form content.