
This project presents an AI-powered text summarization tool, LLaMA Text Summarizer, which leverages the LLaMA 2 large language model via the Ollama framework. The system provides a simple and efficient way to summarize long-form text inputs entirely offline. It combines a responsive Streamlit-based frontend with a FastAPI backend to handle user inputs and model communication, offering a smooth user experience for real-time summarization tasks.
The application is structured into two main components, a Streamlit frontend and a FastAPI backend, built on the following stack:
Frontend: Streamlit (Python)
Backend: FastAPI
Model Execution: Ollama (for running LLaMA 2 locally)
Language Model: LLaMA 2
HTTP Communication: requests Python library
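The request flow above can be sketched end to end: the backend forwards the user's text to Ollama's local HTTP API and extracts the generated summary. This is a minimal illustration using only the standard library (the project itself uses the requests library); the function names here are illustrative, not the project's actual code, while the endpoint and payload fields follow Ollama's documented /api/generate API.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local API endpoint

def build_payload(text: str, model: str = "llama2") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": f"Summarize the following text:\n\n{text}",
        "stream": False,  # ask for one complete response instead of streamed chunks
    }

def extract_summary(response_body: dict) -> str:
    """Ollama returns the generated text under the 'response' key."""
    return response_body["response"].strip()

def summarize(text: str) -> str:
    """POST the prompt to the local Ollama server and return the summary."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_summary(json.load(resp))
```

Because the model runs locally, this call never leaves the machine, which is what makes the offline, privacy-preserving operation described above possible.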
Clone the repository and set up a Python virtual environment.
Install dependencies via pip or requirements.txt.
Install Ollama from https://ollama.com and pull the LLaMA 2 model locally.
Launch the FastAPI server and Streamlit UI separately.
python -m venv venv
# macOS/Linux:
source venv/bin/activate
# Windows (PowerShell):
.\venv\Scripts\Activate.ps1
# Windows (CMD):
.\venv\Scripts\activate.bat
pip install fastapi uvicorn streamlit requests python-multipart
ollama pull llama2
Start the FastAPI Backend
uvicorn backend.main:app --reload
Start the Streamlit Frontend (in a new terminal)
streamlit run frontend/app.py
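Once both processes are running, the Streamlit frontend talks to the backend over HTTP. A minimal sketch of that client call, assuming a /summarize route on uvicorn's default port 8000 and a {"text": ...} request / {"summary": ...} response schema — these names are assumptions for illustration; check backend/main.py and frontend/app.py for the actual route and field names:

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8000/summarize"  # assumed route; adjust to the real path

def build_request(text: str) -> dict:
    # Assumed request schema: the backend reads the input from a "text" field.
    return {"text": text}

def request_summary(text: str) -> str:
    """POST the user's text to the FastAPI backend and return the summary string."""
    req = urllib.request.Request(
        BACKEND_URL,
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["summary"]  # assumed response schema
```

In the actual frontend this call would be triggered from a Streamlit widget (e.g. a text area and a button) and the returned summary rendered on the page.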
The LLaMA Text Summarizer summarizes lengthy textual content in real time while running entirely offline. Key advantages include:
Privacy-preserving inference (no external API calls)
Low-latency summarization using local LLaMA 2 models
Cross-platform compatibility through Python-based setup
User-friendly experience with a clean UI/UX
This tool is suitable for individuals or organizations needing quick, offline summarization of documents, articles, and long-form content.