Automatically transcribe and summarize lecture recordings completely on-device using AI.
Install Ollama.
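On Linux, Ollama's documented install script is one option (macOS and Windows installers are available from ollama.com; verify the command against the official download page for your platform):

```sh
curl -fsSL https://ollama.com/install.sh | sh
```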
Create a Python virtual environment:
`python3 -m venv .venv`
Activate the virtual environment:
`source .venv/bin/activate`
Install dependencies:
`pip install -r requirements.txt`
Edit `lecsum.yaml`:
| Field | Default Value | Possible Values | Description |
|---|---|---|---|
| `whisper_model` | `"base.en"` | Whisper model name | Specifies which Whisper model to use for transcription |
| `ollama_model` | `"llama3.1"` | Ollama model name | Specifies which Ollama model to use for summarization |
| `prompt` | `"Summarize: "` | Any string | Instructs the large language model during the summarization step |
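For reference, a minimal `lecsum.yaml` built from the defaults above might look like this (a sketch assuming a flat YAML layout with exactly these field names):

```yaml
whisper_model: "base.en"   # Whisper model used for transcription
ollama_model: "llama3.1"   # Ollama model used for summarization
prompt: "Summarize: "      # Prompt given to the LLM during summarization
```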
Run the Ollama server:
`ollama serve`
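If the model named by `ollama_model` has not been downloaded yet, pull it first (shown here with the default from the table above):

```sh
ollama pull llama3.1
```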
In a new terminal, run:
`./lecsum.py -c [CONFIG_FILE] [AUDIO_FILE]`
Use any file format supported by Whisper (mp3, mp4, wav, webm, etc.).
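For example, with the config file above and an illustrative recording name:

```sh
./lecsum.py -c lecsum.yaml lecture01.mp3
```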
To start the lecsum server in a development environment, run:
`fastapi dev server.py`
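By default, `fastapi dev` serves the app at http://127.0.0.1:8000 with auto-reload enabled, so code changes are picked up automatically.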
Automated tests use the pytest framework. Run them with:
`pytest`
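As a sketch of the kind of test pytest discovers (the repository's real tests may differ; `PyYAML` and the flat config layout shown earlier are assumptions):

```python
# test_config.py -- hypothetical example; actual tests live in the repository.
import yaml  # PyYAML, assumed to be among the project's dependencies


def test_config_fields(tmp_path):
    # Write a throwaway config using the fields documented in the table above.
    cfg_file = tmp_path / "lecsum.yaml"
    cfg_file.write_text(
        'whisper_model: "base.en"\n'
        'ollama_model: "llama3.1"\n'
        'prompt: "Summarize: "\n'
    )
    # Parse it back and check the expected fields are present.
    cfg = yaml.safe_load(cfg_file.read_text())
    assert cfg["whisper_model"] == "base.en"
    assert cfg["ollama_model"] == "llama3.1"
    assert cfg["prompt"].startswith("Summarize")
```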