This tool converts YouTube videos into text using Whisper for speech-to-text and extracts summaries with Llama, automatically saving the results to a Notion database.
/STTnotion
|-- config.py
|-- your_ideal_model.gguf
|-- downloaded_audio
| `-- yt_vid_you_download.wav
|-- main.py
|-- notion_handler.py
|-- requirements.txt
|-- whisper_handler.py
`-- youtube_handler.py
To run this tool, you need the following libraries: streamlit, youtube_dl, whisper, and notion-client.
Create a .env file to store your Notion API key and database ID:
NOTION_TOKEN=your_notion_api_key_here
NOTION_DATABASE_ID=your_notion_database_id_here
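As a rough illustration of how the two values above might be read at startup, here is a minimal sketch that parses KEY=VALUE text with the standard library only. The function names parse_env_text and load_notion_settings are hypothetical and are not taken from this project's config.py; a real project may prefer the load_dotenv function from the python-dotenv package instead of this hand-written parser.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_notion_settings(env_text):
    """Return (token, database_id), failing early if either one is missing."""
    values = parse_env_text(env_text)
    required = ("NOTION_TOKEN", "NOTION_DATABASE_ID")
    missing = [name for name in required if not values.get(name)]
    if missing:
        raise KeyError("Missing in .env: " + ", ".join(missing))
    return values["NOTION_TOKEN"], values["NOTION_DATABASE_ID"]


if __name__ == "__main__":
    # Assumes a .env file exists in the current working directory.
    with open(".env", encoding="utf-8") as handle:
        token, database_id = load_notion_settings(handle.read())
    print("Loaded settings for database", database_id)
```

Failing early on a missing key tends to give a clearer message than a later authentication error from the Notion API.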
Install the required libraries:
pip install -r requirements.txt
Ensure you have FFmpeg installed.
Set up your .env file with the Notion API key and database ID.
Download the language model you want to use as a .gguf file (for example, a Llama model) and put it in the STTnotion folder.
Edit MODEL_PATH in llama_handler.py.
Run the program:
streamlit run main.py
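The model-file step is easy to get wrong, so a small check before loading the model can help. The sketch below is an assumption about how MODEL_PATH could be validated; the helper resolve_model_path is hypothetical and does not come from llama_handler.py.

```python
from pathlib import Path


def resolve_model_path(project_dir, model_filename):
    """Build the model path inside the project folder and validate it."""
    model_path = Path(project_dir) / model_filename
    if model_path.suffix != ".gguf":
        raise ValueError(f"Expected a .gguf file, got: {model_path.name}")
    if not model_path.is_file():
        raise FileNotFoundError(f"Model file not found: {model_path}")
    return model_path


if __name__ == "__main__":
    # "your_ideal_model.gguf" is the placeholder name from the folder layout.
    try:
        MODEL_PATH = resolve_model_path(".", "your_ideal_model.gguf")
        print("Using model at", MODEL_PATH)
    except (ValueError, FileNotFoundError) as error:
        print("Model check failed:", error)
```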
To check if FFmpeg is installed, open your command line (terminal) and type:
ffmpeg -version
If FFmpeg is installed, you will see version information. If not, you’ll receive an error message.
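The same check can be done from Python before the app starts. This is a minimal sketch using only the standard library; the function name ffmpeg_version is hypothetical and not part of this project.

```python
import shutil
import subprocess


def ffmpeg_version():
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is unavailable."""
    executable = shutil.which("ffmpeg")  # looks up ffmpeg on PATH
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version()
    if version is None:
        print("FFmpeg was not found on PATH; install it before running the tool.")
    else:
        print(version)
```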
Windows:
Use Chocolatey (package manager):
choco install ffmpeg
Or download from the official website.
macOS:
Use Homebrew:
brew install ffmpeg
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install ffmpeg
After installation, run ffmpeg -version again to confirm success.
If you encounter an error such as "Unable to extract uploader id", you may need to update yt-dlp:
python3 -m pip install --force-reinstall https://github.com/yt-dlp/yt-dlp/archive/master.tar.gz
Then, run:
yt-dlp URL
Ensure you import it correctly in your project:
import yt_dlp as youtube_dl
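With that import in place, an audio download could look roughly like the sketch below. The function names build_audio_options and download_audio are hypothetical, and the output folder name is taken from the folder layout above; the actual youtube_handler.py may be organized differently. The WAV conversion relies on yt-dlp's FFmpegExtractAudio postprocessor, which needs FFmpeg to be installed.

```python
from pathlib import Path


def build_audio_options(output_dir="downloaded_audio"):
    """Build yt-dlp options that keep only the audio track and convert it to WAV."""
    return {
        "format": "bestaudio/best",
        # File name template: video title plus the final extension.
        "outtmpl": str(Path(output_dir) / "%(title)s.%(ext)s"),
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "wav"},
        ],
    }


def download_audio(url, output_dir="downloaded_audio"):
    """Download the audio of one video into output_dir."""
    # Imported here so the options helper can be used without yt-dlp installed.
    import yt_dlp as youtube_dl

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    with youtube_dl.YoutubeDL(build_audio_options(output_dir)) as ydl:
        ydl.download([url])
```

Calling download_audio with a video URL should leave a .wav file in downloaded_audio, which can then be passed to Whisper.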
Create a Notion integration to obtain the API key.
Add the integration to your Notion database to retrieve the database ID.
To find your Notion database ID, locate the URL of your database:
https://www.notion.so/<long_hash_1>?v=<long_hash_2>
<long_hash_1> is your database ID.
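Copying the ID by hand is error-prone, so it can also be pulled out of the URL in code. The sketch below assumes the ID is the run of 32 hexadecimal characters at the end of the last path segment, which matches the URL pattern shown above; the function name extract_database_id is hypothetical.

```python
import re
from urllib.parse import urlparse


def extract_database_id(url):
    """Return the 32-character hex ID from a Notion database URL."""
    path = urlparse(url).path  # the ?v=... query string is excluded
    last_segment = path.rstrip("/").split("/")[-1]
    match = re.search(r"[0-9a-fA-F]{32}$", last_segment)
    if match is None:
        raise ValueError(f"No database ID found in URL: {url}")
    return match.group(0)


if __name__ == "__main__":
    # Made-up example URL with placeholder hashes.
    example = (
        "https://www.notion.so/0123456789abcdef0123456789abcdef"
        "?v=fedcba9876543210fedcba9876543210"
    )
    print(extract_database_id(example))
```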
If you encounter errors such as "Title is not a property that exists" or "Content is not a property that exists", make sure the property names used in the code match the properties defined in your Notion database. Update notion_handler.py with the correct property names:
Original Code:
def create_notion_page(text):
    new_page = notion.pages.create(
        parent={"database_id": NOTION_DATABASE_ID},
        properties={
            "Title": {"title": [{"text": {"content": "YouTube Transcription"}}]},
            "Content": {"rich_text": [{"text": {"content": text[:2000]}}]},  # Notion API limit
        },
    )
Updated Code:
page_properties = {
    "Name": {
        "title": [
            {"text": {"content": file_name}}
        ]
    },
    "Tags": {
        "multi_select": [
            {"name": "content"}
        ]
    },
}
new_page = notion.pages.create(
    parent={"database_id": NOTION_DATABASE_ID},
    properties=page_properties,
)
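Rather than hard-coding "Title" or "Name", the title property can be looked up from the database schema. The sketch below assumes you already have the properties mapping of the database (for example, from the response of notion.databases.retrieve in notion-client) and that each entry carries a "type" field, as the Notion API has documented for database objects. The helper names find_title_property and build_page_properties are hypothetical.

```python
def find_title_property(properties):
    """Return the name of the property whose type is "title"."""
    for name, schema in properties.items():
        if schema.get("type") == "title":
            return name
    raise ValueError("The database schema has no title property.")


def build_page_properties(properties, title_text):
    """Build a page payload keyed by whatever the title property is called."""
    title_name = find_title_property(properties)
    return {title_name: {"title": [{"text": {"content": title_text}}]}}


if __name__ == "__main__":
    # Hand-written sample schema, not real API output.
    sample_schema = {
        "Tags": {"id": "abc", "type": "multi_select", "multi_select": {"options": []}},
        "Name": {"id": "title", "type": "title", "title": {}},
    }
    print(build_page_properties(sample_schema, "YouTube Transcription"))
```

The resulting dictionary can be passed as the properties argument of notion.pages.create, so the code keeps working if the title column is renamed.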