A tool for transcribing audio and video files in multiple formats. Automatically converts unsupported formats for compatibility with Whisper.
MultiFormatTranscriber is a program that simplifies transcribing audio and video files with Whisper, the speech recognition model developed by OpenAI.
Its added value is the workflow it creates: when you have several audio and video files to transcribe, the program starts each transcription automatically, one after the other, so that computing resources and time are used more efficiently. For files with extensions that Whisper does not support, the program attempts to convert them into a supported format before transcribing them.
The program was born from my need to transcribe long university lectures in a specific order to be able to prepare for an exam in time. It often happened that, after starting a transcription, it would end in the middle of the night, so the computer would remain switched on without performing any useful tasks until the next morning, when I would start a new transcription. To avoid this waste of electricity and inefficient use of computational resources, I wrote this little program that I decided to make public in case anyone else was in the same situation as me.
MultiFormatTranscriber is therefore particularly useful for anyone who has to transcribe a large number of files in different formats.
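The convert-then-transcribe workflow described above can be sketched roughly as follows. This is a minimal illustration and not the program's actual code: the names DIRECT_EXTENSIONS, needs_conversion, build_ffmpeg_command and plan_batch are hypothetical, the extension set is an assumed example, and the choice of .wav as the conversion target is also an assumption.

```python
from pathlib import Path

# Assumption: an illustrative set of extensions handed to Whisper unchanged.
# The set the program actually uses may differ.
DIRECT_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}


def needs_conversion(path, direct_extensions=DIRECT_EXTENSIONS):
    """Return True when the file extension is not in the direct set."""
    return Path(path).suffix.lower() not in direct_extensions


def build_ffmpeg_command(source, target):
    """Build (but do not run) an FFmpeg command converting source to target.

    -y overwrites an existing target and -i names the input file; FFmpeg
    infers the output format from the target's extension.
    """
    return ["ffmpeg", "-y", "-i", str(source), str(target)]


def plan_batch(paths, target_extension=".wav"):
    """Pair each input with the file to transcribe and the command to run first.

    The command is None when no conversion is needed.
    """
    plan = []
    for path in map(Path, paths):
        if needs_conversion(path):
            target = path.with_suffix(target_extension)
            plan.append((path, target, build_ffmpeg_command(path, target)))
        else:
            plan.append((path, path, None))
    return plan
```

Each planned command could then be executed with subprocess.run(command, check=True) before the resulting file is passed to Whisper, so that the whole batch runs unattended from the first file to the last.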
To run MultiFormatTranscriber, you need the following Python packages installed: Whisper (published on PyPI as openai-whisper) and PyTorch (torch).
You can install them using pip:
pip install openai-whisper torch
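If you want to confirm that the packages are visible to the Python interpreter you will run the program with, a small check along these lines may help. It is only a sketch: missing_packages is a hypothetical helper, and it only verifies that a module with the given name can be found, not that it is the right distribution or version.

```python
import importlib.util


def missing_packages(module_names=("whisper", "torch")):
    """Return the module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```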
Additionally, the program requires FFmpeg for media file conversion.
FFmpeg is a crucial component for handling various audio/video formats and converting them when necessary. You need to install it separately and ensure it's accessible from your system's command line (i.e., it's in your system's PATH).
Download FFmpeg
Go to the official FFmpeg download page: https://ffmpeg.org/download.html
Download the version appropriate for your operating system (Windows, macOS, Linux). For Windows, you'll typically download a .zip file. For macOS and Linux, you might prefer using a package manager.
Install FFmpeg and Add to PATH
Windows
Extract the downloaded .zip file to a folder (e.g., C:\FFmpeg).
Add the bin subfolder (e.g., C:\FFmpeg\bin) to your system's PATH environment variable:
Open the Environment Variables dialog, find the variable named Path and select it.
Edit it and add the path to the bin folder (e.g., C:\FFmpeg\bin).
macOS
With Homebrew:
brew install ffmpeg
Homebrew usually handles adding it to your PATH automatically.
If you installed FFmpeg manually, add its bin directory to your PATH by editing your shell's configuration file (e.g., ~/.zshrc or ~/.bash_profile):
export PATH="/path/to/your/ffmpeg/bin:$PATH"
Replace /path/to/your/ffmpeg/bin with the actual path. Then run source ~/.zshrc or open a new terminal window.
Linux
Debian/Ubuntu:
sudo apt update && sudo apt install ffmpeg
Fedora:
sudo dnf install ffmpeg
Arch Linux:
sudo pacman -S ffmpeg
If you installed FFmpeg manually, add its bin directory to your PATH by editing your shell's configuration file (e.g., ~/.bashrc, ~/.zshrc):
export PATH="/path/to/your/ffmpeg/bin:$PATH"
Replace /path/to/your/ffmpeg/bin with the actual path. Then run source ~/.bashrc or open a new terminal window.
To verify the installation, open a new terminal (or Command Prompt on Windows), type
ffmpeg -version
and press Enter. If FFmpeg is installed correctly and in your PATH, you should see version information; otherwise, recheck your PATH configuration.
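The same check can be made from Python, which is the environment the program itself runs in. The sketch below uses only the standard library; ffmpeg_version_line is a hypothetical helper name and is not part of MultiFormatTranscriber.

```python
import shutil
import subprocess


def ffmpeg_version_line(executable="ffmpeg"):
    """Return the first line printed by `<executable> -version`, or None.

    None means the executable was not found on PATH.
    """
    location = shutil.which(executable)
    if location is None:
        return None
    result = subprocess.run(
        [location, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    line = ffmpeg_version_line()
    if line is None:
        print("FFmpeg was not found on PATH.")
    else:
        print("Found:", line)
```

If this reports that FFmpeg was not found although the command works in another terminal, the Python process was probably started before PATH was updated; opening a new terminal usually resolves this.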
Run the program using Python:
python transcriber.py
The program then guides you through a series of prompts to configure the transcription.
To transcribe files in a custom order, rename them so that each file name contains its position as a number inside round brackets.
E.g., file_name(1).mp3, (1)file_name.mp3, etc.
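One way such bracketed numbers could be turned into a processing order is sketched below. This is an illustration, not the program's actual logic: ORDER_PATTERN, order_key and sort_for_transcription are hypothetical names, and placing unnumbered files last (sorted by name) is an assumption.

```python
import re

# Matches the first number written inside round brackets, e.g. "(12)".
ORDER_PATTERN = re.compile(r"\((\d+)\)")


def order_key(filename):
    """Sort key: numbered files first (by number), unnumbered files last."""
    match = ORDER_PATTERN.search(filename)
    if match:
        return (0, int(match.group(1)), filename)
    # Assumption: files without a bracketed number go last, ordered by name.
    return (1, 0, filename)


def sort_for_transcription(filenames):
    """Return the file names in the order they would be transcribed."""
    return sorted(filenames, key=order_key)
```

Comparing the numbers as integers rather than as text means that (10) sorts after (2), so there is no need to pad the numbers with leading zeros.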