SubtitleTranslator is a Python-based tool that recognizes and translates game subtitles in real time. It captures the subtitle region of a game's screen, applies optical character recognition (OCR) to extract the text, translates that text into a user-specified language with a pre-trained translation model, and displays the translated subtitles as the game runs. The project combines techniques from computer vision, natural language processing, and GUI development to present translations alongside gameplay.
Area Selection
AreaSelectScreen.py provides a Tkinter-based interface that lets users visually select the screen region where subtitles appear. The user draws a rectangle around that region and then confirms or cancels the selection.
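As a rough illustration of the selection step, the sketch below shows one way a drag rectangle could be drawn on a Tkinter canvas and normalized into a region dictionary. The function names (normalize_region, select_area), the Enter/Escape key bindings, and the single-monitor assumption are made up for this example; they are not taken from AreaSelectScreen.py.

```python
def normalize_region(x0, y0, x1, y1):
    """Turn two drag corners (in any order) into an MSS-style region dict."""
    left, top = min(x0, x1), min(y0, y1)
    width, height = abs(x1 - x0), abs(y1 - y0)
    if width == 0 or height == 0:
        return None  # a click without a drag selects nothing
    return {"left": left, "top": top, "width": width, "height": height}


def select_area():
    """Show a fullscreen window, let the user drag a rectangle, return it.

    Returns None if the user cancels with Escape (assumed key binding).
    Assumes a single monitor, so canvas coordinates equal screen coordinates.
    """
    import tkinter as tk  # imported lazily so normalize_region works headless

    root = tk.Tk()
    root.attributes("-fullscreen", True)
    root.attributes("-alpha", 0.3)  # transparency support varies by platform
    canvas = tk.Canvas(root, cursor="cross")
    canvas.pack(fill="both", expand=True)
    state = {"start": None, "rect": None, "region": None}

    def on_press(event):
        state["start"] = (event.x, event.y)
        if state["rect"] is not None:
            canvas.delete(state["rect"])  # discard a previous attempt
        state["rect"] = canvas.create_rectangle(
            event.x, event.y, event.x, event.y, outline="red", width=2
        )

    def on_drag(event):
        x0, y0 = state["start"]
        canvas.coords(state["rect"], x0, y0, event.x, event.y)

    def on_release(event):
        x0, y0 = state["start"]
        state["region"] = normalize_region(x0, y0, event.x, event.y)

    def on_cancel(_event):
        state["region"] = None
        root.destroy()

    canvas.bind("<ButtonPress-1>", on_press)
    canvas.bind("<B1-Motion>", on_drag)
    canvas.bind("<ButtonRelease-1>", on_release)
    root.bind("<Return>", lambda _e: root.destroy())  # confirm
    root.bind("<Escape>", on_cancel)                  # cancel
    root.mainloop()
    return state["region"]
```

Returning a dictionary with left, top, width, and height keys is a convenience: MSS accepts that shape directly when grabbing a region.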
Screen Capture & OCR
The script continuously captures the defined region from the game screen using MSS and NumPy. PyTesseract is used to perform OCR on the captured region to extract subtitle text.
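A single capture-and-recognize step might look like the sketch below. It is an assumption-laden example, not the project's code: the function names, the grayscale conversion, and the whitespace cleanup are illustrative choices, and it expects the Tesseract binary to be installed for PyTesseract to call.

```python
def clean_ocr_text(raw):
    """Collapse the line breaks and extra whitespace in OCR output."""
    return " ".join(raw.split())


def capture_subtitle_text(region, lang="eng"):
    """Grab `region` from the screen once and return the recognized text.

    `region` is a dict with "left", "top", "width" and "height" keys.
    """
    # Third-party imports are local so clean_ocr_text has no dependencies.
    import mss
    import numpy as np
    import pytesseract

    with mss.mss() as sct:
        shot = sct.grab(region)   # raw BGRA pixels
    frame = np.array(shot)        # shape: (height, width, 4)
    # Weighted sum of the B, G, R channels gives a grayscale image for OCR.
    weights = np.array([0.114, 0.587, 0.299])
    gray = (frame[..., :3] @ weights).astype(np.uint8)
    raw = pytesseract.image_to_string(gray, lang=lang)
    return clean_ocr_text(raw)
```

For a continuous loop it would likely be cheaper to create the mss.mss() instance once and reuse it across frames; it is opened per call here only to keep the example short.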
Translation
A pre-trained translation model from the Hugging Face Hub (such as one of Helsinki-NLP's MarianMT models) is loaded with the Transformers classes MarianMTModel and MarianTokenizer. The extracted subtitle text is tokenized, translated, and decoded into the target language.
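The tokenize, translate, decode sequence can be sketched as follows. The helper names load_translator and translate are hypothetical, and skipping blank strings is an added assumption; the Transformers calls themselves (from_pretrained, generate, batch_decode) follow the usual MarianMT usage pattern.

```python
def load_translator(model_name):
    """Download (or load from the local cache) a MarianMT model and tokenizer."""
    from transformers import MarianMTModel, MarianTokenizer  # heavy import

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    return model, tokenizer


def translate(texts, model, tokenizer):
    """Translate a list of strings; blank entries are returned unchanged."""
    results = list(texts)
    todo = [i for i, text in enumerate(texts) if text.strip()]
    if not todo:
        return results  # nothing worth sending to the model
    batch = tokenizer(
        [texts[i] for i in todo],
        return_tensors="pt",
        padding=True,
        truncation=True,
    )
    generated = model.generate(**batch)
    decoded = tokenizer.batch_decode(generated, skip_special_tokens=True)
    for i, translated in zip(todo, decoded):
        results[i] = translated
    return results
```

A call would then look like model, tokenizer = load_translator("Helsinki-NLP/opus-mt-tc-big-en-tr") followed by translate(["Where are you going?"], model, tokenizer). The first call downloads the model weights, so loading once at startup rather than per subtitle is the sensible choice.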
Display / Output
Translated text is printed to the console using a processing function (e.g., process_subtitles()). The translation loop continues until the user presses 'q' to quit.
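The loop described above could be structured roughly like this. The sketch is not the project's process_subtitles(): the name run_loop, the polling interval, and the rule of skipping unchanged text are assumptions, and because the way the 'q' key press is detected is not described here, the stop condition is passed in as a callable.

```python
import time


def run_loop(read_text, translate, should_stop, emit=print, interval=0.2):
    """Poll for subtitle text and emit a translation whenever it changes."""
    last_seen = None
    while not should_stop():
        text = read_text()
        # Skip empty frames and lines that were already translated.
        if text and text != last_seen:
            emit(translate(text))
            last_seen = text
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-ins for the real capture and translation steps.
    frames = iter(["Hello", "Hello", "", "Goodbye"])
    shown = []
    run_loop(
        read_text=lambda: next(frames, None),
        translate=str.upper,
        should_stop=lambda: len(shown) >= 2,
        emit=shown.append,
        interval=0,
    )
    print(shown)  # ['HELLO', 'GOODBYE']
```

Remembering the last recognized line avoids re-translating a subtitle that stays on screen for many frames, which matters because translation is much slower than capture.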
Main Script Orchestration
SubtitleTranslator.py orchestrates the flow: area selection, translator initialization (setting the target language and the translation model), and then launching the translation loop via translator.run_translation().
The tool enables real-time recognition, translation, and display of game subtitles, making it easier for users to understand foreign-language game dialogue as they play.
From the main script, an example target language ('tr' for Turkish) and translation model ('Helsinki-NLP/opus-mt-tc-big-en-tr') are configured, illustrating a practical use case.
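To make that configuration and the orchestration order concrete, here is a minimal sketch. Only the 'tr' language code, the model name, and the run_translation() method name come from the description above; TranslatorConfig, main, and the injected select_area and make_translator callables are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslatorConfig:
    target_language: str = "tr"                            # Turkish
    model_name: str = "Helsinki-NLP/opus-mt-tc-big-en-tr"  # English to Turkish


def main(select_area, make_translator, config=TranslatorConfig()):
    """Wire the steps together: select a region, build a translator, run it.

    Returns False if the user cancelled the area selection, True otherwise.
    """
    region = select_area()
    if region is None:
        return False  # selection cancelled, nothing to translate
    translator = make_translator(region, config)
    translator.run_translation()  # blocks until the user quits
    return True
```

Passing the steps in as callables keeps the wiring separate from the GUI, OCR, and model code, so each part can be swapped or tested on its own.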
The project builds on several libraries and frameworks, including OpenCV, MSS, PyTesseract, Tkinter for the GUI, and Hugging Face Transformers for translation, and is released under the MIT license.