Current end-to-end video generation systems are inflexible and have limited use in real-world applications. We developed this project to enable fine-grained control over multimedia content generation through text, images, and audio. The project addresses the following key challenges:
This project is submitted as part of Ready Tensor's Computer Vision Projects Expo 2024. It demonstrates an innovative approach to automated video content creation by combining computer vision, text processing, and audio synthesis.
This Python-based video creation tool leverages computer vision and AI to automatically generate engaging video content by intelligently combining images, text, and audio narration.
Examples: input_video and output_video.
The project uses HuggingFace's OpenMusic for AI-powered background music generation. This feature allows:
To use the OpenMusic BGM generation:
The project uses Vchitect-2.0, a state-of-the-art video diffusion model, for generating the initial video content. Features include:
conda create -n VchitectXL -y
conda activate VchitectXL
conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
pip install -r requirements.txt
python inference.py --test_file assets/test.txt --save_dir "${save_dir}" --ckpt_path "${ckpt_path}"
The generated raw videos can then be enhanced with:
Clone this repository:
git clone https://github.com/yourusername/video-creator.git
cd video-creator
Install the required packages:
pip install moviepy pillow pyttsx3 opencv-python
The script can be run from the command line with the following arguments:
python video_creator.py <video_path> <audio_script_path> <bg_music_path> <subtitles_path> <output_path>
Arguments:
video_path: Path to input video file
audio_script_path: Path to audio script text file
bg_music_path: Path to background music file
subtitles_path: Path to subtitles file
output_path: Path for output video file
Example:
python video_creator.py input_video.mp4 audio_script.txt background_music.wav audio_script.txt output_video.mp4
For help on usage:
python video_creator.py --help
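The five positional arguments and the --help flag described above map naturally onto Python's standard argparse module. The sketch below is one way such a command line could be declared. It is an assumption, not the actual contents of video_creator.py, and the function name build_parser is hypothetical. argparse adds --help automatically.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the five positional arguments listed in the usage section.

    build_parser is a hypothetical helper name; the real script may differ.
    """
    parser = argparse.ArgumentParser(
        prog="video_creator.py",
        description="Combine a video with narration, background music, and subtitles.",
    )
    parser.add_argument("video_path", help="Path to input video file")
    parser.add_argument("audio_script_path", help="Path to audio script text file")
    parser.add_argument("bg_music_path", help="Path to background music file")
    parser.add_argument("subtitles_path", help="Path to subtitles file")
    parser.add_argument("output_path", help="Path for output video file")
    return parser


# Parse the same values used in the example command above.
args = build_parser().parse_args([
    "input_video.mp4",
    "audio_script.txt",
    "background_music.wav",
    "audio_script.txt",
    "output_video.mp4",
])
print(args.video_path, args.output_path)  # input_video.mp4 output_video.mp4
```

Passing an explicit list to parse_args keeps the sketch runnable on its own. A real entry point would call parse_args() with no arguments so that it reads sys.argv.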
Before running the script, ensure you have:
The script will validate all input files and create the output directory if it doesn't exist.
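The validation behavior described above could look roughly like the following. This is a minimal sketch using only the standard library. The function name validate_paths and the choice of FileNotFoundError are assumptions for illustration, not the project's actual code.

```python
import tempfile
from pathlib import Path


def validate_paths(input_paths, output_path):
    """Check that every input file exists, then ensure the output directory exists.

    validate_paths is a hypothetical helper; the real script may report
    errors differently.
    """
    missing = [str(p) for p in input_paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError("Missing input file(s): " + ", ".join(missing))

    out = Path(output_path)
    # Create the parent directory (and any ancestors) if it is not there yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    return out


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "audio_script.txt"
    script.write_text("Hello")
    out = validate_paths([script], Path(tmp) / "renders" / "output_video.mp4")
    print(out.parent.is_dir())  # True
```

Collecting all missing files before raising lets the user fix every bad path in one pass instead of rerunning the script once per error.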
video_creator.py: Main script for video generation
video_utils.py: Utility functions and CV algorithms
audio_script.txt: Narration script template
We welcome contributions! Please feel free to submit Pull Requests or open issues for discussion.
This project is part of Ready Tensor's Computer Vision Projects Expo 2024, showcasing innovative applications of computer vision and multi-modal AI technologies.
This project is open source and available under the MIT License.
#ComputerVision
#AI
#VideoGeneration
#TextToVideo
#TextToSpeech
#DeepLearning
#VideoProcessing
#AudioSynthesis
#SubtitleGeneration
#BackgroundMusicGeneration
#MultiModalAI
#Vchitect
#OpenMusic
#EdgeTTS
#MoviePy
#PyTorch
#ContentCreation
#MediaProduction
#AutomatedVideoEditing
#AIAssistant
#CreativeAI