This project uses the YOLOv8 model for real-time object detection and integrates it with Microsoft Azure services. The system also provides descriptive analysis of detected objects using OpenAI GPT and generates audio output via Azure Text-to-Speech (TTS).
git clone https://github.com/jihyeon26/Vision_detect_project.git cd Vision_detect_project
pip install -r requirements.txt
OPENAI_ENDPOINT=https://your-openai-endpoint OPENAI_API_KEY=your_openai_api_key DEPLOYMENT_NAME=your-deployment-name SPEECH_ENDPOINT=https://your-speech-endpoint SPEECH_API_KEY=your_speech_api_key
python main.py
vision-detection-project/
ā
āāā main.py # Main entry point for the application
āāā yolo_model.py # YOLO model loading and object detection logic
āāā openai_service.py # Communication with OpenAI's GPT for analysis
āāā speech_service.py # Communication with Azure TTS for audio generation
āāā gradio_ui.py # Gradio interface and event listeners
āāā config.py # Environment variable loader
āāā requirements.txt # Python dependencies
āāā README.md # Project documentation
āāā .env # Environment variables (not included in the repo)
There are no datasets linked
There are no datasets linked