An image captioning application that uses Salesforce's BLIP (Bootstrapping Language-Image Pre-training) model to generate natural-language descriptions of uploaded images. It demonstrates how a pretrained vision-language model can power an accessible, intuitive image understanding tool.
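To make the captioning step concrete, here is a minimal sketch of how a BLIP caption could be generated with the Hugging Face `transformers` library. It is an illustration under assumptions, not the project's actual code: the checkpoint name `Salesforce/blip-image-captioning-base`, the helper names `load_blip` and `caption_image`, and the `max_new_tokens` default are all choices made for this example.

```python
def load_blip(checkpoint: str = "Salesforce/blip-image-captioning-base"):
    """Load a BLIP processor and model (checkpoint name is an assumption)."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForConditionalGeneration.from_pretrained(checkpoint)
    return processor, model


def caption_image(image, processor, model, max_new_tokens: int = 30) -> str:
    """Generate a caption for an already-opened RGB image (e.g. a PIL image)."""
    # The processor resizes and normalizes the image into model-ready tensors.
    inputs = processor(images=image, return_tensors="pt")
    # generate() produces caption token IDs, one sequence per input image.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (only) sequence back into text.
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()
```

A typical call would open the upload with Pillow, for example `caption_image(Image.open(path).convert("RGB"), *load_blip())`. Loading the model once at startup and reusing it across requests avoids paying the load cost on every upload.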
cd backend
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows
pip install -r requirements.txt
python app.py
cd frontend
npm install
npm run dev
graph LR
    A[User Upload] --> B[React Frontend]
    B --> C[Flask Backend]
    C --> D[BLIP Model]
    D --> E[Caption Generation]
    E --> B
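The backend leg of this flow (upload in, caption out) can be sketched as a small framework-agnostic function. This is a hypothetical illustration: the `handle_upload` name, the extension allow-list, and the response shape are assumptions made for the example, and the real Flask route in `app.py` may differ. The `captioner` argument stands in for the BLIP model call.

```python
import os
from typing import Callable, Dict, Tuple

# Hypothetical allow-list; the project's real validation rules are not shown.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def handle_upload(
    filename: str,
    data: bytes,
    captioner: Callable[[bytes], str],
) -> Tuple[Dict[str, str], int]:
    """Return a (JSON-like body, HTTP status code) pair for one uploaded image."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return {"error": "unsupported file type"}, 400
    if not data:
        return {"error": "empty upload"}, 400
    # Steps C -> D -> E in the diagram: pass the image to the model wrapper.
    caption = captioner(data)
    # Step E -> B: the frontend receives the caption as JSON.
    return {"caption": caption}, 200
```

Inside a Flask route, a function like this would be called with the uploaded file's name and bytes, and its result passed to `jsonify`. Keeping the captioner injectable lets the request handling be tested with a stub instead of the full model.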