Introduction:
The objective of this project was to develop a web-based application that translates Tamil text into English, generates an image from the translated text, and produces a piece of creative text based on the same translation.
Phase 1:
Tamil to English Translation
I started with the Helsinki-NLP/opus-mt-ta-en model for Tamil to English translation. It initially returned errors, and I discovered that the model had been removed and was no longer available on Hugging Face.
Exploring Alternatives:
I looked for other Tamil to English translation models on Hugging Face but could not find a reliable one.
I also attempted to use the Google Translate API, but it required payment and was not available for free use in the Google Cloud console.
Further steps in finding an alternative:
I then found and tried Facebook's MBART translation model, but its translations were not as accurate as expected.
Inaccurate translation:
I also looked into the Bharath translation model but faced similar challenges in finding a working Tamil to English (ta-en) translation model.
Final Solution:
Eventually I discovered the Helsinki multilingual translation model (Helsinki-NLP/opus-mt-mul-en), which gave almost accurate translations, and I successfully implemented it for Tamil to English translation.
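The translation step can be sketched as below. This is only a minimal illustration, assuming the model is loaded through the transformers translation pipeline; the report does not say how the model was actually called, and the function names load_translator and translate_tamil are hypothetical.

```python
from typing import Callable

# Model named in this report; everything else here is an illustrative assumption.
MODEL_NAME = "Helsinki-NLP/opus-mt-mul-en"


def load_translator(model_name: str = MODEL_NAME) -> Callable[[str], str]:
    """Return a function that translates text to English with the given model."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    pipe = pipeline("translation", model=model_name)

    def translate(text: str) -> str:
        # The translation pipeline returns a list of dicts keyed by "translation_text".
        return pipe(text)[0]["translation_text"]

    return translate


def translate_tamil(text: str, translator: Callable[[str], str]) -> str:
    """Reject empty input, then delegate to the supplied translator."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Tamil input text must not be empty")
    return translator(cleaned).strip()
```

Passing the translator in as an argument keeps the model download out of the validation logic, so a different model can be swapped in without touching the rest of the code. A call would look like translate_tamil("வணக்கம்", load_translator()).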
Phase 2:
Image Generation
Initial Model Choice:
I first attempted to use the CompVis/stable-diffusion-v1-4 model for image generation, but it produced low-quality and inaccurate images, so I had to move on to another model.
Alternate text to image model:
I shifted to black-forest-labs/FLUX.1-dev, which had a higher number of downloads (1.2 million) than Shakker-Labs/FLUX.1-dev-LoRA-AntiBlur (800k downloads).
The FLUX model produced more realistic and convincing images, whereas the LoRA AntiBlur model generated slightly cartoonish output.
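A request to the FLUX model might look like the sketch below. It assumes the model is reached through the hosted Hugging Face inference service using the huggingface_hub InferenceClient, which fits the Hugging Face API key mentioned later in this report but is not confirmed by it. The names make_image_client and generate_image are hypothetical.

```python
from typing import Any

# Model named in this report.
IMAGE_MODEL = "black-forest-labs/FLUX.1-dev"


def make_image_client(hf_token: str) -> Any:
    """Create an inference client authenticated with a Hugging Face token."""
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import InferenceClient

    return InferenceClient(token=hf_token)


def generate_image(prompt: str, client: Any, model: str = IMAGE_MODEL) -> Any:
    """Send a cleaned prompt to the text-to-image model and return the image."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("Image prompt must not be empty")
    # With a real InferenceClient, text_to_image returns a PIL image.
    return client.text_to_image(cleaned, model=model)
```

Because the model ID is a parameter, comparing FLUX.1-dev against another checkpoint only requires changing the model argument.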
Phase 3:
Integrating text translation and image generation
Integration:
I had to take the translated English text and feed it as the prompt for image generation. I faced multiple errors during the integration process, which required repeated troubleshooting and learning.
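The hand-off between the two stages can be illustrated as follows. This is a sketch under the assumption that each stage is wrapped in a plain function; run_pipeline and PipelineResult are hypothetical names, not taken from the project code.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PipelineResult:
    translated_text: str
    image: Any


def run_pipeline(
    tamil_text: str,
    translate: Callable[[str], str],
    generate_image: Callable[[str], Any],
) -> PipelineResult:
    """Translate the input, then use the translation as the image prompt."""
    translated = translate(tamil_text)
    # Stop early with a clear message instead of sending an empty prompt onward.
    if not translated or not translated.strip():
        raise RuntimeError("Translation returned no text, so there is no image prompt")
    prompt = translated.strip()
    image = generate_image(prompt)
    return PipelineResult(translated_text=prompt, image=image)
```

Checking the translation before calling the image model makes it easier to see which stage failed, which is the kind of troubleshooting described above.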
Phase 4:
Creative text generation
Initially, I tried generating creative text from the translated text with the gpt-neo model, but it did not produce text that followed the translation. It only generated loosely related, random information.
I then implemented the Google Gemini model through the Gemini API to generate creative text based on the translated text.
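A call to Gemini could look like the sketch below. It assumes the google-generativeai Python package; the prompt wording and the default model name "gemini-1.5-flash" are assumptions for illustration, since the report does not state which Gemini model or prompt was used. The names build_creative_prompt and generate_creative_text are hypothetical.

```python
# The prompt wording is an assumption; the report does not give the actual prompt.
PROMPT_TEMPLATE = "Write a short creative passage inspired by this sentence: {text}"


def build_creative_prompt(translated_text: str) -> str:
    """Wrap the translated English text in an instruction for the model."""
    cleaned = translated_text.strip()
    if not cleaned:
        raise ValueError("Translated text must not be empty")
    return PROMPT_TEMPLATE.format(text=cleaned)


def generate_creative_text(
    translated_text: str,
    api_key: str,
    model_name: str = "gemini-1.5-flash",  # assumed model name, not from the report
) -> str:
    """Ask Gemini for creative text based on the translation."""
    # Imported lazily so this module can be imported without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_creative_prompt(translated_text))
    return response.text
```

Putting the translated sentence directly into the instruction ties the output to the input, which is the behaviour the gpt-neo attempt lacked.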
Phase 5: Development and Interface
Gradio Interface:
I developed a user-friendly interface using Gradio so that users can access the application easily. Gradio itself was easy to learn and required little code.
I encountered issues where outputs did not display correctly, which required step-by-step debugging.
Finally, I completed the interface.
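An interface of this kind might be wired up as sketched below. It assumes a single processing function that returns the translated text, the creative text, and the image; make_handler and build_interface are hypothetical names, and the component labels are illustrative.

```python
from typing import Any, Callable, Tuple


def make_handler(process: Callable[[str], Tuple[str, str, Any]]) -> Callable[[str], Tuple[str, str, Any]]:
    """Wrap the processing function so errors are shown instead of blank outputs."""

    def handler(tamil_text: str) -> Tuple[str, str, Any]:
        try:
            translated, creative, image = process(tamil_text)
            return translated, creative, image
        except Exception as exc:
            # Show the failure in the first text box and leave the other outputs empty.
            return f"Error: {exc}", "", None

    return handler


def build_interface(process: Callable[[str], Tuple[str, str, Any]]) -> Any:
    """Build a Gradio interface with one text input and three outputs."""
    # Imported lazily so this module can be imported without gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_handler(process),
        inputs=gr.Textbox(label="Tamil text"),
        outputs=[
            gr.Textbox(label="Translated text"),
            gr.Textbox(label="Creative text"),
            gr.Image(label="Generated image"),
        ],
    )
```

Calling build_interface(process).launch() would start the web interface. Returning the error message as an output value is one way to find out which step failed when nothing displays.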
Phase 6:
Deploying the application on Hugging Face Spaces:
The application was deployed on Hugging Face Spaces, allowing users to interact with it directly through a web interface.
The API keys were handled securely: they are taken as input from the user and are not hardcoded into the source code.
Securing the API keys:
I have two versions of this project. One version asks the user for the Gemini and Hugging Face API keys. The other needs only the Tamil input text from the user, because the API keys are stored securely within the deployment.
I stored the API keys in Hugging Face Secrets so that they are secure and cannot be misused.
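Secrets configured for a Space are exposed to the application as environment variables, so both versions can share one lookup function. The sketch below is illustrative only: get_api_key is a hypothetical helper, and the secret names in the usage note are assumptions, not the names used in the project.

```python
import os
from typing import Optional


def get_api_key(name: str, user_value: Optional[str] = None) -> str:
    """Prefer a key typed by the user, otherwise fall back to the stored secret."""
    if user_value and user_value.strip():
        return user_value.strip()
    # Secrets are read from the environment, so no key appears in the source code.
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Missing API key: set the {name} secret or enter the key in the interface"
        )
    return value
```

For example, get_api_key("GEMINI_API_KEY") would read a stored secret of that (assumed) name, while get_api_key("GEMINI_API_KEY", typed_key) would use the value entered in the interface.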
This is the Hugging Face Space:
https://huggingface.co/spaces/Raveheart1/Gradio-Transart
USER MANUAL FOR EXECUTION:
Open and run the transart_project_gradio.py file.
Enter the Tamil text in the input box.
Enter your API credentials (the Gemini API key and the Hugging Face API key), then click Submit or press the Enter key.
You will get the translated text, the creative text, and the generated image.
Another version of the project, which requires only the Tamil text input, is available on Hugging Face Spaces.
Open my Hugging Face Space:
https://huggingface.co/spaces/Raveheart1/Gradio-Transart
Enter the Tamil text and press the Enter key or click Submit.
You will get the generated image, the translated text, and the creative text.
CONCLUSION:
This project required a lot of learning and effort to build, and that effort and time did not go in vain. Throughout the development process, I encountered and resolved various challenges, particularly in making API requests to the Gemini model and in exception handling.
I learnt exception handling while working with Gradio, where I had to find out which part of the code was generating errors. This project helped me gain knowledge about using Hugging Face and Generative AI.
I would like to thank the mentors of this internship who guided me through this project.
In the future, the project could be extended with new features by using more tightly integrated models.