Sanjeevan is a video calling application designed to bridge the communication gap for people who cannot speak. Our mission is to empower individuals with diverse abilities to express themselves freely and connect with others effortlessly. Through innovative technology and machine learning, we aim to create an inclusive world where everyone's voice is heard, regardless of linguistic or physical barriers.
Though we built the project for a hackathon (which we won, by the way), the purpose was clear: to bridge the gap between a person who can't speak and a person who can't understand sign language.
Short working video: YouTube
The application leverages MediaPipe to capture hand landmarks and process them using a trained RandomForestClassifier to identify gestures. The hand gestures are mapped to alphabets and common phrases using predefined labels, allowing users to communicate through sign language.
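As an illustration, here is a minimal sketch of how such a classifier could be trained on flattened landmark coordinates. The dataset file name (`data.pickle`) and its layout are assumptions for illustration, not taken from the repo.

```python
# Sketch only: training a RandomForestClassifier on hand-landmark features.
# The dataset file name and dict layout are assumptions.
import pickle
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data_dict = pickle.load(open('data.pickle', 'rb'))  # assumed dataset file
X = np.asarray(data_dict['data'])    # each row: 42 values, (x, y) for 21 landmarks
y = np.asarray(data_dict['labels'])  # integer class per gesture

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
model = RandomForestClassifier()
model.fit(X_train, y_train)
print(f"Test accuracy: {model.score(X_test, y_test):.2%}")

# Save the trained model so the app can load it later
pickle.dump({'model': model}, open('model.p', 'wb'))
```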
The trained model is saved as a pickle file (`model.p`). It predicts hand signs based on MediaPipe's hand landmark detection.
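A matching sketch of loading the pickled model at inference time; the `{'model': ...}` dict layout mirrors the training sketch above and is an assumption.

```python
import pickle

# Load the trained classifier; the {'model': ...} layout is assumed.
with open('model.p', 'rb') as f:
    model = pickle.load(f)['model']
```

To get started, follow these steps: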
1. Clone the repository:
```bash
git clone https://github.com/vaibhavkothari33/Hackfest.git
```
2. Install the required dependencies:
```bash
pip install -r requirements.txt
```
3. Run the FastAPI server:
```bash
uvicorn app:app --reload
```
Here's an example of how gestures are detected and processed:
```python
results = hands.process(frame_rgb)
if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
        # Build the feature vector of (x, y) landmark coordinates
        # (this construction of data_aux is shown here for completeness)
        data_aux = []
        for landmark in hand_landmarks.landmark:
            data_aux.extend([landmark.x, landmark.y])
        # Predict the gesture and map the class index to its label
        prediction = model.predict([np.asarray(data_aux)])
        predicted_character = labels_dict[int(prediction[0])]
```
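For context, here is a sketch of the setup this snippet assumes: a webcam loop feeding RGB frames into MediaPipe Hands. The parameter values and window name are illustrative.

```python
import cv2
import mediapipe as mp

# Illustrative setup; parameter values are assumptions.
hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=1)
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    # ... run the detection and prediction code shown above ...
    cv2.imshow('Sanjeevan', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```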
Here's an example of how speech-to-text works:
```python
import speech_recognition as sr

def speech_to_text():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Please say something...")
        audio = recognizer.listen(source)
        try:
            text = recognizer.recognize_google(audio)
            print(f"Recognized text: {text}")
            return text
        except sr.UnknownValueError:
            print("Sorry, I did not understand that.")
        except sr.RequestError:
            print("Sorry, I couldn't request results; check your network connection.")
    return None
```
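Example usage (requires a microphone plus the SpeechRecognition and PyAudio packages; the Google recognizer also needs a network connection):

```python
text = speech_to_text()
if text:
    print(f"Text to show the signing user: {text}")
```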
To play videos in sequence (used to show the user videos for learning sign language):
```python
import cv2

def play_videos_in_sequence(video_paths):
    for video_path in video_paths:
        cap = cv2.VideoCapture(video_path)
        if not cap.isOpened():
            print(f"Error opening video file {video_path}")
            continue
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break
            frame = cv2.resize(frame, (640, 360))  # Smaller frame size for faster rendering
            cv2.imshow('Video', frame)
            if cv2.waitKey(10) & 0xFF == ord('q'):  # 10 ms delay for faster playback; 'q' quits
                cap.release()
                cv2.destroyAllWindows()
                return
        cap.release()
    cv2.destroyAllWindows()
```
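Hypothetical usage; the file names below are illustrative, not from the repo:

```python
play_videos_in_sequence(['signs/hello.mp4', 'signs/thank_you.mp4'])
```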
- Gen AI: for real-time sign language generation
You can check out Sanjeevan at the Sanjeevan Demo.
The app that we built is not currently live because, as students, it costs us a lot to keep it hosted and to handle such a large dataset. We are sorry for any inconvenience caused. However, we do have a very simple live app, made by us, that solves the same problem; the only difference is that it supports 7 hand signs.
Rest assured, the GitHub links are working: you can visit the site, and the video calling application is also running.
Working video calling application: Sanjeevan Demo
Simple Sanjeevan 1.0: Sanjeevan
GitHub repo: GitHub