Feb 27, 2025 · MIT License

Building My Own Jarvis: An AI Adventure

ChatGPT
Coffee
Jarvis
Nestjs
OpenAI API
Tutorial

@jeremy.brunel.fullst

Introduction

In this project, I set out to create my own AI assistant, inspired by Tony Stark’s Jarvis. The goal was to integrate OpenAI’s tools into a NestJS backend to develop an intelligent, interactive assistant capable of responding to user queries through both text and speech.

Inspiration

The idea stemmed from two main motivations:

Personal Growth: A desire to explore AI integration within my IT projects.
Real-World Use Cases: Seeing a friend use ChatGPT to prep for interviews sparked the realization that AI could enhance speech practice, debate preparation, and even casual conversations.

Technical Overview

Core Technologies

NestJS: The backbone of the project, providing a structured and modular framework.
OpenAI API: Powering AI-driven responses.
Prompt Engineering: Carefully crafting AI prompts to improve interactions.

System Architecture

The project is built with multiple NestJS services working together:

ChatOpenaiConnectorService: Connects to OpenAI's API to retrieve AI responses.
ChatHistoryManager & ChatSession: Maintain conversation history across sessions.
ChatGptApiService: Manages chat flow, including session creation and AI response retrieval.
TextToSpeechService & SpeechToTextService: Convert text to speech and speech to text (speech-to-text is planned for a future update).
ChatSecurityService: Filters AI responses to ensure they adhere to security guidelines.
SpeakerService: The main orchestrator that integrates all the above services into a unified AI assistant.

Key Functionalities

Chat Sessions

Users can initiate a chat session with different AI personalities. Each session is identified by a UUID, which keeps interactions personalized and continuous.
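A minimal sketch of how UUID-keyed sessions with their own history might look. The class internals, the `ChatMessage` shape, and the idea of storing the personality as a system prompt are assumptions for illustration, not the project's actual code. The `simpleUuid` helper is a dependency-free stand-in for Node's `crypto.randomUUID()`.

```typescript
// Hypothetical shapes; the real project may model these differently.
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Stand-in for crypto.randomUUID() so the sketch has no runtime dependencies.
// Not cryptographically secure; prefer crypto.randomUUID() in real code.
function simpleUuid(): string {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    const v = c === "x" ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

class ChatSession {
  readonly id: string;
  private readonly messages: ChatMessage[] = [];

  constructor(personalityPrompt: string, id: string = simpleUuid()) {
    this.id = id;
    // Assumption: the chosen personality is stored as the system prompt.
    this.messages.push({ role: "system", content: personalityPrompt });
  }

  add(role: ChatRole, content: string): void {
    this.messages.push({ role, content });
  }

  // Return a copy of the array so callers cannot push into the stored history.
  history(): ChatMessage[] {
    return [...this.messages];
  }
}

class ChatHistoryManager {
  private readonly sessions = new Map<string, ChatSession>();

  create(personalityPrompt: string): ChatSession {
    const session = new ChatSession(personalityPrompt);
    this.sessions.set(session.id, session);
    return session;
  }

  get(id: string): ChatSession {
    const session = this.sessions.get(id);
    if (!session) throw new Error(`Unknown session: ${id}`);
    return session;
  }
}
```

Looking a session up by its ID on every request is what lets a conversation continue where it left off.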

OpenAI API Integration

NestJS communicates with OpenAI using the ChatOpenAI model and stores the responses in a structured manner.
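As a rough illustration of what such a connector does underneath, the sketch below talks to OpenAI's Chat Completions REST endpoint directly rather than through a client library. The function names, the `PostJson` transport type, and the `gpt-4o-mini` model name are assumptions, not the project's code.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequestSketch {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the POST request for OpenAI's Chat Completions endpoint.
function buildChatRequest(
  apiKey: string,
  messages: ChatMessage[],
  model: string = "gpt-4o-mini" // assumed model name
): HttpRequestSketch {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Extract the assistant message from the raw JSON response body.
function parseChatReply(raw: string): ChatMessage {
  const parsed = JSON.parse(raw) as {
    choices?: { message?: { content?: string | null } }[];
  };
  const content = parsed.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Unexpected Chat Completions response shape");
  }
  return { role: "assistant", content };
}

// The transport is injected so the sketch runs without network access or a real key.
type PostJson = (request: HttpRequestSketch) => Promise<string>;

async function fetchAiReply(
  post: PostJson,
  apiKey: string,
  messages: ChatMessage[]
): Promise<ChatMessage> {
  const raw = await post(buildChatRequest(apiKey, messages));
  return parseChatReply(raw);
}
```

Returning the reply as a `ChatMessage` means it can be appended to the session history unchanged, which is one way to keep responses "structured".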

Text-to-Speech

Users can receive AI-generated responses in audio format. This is handled via an API call to OpenAI’s Text-to-Speech (TTS) service.
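A hedged sketch of the request such a service might send. OpenAI's speech endpoint takes a JSON body with `model`, `input`, and `voice`, and responds with audio bytes. The `tts-1` model, the `alloy` voice, and the function name are illustrative choices, not confirmed details of the project.

```typescript
type SpeechFormat = "mp3" | "opus" | "aac" | "flac" | "wav" | "pcm";

interface SpeechRequestSketch {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the POST request for OpenAI's text-to-speech endpoint.
function buildSpeechRequest(
  apiKey: string,
  text: string,
  voice: string = "alloy", // assumed default voice
  format: SpeechFormat = "mp3"
): SpeechRequestSketch {
  if (text.trim().length === 0) {
    throw new Error("Nothing to speak");
  }
  return {
    url: "https://api.openai.com/v1/audio/speech",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "tts-1", // assumed model
      input: text,
      voice,
      response_format: format,
    }),
  };
}
```

The response body is binary audio rather than JSON, so a NestJS controller would typically stream or buffer it back to the client with a matching audio content type.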

Chat Security

AI responses are screened for inappropriate content using ChatSecurityService, which relies on OpenAI for content moderation. The security rules are defined in a system prompt loaded at runtime.
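One way this screening could work, sketched under assumptions: the rules string stands in for the system prompt loaded at runtime, and the ALLOW/BLOCK verdict protocol, the function names, and the fallback text are hypothetical rather than taken from the project.

```typescript
interface ModerationMessage {
  role: "system" | "user";
  content: string;
}

// Combine the runtime-loaded rules with the candidate reply to be judged.
function buildModerationMessages(rules: string, candidate: string): ModerationMessage[] {
  return [
    // Hypothetical protocol: the moderator model must answer ALLOW or BLOCK.
    { role: "system", content: `${rules}\nAnswer with exactly ALLOW or BLOCK.` },
    { role: "user", content: candidate },
  ];
}

// Fail closed: anything other than a clean ALLOW is treated as a block.
function isAllowed(verdict: string): boolean {
  return verdict.trim().toUpperCase() === "ALLOW";
}

// Return the candidate reply only if the moderator's verdict allows it.
function screenReply(
  candidate: string,
  verdict: string,
  fallback: string = "Sorry, I can't help with that."
): string {
  return isAllowed(verdict) ? candidate : fallback;
}
```

Failing closed means a malformed or ambiguous verdict from the moderator model never lets an unscreened response through.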

SpeakerService

The SpeakerService ties everything together. It lets users select an AI personality and interact via text or speech, and it ensures responses pass through security checks.
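The orchestration described here can be sketched as a small class that chains three collaborators. The `ChatPort`, `SecurityPort`, and `SpeechPort` interfaces and the `SpeakerSketch` name are hypothetical stand-ins for the project's services, and plain constructor arguments replace NestJS dependency injection to keep the example self-contained.

```typescript
// Hypothetical ports standing in for the chat, security, and TTS services.
interface ChatPort {
  reply(sessionId: string, userText: string): Promise<string>;
}
interface SecurityPort {
  check(text: string): Promise<boolean>;
}
interface SpeechPort {
  synthesize(text: string): Promise<Uint8Array>;
}

interface SpokenReply {
  text: string;
  audio?: Uint8Array;
}

class SpeakerSketch {
  private readonly chat: ChatPort;
  private readonly security: SecurityPort;
  private readonly speech: SpeechPort;
  private readonly fallback: string;

  constructor(
    chat: ChatPort,
    security: SecurityPort,
    speech: SpeechPort,
    fallback: string = "Sorry, I can't answer that."
  ) {
    this.chat = chat;
    this.security = security;
    this.speech = speech;
    this.fallback = fallback;
  }

  async respond(sessionId: string, userText: string, spoken: boolean): Promise<SpokenReply> {
    // 1. Ask the chat service for a candidate reply within the session.
    const candidate = await this.chat.reply(sessionId, userText);
    // 2. Screen the candidate; blocked replies are swapped for the fallback.
    const text = (await this.security.check(candidate)) ? candidate : this.fallback;
    // 3. Synthesize audio only when the caller asked for a spoken reply.
    if (!spoken) return { text };
    return { text, audio: await this.speech.synthesize(text) };
  }
}
```

Running the security check before speech synthesis means blocked content is never turned into audio, and the same `respond` path can serve both the text and the verbal endpoints.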

Demo

Creating a session: Users initiate a session with a personality choice.
Interacting via text: Sending text messages to receive AI responses.
Interacting via speech: Sending messages through the verbal endpoint to receive spoken responses.
Security validation: Ensuring AI responses comply with predefined rules.

Medium article: https://medium.com/@jeremy.brunel.fullstack/building-my-own-jarvis-an-ai-adventure-7c9bcfb74dc0
GitHub project: https://github.com/jbfullstack/home-made-jarvis-backend
