As a developer, I’ve always been passionate about creating tools that solve real-world problems. But there was one issue that consistently irked me: the never-ending stream of repetitive emails. Whether it was customer inquiries, tech support requests, or project updates, my inbox overflowed with similar questions day in and day out. It was like Groundhog Day, but with email threads.
Picture this: You’re sipping your morning coffee, already diving into some exciting coding challenges, and suddenly, ping! —another email lands in your inbox. It’s the same query you’ve answered a hundred times before. You sigh, type in your well-crafted response, and hit send. Rinse and repeat. It’s not just time-consuming; it’s soul-draining, because it completely pulls your focus away from your coding work.
One fateful afternoon, as I stared at my screen, contemplating the meaning of life (and another email), it hit me: Why not build an AI agent that assists with these repetitive tasks?
Of course, I could just use ChatGPT, but that's a no-go at my company for data privacy reasons. It's also distracting and time-consuming to navigate to the website, copy and paste the source email, ask ChatGPT to write an answer, and copy and paste the reply back into your email application.
The data protection part is easily solvable: with today's modern CPUs and GPUs, it is possible to use Ollama and host the inference of an AI model of your choice locally at reasonable speed.
But how do you integrate an agent for Ollama into your OS?
The DeepL Windows desktop app came to mind: you just hit Ctrl+C twice and the app instantly translates the text you selected.
So my idea was to create a daemon that watches the clipboard for changes and then sends the content, along with a task description, to Ollama.
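The real ClippyAI is a .NET/Avalonia application, but the core idea of the daemon fits in a few lines. The sketch below is an illustrative Python polling loop with made-up function names: the clipboard reader and the change handler are injected, so the loop itself stays platform-independent.

```python
import time

def watch_clipboard(read_clipboard, on_change, poll_interval=0.5, max_polls=None):
    """Poll the clipboard and call on_change whenever its content changes.

    read_clipboard and on_change are injected callables (hypothetical names);
    max_polls exists only so the sketch can terminate in a test.
    """
    last_seen = read_clipboard()
    polls = 0
    while max_polls is None or polls < max_polls:
        current = read_clipboard()
        if current != last_seen:
            # Here the real app would send the text plus a task
            # description to Ollama and handle the response.
            on_change(current)
            last_seen = current
        time.sleep(poll_interval)
        polls += 1
```

In the actual app, `on_change` would assemble the prompt (clipboard text plus task description) and hand it to the Ollama backend.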
ClippyAI wouldn’t just suggest — I wanted it to take action. When I hit the shortcut keys, it would automatically type out the response. Imagine the joy of watching ClippyAI do the grunt work while I sipped my coffee.
I chose the name ClippyAI as a mixture of "Clipboard" and "AI"; it was also inspired by the nostalgic Microsoft Office paperclip that everyone loved to hate back in the day.
Because I'm mostly a .NET developer, I used .NET as the foundation. I wanted a cross-platform application, since I use Windows at work and Linux at home, so I chose the Avalonia framework for this project.
After I realized that the main idea was working well, I extended ClippyAI's tasks beyond answering emails: it can also explain or translate the copied text, or even perform custom, user-defined tasks with it.
The only bottleneck so far was the execution speed of LLM inference: running larger models in Ollama at an adequate speed requires at least a modern CPU, or better, a dedicated GPU.
That's when the idea of using an embedding LLM along with a vector database came to mind. The database could serve as a cache for the most common answers, so they would not always have to be generated from scratch.
So I integrated an embedding LLM (nomic-embed-text) hosted by Ollama, a PostgreSQL vector database, and pg.ai to create a system that caches embeddings of template answers in a vector database. This setup allows for rapid retrieval of templates for similar questions or tasks, significantly reducing response times to only a few seconds on modern CPUs.
Pg.ai provides the command ollama_embed, which sends text to the embedding LLM nomic-embed-text hosted by Ollama and returns a vector with 768 dimensions. Such a vector can be thought of as a compressed semantic description of the text. When you compare the resulting vectors of two different text inputs by their Euclidean distance, the meaning is the decisive factor, not the similarity of words or characters as it would be with classical string distance functions like the Levenshtein algorithm.
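To make the distance comparison concrete, here is a small Python sketch. The three-dimensional vectors below are made-up stand-ins for real 768-dimensional nomic-embed-text embeddings; with real embeddings, two paraphrases of the same question land close together while an unrelated text ends up far away.

```python
import math

def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy stand-ins for real embedding vectors (values are invented):
reset_password = [0.9, 0.1, 0.0]   # "How do I reset my password?"
forgot_login   = [0.8, 0.2, 0.1]   # "I forgot my login credentials."
lunch_menu     = [0.0, 0.1, 0.9]   # "What's on the lunch menu today?"

print(euclidean_distance(reset_password, forgot_login))  # small: similar meaning
print(euclidean_distance(reset_password, lunch_menu))    # large: unrelated
```

A Levenshtein comparison of the raw strings would see little similarity between the two password questions, because they share few characters; the embedding distance captures that they mean the same thing.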
Now, instead of being passed directly to a generative LLM hosted by Ollama, the clipboard data and the task are first sent to a PostgreSQL database.
Embeddings are high-dimensional vectors that represent the semantic meaning of text. A vector database stores these embeddings to enable rapid similarity searches.
To use this concept, we must first fill our vector database with data.
Before storing, we concatenate the clipboard data with the task description in the variable @question. Then we calculate the embedding vector for @question and store it together with the answer generated by the general-purpose LLM:
INSERT INTO clippy (question, answer, embedding_question)
SELECT
    @question,
    @answer,
    ai.ollama_embed('nomic-embed-text', @question);
From the ClippyAI GUI, this function is executed when the thumbs-up button is clicked, or when the Store all responses as embeddings mode is active.
Embeddings that should not serve as templates can be removed by clicking the thumbs-down button.
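Client-side, the store step boils down to one parameterized statement, since pg.ai computes the embedding inside the database. The Python sketch below is illustrative: `store_template` is a hypothetical name, and it assumes a psycopg-style cursor with `%(name)s` parameter binding.

```python
def store_template(cur, question, answer):
    """Cache a generated answer as a reusable template.

    pg.ai computes the embedding inside the INSERT, so the client
    only ships the raw text. `cur` is assumed to be a psycopg-style
    database cursor (hypothetical setup, not ClippyAI's actual code).
    """
    cur.execute(
        "INSERT INTO clippy (question, answer, embedding_question) "
        "SELECT %(q)s, %(a)s, ai.ollama_embed('nomic-embed-text', %(q)s)",
        {"q": question, "a": answer},
    )
```

Keeping the embedding call in SQL means the client never has to talk to the embedding model directly; PostgreSQL and pg.ai handle that round trip.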
By using the pg.ai extension, the database can send requests to Ollama directly from a SQL query:
SELECT
    answer,
    embedding_question <-> ai.ollama_embed('nomic-embed-text', @question) AS distance
FROM clippy
WHERE embedding_question <-> ai.ollama_embed('nomic-embed-text', @question) <= @threshold
ORDER BY distance;
The <-> operator calculates the distance between two vectors, and we only keep results that are below the user-specified @threshold variable.
Finally, we order the result set ascending by distance, so the closest match comes first.
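The same nearest-template lookup can be simulated in memory. The sketch below is a stand-in for the pgvector `<->` query, using invented toy vectors; `find_templates` is a hypothetical helper, not part of ClippyAI.

```python
import math

def find_templates(cache, query_vec, threshold):
    """Return cached answers whose question embedding lies within
    `threshold` of the query embedding, closest first -- an in-memory
    stand-in for the pgvector `<->` query."""
    def dist(vec):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(vec, query_vec)))
    hits = sorted((dist(vec), answer) for vec, answer in cache)
    return [answer for d, answer in hits if d <= threshold]

# Tiny fake cache of (embedding, stored answer) pairs with toy vectors:
cache = [
    ([0.9, 0.1], "Please reset your password via the self-service portal."),
    ([0.5, 0.5], "The project update meeting is every Monday at 10."),
    ([0.0, 1.0], "Our office closes at 18:00."),
]
print(find_templates(cache, [0.85, 0.15], threshold=0.5))
```

Tightening the threshold narrows the results to only the closest template, which is exactly the knob the user turns in ClippyAI's settings.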
On the GUI side, users can scroll through the answers by pressing the >> button.
Before submitting a task:
After submitting a task:
While ClippyAI shows some promise, it’s essential to note that it’s still in its early development phase. As with any cutting-edge technology, there are risks involved. Here’s what you should be aware of:
Use it at your own risk: ClippyAI is experimental. It may occasionally produce unexpected results or errors. Always double-check the generated content before finalizing it.
Document safety: ClippyAI may unintentionally delete or overwrite your existing documents if you are using the keyboard output and Auto-Mode. So be careful where you place your cursor!
Developers Wanted: ClippyAI is an open-source project that is open for contributions from developers like you. If you want to join the development, clone the repo and submit a pull request!
In my project, I utilized several powerful tools to build the AI system: .NET with the Avalonia UI framework, Ollama with the nomic-embed-text embedding model, PostgreSQL as a vector database, and the pg.ai extension.
ClippyAI is still a work in progress. It won’t win any Turing Awards yet, but it’s a little side project I want to extend further. So, the next time you receive a prompt reply from me, know that ClippyAI is doing its thing. And if it ever goes rogue, blame the coffee.
Disclaimer: ClippyAI may occasionally channel its inner HAL 9000. Use at your own risk.