MIT License

Backseat Driver Kid - a curious toddler, designed to explore data analysis with privacy in mind

Abstract

The Backseat Driver Kid project is a CLI tool built as an exercise in working with Large Language Models (LLMs) such as Llama2, using the Ollama framework and the LangChainGo library. The project focuses on developing smart applications that respect privacy by analyzing data without exposing sensitive information to the internet.

Methodology

Key Design Choices

  1. CLI-Based Interaction
    The tool interacts with users through a command-line interface. Users provide input parameters, including a YAML configuration file that specifies the interaction context and a list of prompts for the system to process. Keeping the interaction on the command line helps the tool stay flexible and easy to use, even as requirements evolve.

  2. Domain-Driven Design (DDD)
    The project applies DDD principles to structure the code. This helps ensure that the system can easily adapt to new requirements, such as switching between models or integrating different libraries and frameworks. The domain is designed around the interaction between the system and users, with clear boundaries between different parts of the system, such as:

    • Model: The language model (e.g., Llama2)
    • Prompt Handling: Processing and managing user prompts
    • Knowledge Management: Managing knowledge files and external data sources

  3. Flexibility in Models and Frameworks
    One of the main goals is to maintain the flexibility to switch models and libraries easily. This is achieved by decoupling the model implementation from the rest of the system. The tool is designed to allow easy integration of different models and frameworks by simply modifying the CLI parameters or configuration files.

  4. Output Format
    The output of the tool is designed as a Markdown document, making it easy to share and review the results. This format is ideal for scenarios where large amounts of data need to be analyzed, as it provides a structured and readable way to present the insights generated from the model's responses.

  5. Question-Answer Mechanism
    The system is designed to handle a list of questions and generate answers based on the provided knowledge. This is particularly useful for analyzing large datasets, identifying patterns, bugs, and issues. By organizing the knowledge and questions in a structured way, the tool facilitates the analysis of large volumes of records, helping users to focus on the most important insights.

Flexibility and Future Extensions

The project is structured so that new features are easy to add, such as supporting additional models, implementing Retrieval-Augmented Generation (RAG, not yet implemented) for better data retrieval, or extending the question-answering capabilities. This flexibility allows the project to evolve to meet future needs and incorporate new advancements in machine learning and data analysis.

Results

The results of the Backseat Driver Kid project are based on the interaction between the system and the provided knowledge, processed via a set of predefined prompts. The system uses a YAML configuration file to guide the process and outputs the analysis as a Markdown document.

Input

The input for the system consists of a YAML configuration file that defines the interaction context, instructions, and a list of prompts. This file also specifies the knowledge base to be used for the analysis.

You can view the YAML configuration file used for this run here:
Interaction Configuration (YAML)

Output

After processing the input and prompts, the system generates a Markdown document that summarizes the findings and provides answers to the questions posed in the configuration file. The output is designed to be readable and informative, serving as a document that can be used to analyze large volumes of records and identify patterns, issues, and bugs.
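One way to produce such a document is sketched below in Go. The source only states that the output is Markdown, so the layout used here (a title heading followed by one numbered section per question) and the names QA and RenderMarkdown are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// QA holds one question and the answer generated for it.
type QA struct {
	Question string
	Answer   string
}

// RenderMarkdown writes a title followed by one numbered section per question.
// The layout is an assumption; the project's real report may be shaped differently.
func RenderMarkdown(title string, items []QA) string {
	var b strings.Builder
	fmt.Fprintf(&b, "# %s\n", title)
	for i, it := range items {
		// Trim stray whitespace that model replies often carry.
		fmt.Fprintf(&b, "\n## %d. %s\n\n%s\n", i+1, it.Question, strings.TrimSpace(it.Answer))
	}
	return b.String()
}

func main() {
	report := RenderMarkdown("Analysis Output", []QA{
		{Question: "Which errors repeat most often?", Answer: "Timeouts in the upload step.\n"},
		{Question: "Are there signs of a bug?", Answer: "  Possibly: retries never back off. "},
	})
	fmt.Print(report)
}
```

Keeping rendering in a pure function of the collected answers means the same results could later be written to a file, printed to the terminal, or rendered in another format without touching the question-answer logic.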

You can view the output generated for this run here:
Analysis Output (Markdown)
