Code: https://github.com/mktcode/openai-tool-runner
Demo: https://markus-kottlaender.de/#anfrage (German)
This package is a wrapper around the OpenAI API that lets you replace the baseURL, so you can also use it with Ollama and compatible models. So far I have only tested it with GPT-4o.
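For example, pointing it at a local Ollama server might look like this (I'm assuming the option is called `baseURL`; check the repo for the exact name):

```ts
import { createFreeRunner } from 'openai-tool-runner'

// Assumption: the endpoint override option may be named differently in the package.
const runner = createFreeRunner({
  apiKey: 'ollama',                     // local servers typically accept any key
  baseURL: 'http://localhost:11434/v1', // Ollama's OpenAI-compatible endpoint
  systemMessage,                        // set up as in the examples below
  chatHistory,
  toolChain,
})
```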
Unlike the standard "Input -> Tool(s) -> Response" flow described in the OpenAI documentation, it can run tools in longer sequences without generating a response after each set of tool calls: "Input -> Tool(s) -> Tool(s) -> ... -> Response", where even the final response is itself a tool call.
With the free runner you define "stop tools" that end the chain of tool calls. You can provide a tool like "final_answer" and render its output in your frontend accordingly. You can also omit stop tools and let the runner run indefinitely.
The straight runner, by contrast, executes its tools as a fixed workflow, one after the other.
Using a free runner as the main chatbot, with additional free and straight runners as tools, can yield interesting results.
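Here's the free runner in action. The tools imported from `./tools` are application-specific; a sketch of what such a tool might look like follows after the two examples.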
```ts
import { createFreeRunner, ToolChain, createSystemMessage, createUserMessage } from 'openai-tool-runner'
import { planResearchTool, webSearchTool, provideFinalAnswerTool, askUserTool } from './tools'

const systemMessage = createSystemMessage(`You are...`)
const chatHistory = [createUserMessage(`What is...`)]

const toolChain = new ToolChain({
  tools: [
    planResearchTool,
    webSearchTool,
    provideFinalAnswerTool,
    askUserTool,
  ],
  // calling one of these tools ends the chain
  stopWhen: [
    provideFinalAnswerTool,
    askUserTool,
  ],
})

const runner = createFreeRunner({ systemMessage, chatHistory, toolChain })

for await (const message of runner()) {
  console.info(message)
}
```
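The straight runner looks almost the same, just without stop tools: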
```ts
import { createStraightRunner, ToolChain, createSystemMessage, createUserMessage } from 'openai-tool-runner'
import { searchTool, analyzeTool, provideFinalAnswerTool } from './tools'

const systemMessage = createSystemMessage(`You are...`)
const chatHistory = [createUserMessage(`What is...`)]

// no stop tools: the tools are simply executed in sequence
const toolChain = new ToolChain({
  tools: [
    searchTool,
    analyzeTool,
    provideFinalAnswerTool,
  ],
})

const runner = createStraightRunner({ systemMessage, chatHistory, toolChain })

for await (const message of runner()) {
  console.info(message)
}
```
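The tools imported from `./tools` above are application-specific. I'm not reproducing the package's actual `Tool` interface here; as a rough, hypothetical sketch (property and method names are my assumptions, check the repo for the real interface), a tool pairs an OpenAI function definition with the code that executes it:

```ts
// Hypothetical shape; the actual interface in openai-tool-runner may differ.
export const webSearchTool = {
  // the function definition shown to the model, as in OpenAI's function calling docs
  definition: {
    name: 'web_search',
    description: 'Search the web for up-to-date information.',
    parameters: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'The search query.' },
      },
      required: ['query'],
    },
  },
  // runs when the model calls the tool; the result is fed back into the chain
  async run({ query }: { query: string }): Promise<string> {
    const results = await someSearchApi(query) // placeholder for a real search backend
    return JSON.stringify(results)
  },
}
```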
Instead of individual tokens, you can only stream entire messages. I believe token streaming is not that relevant in many situations; what matters more is streaming tool calls that take significant time, like API calls. The runners are async generators that yield each message, including tool calls, as soon as it is available.
I use Nuxt 3 for my frontend, which builds on h3 and its handy `sendIterable` function: you can pass the runner straight to it. For Next.js and other frameworks there should be similar solutions. Here's part of an endpoint in my application:
```ts
export default defineEventHandler(async (event) => {
  const { chatHistory }: { chatHistory: AgentMessage[] } = await readBody(event)
  const { openaiApiKey, tavilyApiKey } = useRuntimeConfig(event)

  const systemMessage = createSystemMessage(`You are ...
  Today's date: ${new Date().toISOString().slice(0, 16)}
  Your knowledge cutoff: 2023-10`)

  const webSearchTool = new WebSearchTool(tavilyApiKey)
  const askWebsiteTool = new AskWebsiteTool(openaiApiKey)
  const provideFinalAnswerTool = new ProvideFinalAnswerTool()

  const toolChain = new ToolChain({
    tools: [
      webSearchTool,
      askWebsiteTool,
      provideFinalAnswerTool,
    ],
    stopWhen: [
      provideFinalAnswerTool,
    ],
  })

  // the runner is an async generator, so it can be passed straight to sendIterable
  return sendIterable(event, createFreeRunner({
    apiKey: openaiApiKey,
    systemMessage,
    chatHistory,
    toolChain,
  }))
})
```
You can read the messages using the provided `readToolStream` function:
```ts
import { readToolStream } from 'openai-tool-runner'

let loading = true
const chatHistory = []

const stream = await fetch('/api/agent', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ chatHistory }),
}).then((res) => res.body.getReader())

readToolStream(
  stream,
  (message) => chatHistory.push(message), // called for each streamed message
  () => (loading = false),                // called once the stream has finished
)
```
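In a Vue or Nuxt component, `chatHistory` would typically be a reactive `ref`, so each pushed message renders as soon as it arrives, while `loading` can drive a spinner until the final answer comes in.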