Online grocery shopping voice agent
Abstract
This project demonstrates an agent-based voice interface to Picnic, a Dutch online supermarket that delivers groceries directly to customers’ doors. As a fan of their service, I wanted to build a fun and practical AI-powered assistant on top of Picnic’s API, without relying on large agent frameworks like LangGraph or Swarm.
For the AI model, I selected Google’s Gemini 2.0 Flash, a multimodal model that can process and generate both text and voice natively. The project shows how a lightweight, custom-built agent can interact with Picnic’s platform and streamline grocery shopping.
Methodology
Since Picnic’s API is not publicly available, I used a combination of Genymotion (an Android emulator) and Charles Proxy (a web debugging tool) to intercept API calls made by the Picnic Android app. This approach provided programmatic access to all in-app functionalities, including product search, recipe search, and cart management.
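As a rough illustration of how a call seen in the proxy can be rebuilt in code, the sketch below assembles an HTTP request from a method, path, auth token, and optional query or JSON body. The host, path, header name, and query parameter are hypothetical placeholders, not values taken from Picnic's API; the real ones would come from the captured traffic. Nothing is sent over the network here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholders: the real host, paths, and header names are
# whatever the proxy capture shows. None of these values come from Picnic.
BASE_URL = "https://api.example.invalid/v1"
AUTH_HEADER = "x-example-auth"


def build_request(method, path, token, query=None, body=None):
    """Rebuild one captured call as a urllib Request (nothing is sent here)."""
    url = f"{BASE_URL}{path}"
    if query:
        url += "?" + urlencode(query)
    # A JSON body is only attached when the captured call had one.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {AUTH_HEADER: token}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


# Example: a search-style GET call with an assumed "term" query parameter.
search = build_request("GET", "/search", token="TOKEN", query={"term": "oat milk"})
print(search.get_method(), search.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would actually send it. Keeping construction separate from sending makes each rebuilt call easy to inspect first.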
I then converted the necessary API calls into a set of callable functions (tools), enabling structured interactions with Picnic’s services. The key tools I implemented are:
- search_for_products – Find specific products in the Picnic store.
- add_product_to_cart – Add a selected product to the shopping cart.
- remove_product_from_cart – Remove an item from the cart.
- search_for_recipes – Browse recipes available on Picnic.
- add_recipe_to_cart – Add all ingredients from a recipe to the cart.
- search_for_cheaper_product_alternative – Identify cost-effective alternatives to a selected product.
- replace_existing_product – Swap a product in the cart for a different one.
- get_all_current_products_in_cart – Retrieve a list of all items currently in the shopping cart.
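A minimal sketch of how a few of these tools could be registered and dispatched by name follows. It uses an in-memory dictionary as a stand-in for the cart instead of real API calls. The parameter names (`product_id`, `count`), the declaration contents, and the `call_tool` helper are assumptions for illustration. The exact declaration fields the Gemini API expects should be checked against its documentation.

```python
from typing import Callable

# In-memory stand-in for the cart; a real tool would call the store's API.
CART: dict[str, int] = {}


def add_product_to_cart(product_id: str, count: int = 1) -> dict:
    CART[product_id] = CART.get(product_id, 0) + count
    return {"cart": dict(CART)}


def remove_product_from_cart(product_id: str, count: int = 1) -> dict:
    remaining = CART.get(product_id, 0) - count
    if remaining > 0:
        CART[product_id] = remaining
    else:
        CART.pop(product_id, None)  # drop the entry once nothing is left
    return {"cart": dict(CART)}


def get_all_current_products_in_cart() -> dict:
    return {"cart": dict(CART)}


# Registry keyed by tool name, so a model-issued call can be looked up directly.
TOOLS: dict[str, Callable[..., dict]] = {
    tool.__name__: tool
    for tool in (add_product_to_cart, remove_product_from_cart, get_all_current_products_in_cart)
}

# JSON-schema-style description of one tool (assumed shape, for illustration).
ADD_PRODUCT_DECLARATION = {
    "name": "add_product_to_cart",
    "description": "Add a selected product to the shopping cart.",
    "parameters": {
        "type": "object",
        "properties": {
            "product_id": {"type": "string"},
            "count": {"type": "integer"},
        },
        "required": ["product_id"],
    },
}


def call_tool(name: str, args: dict) -> dict:
    """Run a tool by name; return an error dict instead of raising."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return tool(**args)
    except TypeError as exc:  # wrong or missing arguments
        return {"error": str(exc)}


print(call_tool("add_product_to_cart", {"product_id": "p1", "count": 2}))
print(call_tool("remove_product_from_cart", {"product_id": "p1"}))
```

Returning errors as data instead of raising lets the model see what went wrong and retry with corrected arguments.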
For real-time interaction, I opened a socket connection to the Gemini 2.0 Flash API and used its built-in function calling to expose the tools above. Because the model is multimodal, it can take in and produce both text and voice, which makes the shopping assistant feel more natural to talk to.
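The control flow of such a function-calling loop can be sketched without the real connection. In the example below, `ScriptedSession` is a hypothetical stand-in that replays a fixed list of messages. It is not the Gemini SDK, and the message shapes (`type`, `id`, `name`, `args`) are assumptions chosen for illustration. The point is the loop itself: run a tool when the model asks for one, send the result back, and collect the model's replies until the turn ends.

```python
from collections import deque


class ScriptedSession:
    """Hypothetical stand-in for the live model connection (not the Gemini SDK)."""

    def __init__(self, script):
        self._incoming = deque(script)
        self.sent = []  # tool responses sent back to the "model"

    def receive(self):
        return self._incoming.popleft() if self._incoming else None

    def send_tool_response(self, call_id, result):
        self.sent.append({"id": call_id, "result": result})


def search_for_products(query):
    # Placeholder result; a real tool would query the store.
    return [{"id": "p1", "name": f"{query} (example)"}]


TOOLS = {"search_for_products": search_for_products}


def run_turn(session):
    """Handle messages until the turn ends; return the model's text replies."""
    replies = []
    while (message := session.receive()) is not None:
        if message["type"] == "tool_call":
            tool = TOOLS.get(message["name"])
            result = tool(**message["args"]) if tool else {"error": "unknown tool"}
            session.send_tool_response(message["id"], result)
        elif message["type"] == "text":
            replies.append(message["text"])
        elif message["type"] == "turn_complete":
            break
    return replies


session = ScriptedSession([
    {"type": "tool_call", "id": "1", "name": "search_for_products", "args": {"query": "oat milk"}},
    {"type": "text", "text": "I found one match for oat milk."},
    {"type": "turn_complete"},
])
print(run_turn(session))
print(session.sent)
```

With a live connection, the text branch would also handle audio chunks, and `receive` would read from the socket instead of a scripted queue.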
Results
To demonstrate the agent’s capabilities, I created a demo video showcasing the interaction. The video visually highlights the agent’s decision-making process and the actions it performs within the Picnic app.
Disclaimer
This was a fun, private project and is not associated in any way with Picnic or any other company or person. The idea and implementation are my own and are not intended for commercial use. Any other API could have served the same purpose; I chose Picnic’s simply because I am a big fan of their service.