We'll now "productionize" the Docker-based system from earlier using:

✅ Helm chart for easy installation and environment management
✅ Kubernetes manifests for API, worker, Redis, and Qdrant
✅ ConfigMaps / Secrets / autoscaling setup
✅ PersistentVolumeClaim for Qdrant storage

Everything will be ready to `helm install` on any cluster (EKS, GKE, AKS, or local minikube).
## ⚙️ Folder structure

```
helm-ai-assistant/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── api-deployment.yaml
│   ├── worker-deployment.yaml
│   ├── redis-deployment.yaml
│   ├── qdrant-deployment.yaml
│   ├── api-service.yaml
│   ├── redis-service.yaml
│   ├── qdrant-service.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   └── hpa.yaml
└── README.md
```
## 🧩 Chart.yaml

```yaml
apiVersion: v2
name: ai-business-assistant
description: Helm chart for AI Business Assistant (RAG + Multi-Agent System)
type: application
version: 0.1.0
appVersion: "1.0"
```
## ⚙️ values.yaml

```yaml
replicaCount:
  api: 2
  worker: 3

image:
  api: "yourdockerhubuser/ai-assistant-api"
  worker: "yourdockerhubuser/ai-assistant-worker"   # referenced by the worker Deployment

service:
  apiPort: 8000
  redisPort: 6379
  qdrantPort: 6333

resources:
  api:
    requests:            # requests are required for CPU-utilization autoscaling
      cpu: "250m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
  worker:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "1"
      memory: "1Gi"

env:
  CELERY_BROKER_URL: "redis://redis:6379/0"
  CELERY_RESULT_BACKEND: "redis://redis:6379/1"
  QDRANT_URL: "http://qdrant:6333"
  OPENAI_API_KEY: "changeme"   # override at install time; do not commit a real key

storage:
  qdrant:
    size: 1Gi
```
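Any of these values can be overridden at install time without editing the file; for example (using the release and chart names from the README later in this post):

```bash
# bump the worker fleet and inject the real API key at install time
helm install ai-assistant ./helm-ai-assistant \
  --set replicaCount.worker=5 \
  --set env.OPENAI_API_KEY="sk-..."
```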
## 🗝️ templates/secret.yaml

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-secret
type: Opaque
stringData:
  OPENAI_API_KEY: {{ .Values.env.OPENAI_API_KEY | quote }}
```
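To confirm the key landed in the cluster (Kubernetes base64-encodes `stringData` on write):

```bash
# decode the stored key; prints the plaintext value
kubectl get secret ai-secret -o jsonpath='{.data.OPENAI_API_KEY}' | base64 -d
```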
## ⚙️ templates/configmap.yaml

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ai-config
data:
  CELERY_BROKER_URL: {{ .Values.env.CELERY_BROKER_URL | quote }}
  CELERY_RESULT_BACKEND: {{ .Values.env.CELERY_RESULT_BACKEND | quote }}
  QDRANT_URL: {{ .Values.env.QDRANT_URL | quote }}
```
## 🚀 templates/api-service.yaml

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ai-api
spec:
  selector:
    app: ai-api
  ports:
    - protocol: TCP
      port: {{ .Values.service.apiPort }}
      targetPort: {{ .Values.service.apiPort }}
  type: LoadBalancer
```
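The folder layout also lists templates/api-deployment.yaml, whose manifest didn't make it into this post. A minimal sketch, modeled on the worker Deployment below; the uvicorn entrypoint and the `main:app` module path are assumptions about your API image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-api
spec:
  replicas: {{ .Values.replicaCount.api }}
  selector:
    matchLabels:
      app: ai-api
  template:
    metadata:
      labels:
        app: ai-api
    spec:
      containers:
      - name: api
        image: {{ .Values.image.api }}
        # assumed FastAPI entrypoint; adjust to your image's CMD
        command: ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
        ports:
        - containerPort: {{ .Values.service.apiPort }}
        envFrom:
        - configMapRef:
            name: ai-config
        - secretRef:
            name: ai-secret
        resources:
          {{- toYaml .Values.resources.api | nindent 10 }}
```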
## ⚙️ templates/worker-deployment.yaml

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-worker
spec:
  replicas: {{ .Values.replicaCount.worker }}
  selector:
    matchLabels:
      app: ai-worker
  template:
    metadata:
      labels:
        app: ai-worker
    spec:
      containers:
      - name: worker
        image: {{ .Values.image.worker }}
        command: ["celery", "-A", "tasks.celery", "worker", "--loglevel=info", "-Q", "retrieval,generation,critique,summarize,orch"]
        envFrom:
        - configMapRef:
            name: ai-config
        - secretRef:
            name: ai-secret
        resources:
          {{- toYaml .Values.resources.worker | nindent 10 }}
```
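Once the worker pods are running, you can confirm they registered against Redis and picked up all five queues (the `tasks.celery` app path matches the command above):

```bash
# ping every worker through the broker
kubectl exec deploy/ai-worker -- celery -A tasks.celery inspect ping
# list the queues each worker consumes
kubectl exec deploy/ai-worker -- celery -A tasks.celery inspect active_queues
```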
## ⚙️ templates/redis-service.yaml

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    app: redis
  ports:
    - protocol: TCP
      port: {{ .Values.service.redisPort }}
      targetPort: {{ .Values.service.redisPort }}
```
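The matching templates/redis-deployment.yaml isn't shown above either; a minimal sketch that assumes the official `redis:7` image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7   # assumed tag; pin an exact version in production
        ports:
        - containerPort: {{ .Values.service.redisPort }}
```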
## ⚙️ templates/qdrant-deployment.yaml

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: qdrant-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.storage.qdrant.size }}
```
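Only the PVC half of this file is shown above. A sketch of the Deployment that would accompany it, assuming the official qdrant/qdrant image (which keeps its data under /qdrant/storage):

```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qdrant
spec:
  replicas: 1
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
      - name: qdrant
        image: qdrant/qdrant:latest   # pin an exact version in production
        ports:
        - containerPort: {{ .Values.service.qdrantPort }}
        volumeMounts:
        - name: storage
          mountPath: /qdrant/storage   # Qdrant's default data directory
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: qdrant-pvc
```

templates/qdrant-service.yaml (not shown in this post) can mirror the Redis Service, selecting `app: qdrant` on `.Values.service.qdrantPort`.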
## ⚙️ templates/hpa.yaml

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
```
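The HPA can only compute CPU utilization if the cluster exposes the metrics API and the worker pods declare CPU requests (added in values.yaml above). Managed clusters usually ship metrics-server preinstalled; on minikube it's an addon:

```bash
# on minikube only: enable the metrics addon
minikube addons enable metrics-server
# observe current vs. target utilization and the replica count
kubectl get hpa ai-worker-hpa --watch
```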
## 📄 README.md

This chart deploys a production-ready RAG + Multi-Agent pipeline using Celery on Kubernetes.

1️⃣ Build and push the Docker images (api and worker)

2️⃣ Install

```bash
helm install ai-assistant ./helm-ai-assistant --set env.OPENAI_API_KEY="sk-..."
```

3️⃣ Get the service URL

```bash
kubectl get svc ai-api
```

Then open EXTERNAL-IP:8000 in your browser or call:

```bash
curl -X POST http://<EXTERNAL-IP>:8000/v1/query \
  -H "Content-Type: application/json" \
  -d '{"user_id":"u1","query":"AI trends in 2025"}'
```

4️⃣ Upgrade

```bash
helm upgrade ai-assistant ./helm-ai-assistant
```

5️⃣ Uninstall

```bash
helm uninstall ai-assistant
```

---

### 💡 Notes

- Qdrant uses a PersistentVolume for vector data.
- Workers scale automatically via the included HPA.
- The API is exposed as a LoadBalancer for external access.
- Add an ingress.yaml to route traffic through NGINX if required (a sketch follows at the end of this post).

---

## 🧠 Next Step Options

Would you like me to:

1. **Generate ready-to-publish Dockerfiles** for `api` and `worker` (so you can push to Docker Hub for the Helm deployment)?
2. Or directly **bundle this Helm chart as a `.tgz` package** you can upload to your cluster?

Which one should I prepare next?
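For reference, the ingress.yaml mentioned in the notes could look like the sketch below. It assumes the NGINX ingress controller is already installed in the cluster and uses a hypothetical ai.example.com hostname:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ai-api
spec:
  ingressClassName: nginx
  rules:
    - host: ai.example.com   # hypothetical hostname; replace with your domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ai-api
                port:
                  number: {{ .Values.service.apiPort }}
```

With an Ingress in front, you would typically switch the ai-api Service from `LoadBalancer` to `ClusterIP` so traffic enters only through the controller.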