RAG assistant uses LLM, vector DB, embeddings; system+summary prompts; save JSON history, display it