TASK: Multiple-Choice Question Answering
METRIC: Accuracy
BENCHMARK: Synthetic legal dataset on the Italian "Codice Penale" (Penal Code).
Develop a multi-agent system, based on an open-source LLM, that selects the correct answer to each question with the help of Web searches.
The aim of the project is to improve the LLM's performance without any training or fine-tuning.
Agent library: langroid
Open-source provider: Ollama
LLM: Phi-3.5
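
The accuracy metric named above can be sketched as the fraction of questions whose predicted option matches the gold option. This is a minimal illustration, not the project's evaluation code: the function name `accuracy`, the use of single option letters as labels, and the case-insensitive comparison are all assumptions.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Return the fraction of predictions that match the gold answers.

    Assumption: each answer is an option label such as "A", "B", "C" or "D".
    Labels are compared after stripping whitespace and ignoring case.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        p.strip().upper() == g.strip().upper()
        for p, g in zip(predictions, gold)
    )
    return correct / len(gold)


if __name__ == "__main__":
    # Hypothetical answers for four questions; three of them match.
    print(accuracy(["A", "b", "C", "D"], ["A", "B", "D", "D"]))
```

Normalizing the labels before comparing them keeps formatting differences in the LLM output (extra spaces, lowercase letters) from being counted as wrong answers.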
The multi-agent system is composed of 4 agents: