A Coach-Based Quality-Diversity Approach for Multi-Agent Interpretable Reinforcement Learning