Environmental Sound Classification Based on Vision Transformers