Official repository for the project comparing the SAINT (Self-Attention and Intersample Transformer) model with traditional models (LightGBM, XGBoost, CatBoost) and AutoML tools (AutoGluon, auto-sklearn) across 30 tabular datasets from the OpenML-CC18 benchmark.
This is the final project for the Machine Learning course (2025), with the following objectives:
To ensure full reproducibility, using a Conda environment is strongly recommended. It isolates the project's dependencies from other Python installations, so the package versions installed for this project do not conflict with anything else on your system.
First, create a new Conda environment. We will name it saint-project and use Python 3.9, which is compatible with all required libraries.
    # Create a new environment named 'saint-project' with Python 3.9
    conda create -n saint-project python=3.9 -y
Next, activate the newly created environment. You must do this every time you work on the project in a new terminal session.
    # Activate the environment
    conda activate saint-project
With the environment activated, install all the necessary libraries from the requirements.txt file. This single command will handle the entire installation process.
    # Install all required packages
    pip install -r requirements.txt
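As an optional sanity check after installation, a short script along the following lines can confirm which Python version is active and whether the main libraries were installed. This is a minimal sketch and not part of the repository: the file name `check_env.py`, the helper `check_packages`, and the list of distribution names are assumptions for illustration, and the list should be adjusted to match the actual contents of requirements.txt.

```python
# check_env.py (hypothetical helper, not shipped with the project)
import sys
from importlib import metadata

# Assumed distribution names for the libraries named in this README;
# adjust to match what requirements.txt actually lists.
ASSUMED_PACKAGES = ["lightgbm", "xgboost", "catboost", "autogluon", "auto-sklearn"]


def check_packages(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Report the interpreter version of the currently active environment.
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, version in check_packages(ASSUMED_PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it inside the activated environment with `python check_env.py`. Any package reported as NOT INSTALLED suggests that the installation step did not complete, or that the assumed name differs from the one in requirements.txt.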
You are now ready to run the project scripts. For example, to execute the main comparison script, you would run:
    # Example of how to run a script
    python total_hyper_comparator.py
By following these steps, you will have a clean and isolated environment with all the tools needed to reproduce the results of this analysis.
The detailed findings, methodology, and statistical analysis are available in the full project report and presentation slides.