This repo serves as a template for the exam assignment of the AutoML SS24 course at the University of Freiburg.
The aim of this repo is to provide a minimal installable template to help you get up and running.
To install the repository, first create an environment of your choice and activate it.
You can change the Python version in the commands below to the version you prefer.

**Virtual Environment (venv)**

```bash
python3 -m venv automl-vision-env
source automl-vision-env/bin/activate
```

**Conda Environment**

You can also use conda; this is left to individual preference.

```bash
conda create -n automl-vision-env python=3.11
conda activate automl-vision-env
```
Then install the repository by running the following command:

```bash
pip install -e .
```

You can test that the installation was successful by running:

```bash
python -c "import automl"
```
We place no restrictions on the Python version or libraries you use, but we recommend using Python 3.8 or higher.
We provide the following:
- `run.py`: A script that trains an AutoML system on the training split (`dataset_train`) of a given dataset and then generates predictions for the test split (`dataset_test`), saving those predictions to a file. For the training datasets, the test splits will contain the ground-truth labels, but for the test dataset, which we provide later, the labels of the test split will not be available. You will be expected to generate these labels yourself and submit them to us through GitHub Classrooms.
- `src/automl`: A Python package that is installed by the command above and contains your source code for whatever system you would like to build. We have provided a dummy `AutoML` class to serve as an example.
You are completely free to modify the code, install new libraries, make changes, and in general do whatever you want with it. The only requirement for the exam is that you can generate predictions for the test splits of our datasets in a `.npy` file that we can then use to give you a test score through GitHub Classrooms.
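As a minimal sketch of the expected output format (the integer class-index encoding here is an assumption; match whatever your `run.py` actually produces), predictions can be written to a `.npy` file like this:

```python
import numpy as np

# Hypothetical predictions: one integer class index per test sample.
# In practice this array comes from your AutoML system's predict step.
predictions = np.array([3, 1, 4, 1, 5], dtype=np.int64)

# Save in .npy format; np.load can read it back for verification.
np.save("preds-42-fashion.npy", predictions)

# Sanity check: the round-tripped array matches what was saved.
loaded = np.load("preds-42-fashion.npy")
assert np.array_equal(loaded, predictions)
```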
We selected three different vision datasets which you can use to develop your AutoML system, and we will provide you with a test dataset to evaluate your system at a later point in time. The datasets can be downloaded automatically by the respective dataset classes in `./src/automl/datasets.py`. The datasets are: fashion, flowers, and emotions.

If there are any problems downloading the datasets, you can download them manually from https://ml.informatik.uni-freiburg.de/research-artifacts/automl-exam-24-vision/ and place them in the `./data` folder after unzipping them.
The downloaded datasets will have the following structure:
```
./data
├── fashion
│   ├── images_test
│   │   ├── 000001.jpg
│   │   ├── 000002.jpg
│   │   ├── 000003.jpg
│   │   ...
│   ├── images_train
│   │   ├── 000001.jpg
│   │   ├── 000002.jpg
│   │   ├── 000003.jpg
│   │   ...
│   ├── description.md
│   ├── fashion.tgz
│   ├── test.csv
│   └── train.csv
├── emotions
...
```
Feel free to explore the images and the `description.md` files to get a better understanding of the datasets.
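For quick exploration, here is a hedged sketch of loading one split with pandas and Pillow. The CSV column names `image_file_name` and `label` are hypothetical assumptions about the structure; check the actual CSV headers and the dataset classes in `./src/automl/datasets.py` for the real loading logic.

```python
from pathlib import Path

import pandas as pd
from PIL import Image


def load_split(dataset_dir: str, split: str = "train"):
    """Load one split of a downloaded dataset.

    Assumes <split>.csv pairs an image-filename column with a label
    column (hypothetical names 'image_file_name' and 'label') and
    that the images live in images_<split>/.
    """
    root = Path(dataset_dir)
    df = pd.read_csv(root / f"{split}.csv")
    images = [
        Image.open(root / f"images_{split}" / name)
        for name in df["image_file_name"]
    ]
    return images, df["label"].tolist()
```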
The following table provides an overview of their characteristics, along with a reference value for the accuracy that a naive AutoML system could achieve on these datasets:

| Dataset name | # Classes | # Train samples | # Test samples | # Channels | Resolution | Reference accuracy |
|---|---|---|---|---|---|---|
| fashion | 10 | 60,000 | 10,000 | 1 | 28x28 | 0.88 |
| flowers | 102* | 5,732 | 2,457 | 3 | 512x512 | 0.55 |
| emotions | 7 | 28,709 | 7,178 | 1 | 48x48 | 0.40 |
| final test dataset | TBA | TBA | TBA | TBA | TBA | TBA |

*classes are imbalanced
We will add the test dataset later by pushing its class definition to the `datasets.py` file. The test dataset will be in the same format as the training datasets, but its `test.csv` will only contain NaNs for the labels.
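As a small illustration of what that looks like (the column names here are hypothetical; check the actual CSV headers), an unlabeled test split parses to an all-NaN label column:

```python
import io

import pandas as pd

# Hypothetical stand-in for the final test split's CSV: the label
# column is empty, so pandas parses every entry as NaN.
csv_text = "image_file_name,label\n000001.jpg,\n000002.jpg,\n"
df = pd.read_csv(io.StringIO(csv_text))

# All labels are missing, as expected for the final test dataset.
assert df["label"].isna().all()
```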
The following command will download the fashion dataset into `./data`, train a dummy AutoML system, and generate predictions for the test split:

```bash
python run.py --dataset fashion --seed 42 --output-path preds-42-fashion.npy
```
You are free to modify these files and command line arguments as you see fit.
The final test predictions should be uploaded in a file named `final_test_preds.npy`, with each entry containing the prediction for the corresponding input, in the exact order of the given `X_test`.
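Before uploading, it may be worth sanity-checking the file. This is only a sketch: the sample count used below is a placeholder, and the integer-dtype check assumes your predictions are class indices.

```python
import numpy as np

# Hypothetical example file; in practice, load your real
# final_test_preds.npy produced by your AutoML system.
preds = np.array([0, 2, 1], dtype=np.int64)
np.save("final_test_preds.npy", preds)

loaded = np.load("final_test_preds.npy")

# One prediction per test input, in the same order as X_test.
n_test_samples = 3  # placeholder: use the real test-split size
assert loaded.ndim == 1
assert loaded.shape[0] == n_test_samples
assert np.issubdtype(loaded.dtype, np.integer)
```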
Upload your poster as a PDF file named `final_poster_vision_<team-name>.pdf`, following the template given here.
- If you need to add dependencies so that you and your teammates are all on the same page, you can add them to the `pyproject.toml` file. This will ensure that everyone has the same dependencies.
- Please feel free to modify the `.gitignore` file to exclude files generated by your experiments, such as models, predictions, etc. Also, be a friendly teammate and ignore your virtual environment and any additional folders/files created by your IDE.