Skip to content

automl-classroom/automl-exam-ss24-vision-freiburg-template

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AutoML Exam - SS24 (Vision Data)

This repo serves as a template for the exam assignment of the AutoML SS24 course at the university of Freiburg.

The aim of this repo is to provide a minimal installable template to help you get up and running.

Installation

To install the repository, first create an environment of your choice and activate it.

For example, using venv:

You can change the python version here to the version you prefer.

Virtual Environment

python3 -m venv automl-vision-env
source automl-vision-env/bin/activate

Conda Environment

Can also use conda, left to individual preference.

conda create -n automl-vision-env python=3.11
conda activate automl-vision-env

Then install the repository by running the following command:

pip install -e .

You can test that the installation was successful by running the following command:

python -c "import automl"

We make no restrictions on the python library or version you use, but we recommend using python 3.8 or higher.

Code

We provide the following:

  • run.py: A script that trains an AutoML-System on the training split dataset_train of a given dataset and then generates predictions for the test split dataset_test, saving those predictions to a file. For the training datasets, the test splits will contain the ground truth labels, but for the test dataset which we provide later the labels of the test split will not be available. You will be expected to generate these labels yourself and submit them to us through GitHub classrooms.

  • src/automl: This is a python package that will be installed above and contain your source code for whatever system you would like to build. We have provided a dummy AutoML class to serve as an example.

You are completely free to modify, install new libraries, make changes and in general do whatever you want with the code. The only requirement for the exam will be that you can generate predictions for the test splits of our datasets in a .npy file that we can then use to give you a test score through GitHub classrooms.

Data

We selected three different vision datasets which you can use to develop your AutoML system and we will provide you with a test dataset to evaluate your system at a later point in time. The datasets can be automatically downloaded by the respective dataset classes in ./src/automl/datasets.py. The datasets are: fashion, flowers, and emotions.

If there are any problems downloading the datasets, you can download them manually from https://ml.informatik.uni-freiburg.de/research-artifacts/automl-exam-24-vision/ and place them in the /data folder after unzipping them.

The downloaded datasets will have the following structure:

./data
├── fashion
│   ├── images_test
│   │   ├── 000001.jpg
│   │   ├── 000002.jpg
│   │   ├── 000003.jpg
│   │   ...
│   ├── images_train
│   │   ├── 000001.jpg
│   │   ├── 000002.jpg
│   │   ├── 000003.jpg
│   │   ...
│   ├── description.md
│   ├── fashion.tgz
│   ├── test.csv
│   └── train.csv
├── emotions
    ...
...

Feel free to explore the images and the description.md files to get a better understanding of the datasets. The following table will provide you an overview of their characteristics and also a reference value for the accuracy that a naive AutoML system could achieve on these datasets:

Dataset name # Classes # Train samples # Test samples # Channels Resolution Reference Accuracy
fashion 10 60,000 10,000 1 28x28 0.88
flowers 102* 5732 2,457 3 512x512 0.55
emotions 7 28709 7,178 1 48x48 0.40
final test dataset TBA TBA TBA TBA TBA TBA

*classes are imbalanced

We will add the test dataset later by pushing its class definition to the datasets.py file. The test dataset will be in the same format as the training datasets, but test.csv will only contain nan's for labels.

Running an initial test

This will download the fashion dataset into ./data, train a dummy AutoML system and generate predictions for the test split:

python run.py --dataset fashion --seed 42 --output-path preds-42-fashion.npy

You are free to modify these files and command line arguments as you see fit.

Final submission

The final test predictions should be uploaded in a file final_test_preds.npy, with each line containing the predictions for the input in the exact order of X_test given.

Upload your poster as a PDF file named as final_poster_vision_<team-name>.pdf, following the template given here.

Tips

  • If you need to add dependencies that you and your teammates are all on the same page, you can modify the pyproject.toml file and add the dependencies there. This will ensure that everyone has the same dependencies

  • Please feel free to modify the .gitignore file to exclude files generated by your experiments, such as models, predictions, etc. Also, be friendly teammate and ignore your virtual environment and any additional folders/files created by your IDE.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •  

Languages