Image Dataset Splitter

📂 A tool for splitting image datasets into training and testing sets for classification tasks. Useful for preparing data for machine learning models.

Features

Splits image datasets into training and testing subsets.
Supports organizing images by classification labels.
Generates a YAML configuration file for dataset management.

Installation

Clone the repository:

git clone https://github.com/Gulciha-n/image-classification-data-split.git
cd image-classification-data-split

Install required packages:
```
pip install pyyaml
```

Usage

Prepare your dataset:

Ensure your source directory is structured as follows:

source/
└── train/
    ├── belirsiz/
    ├── sifir/
    ├── on/
    ├── yirmi/
    ├── otuz/
    ├── kirk/
    ├── elli/
    ├── altmis/
    ├── yetmis/
    ├── seksen/
    ├── doksan/
    └── yuz/

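Here, source is your dataset directory, and each subdirectory under train (belirsiz, sifir, and so on) holds the images for one classification label. The folder names in this example are Turkish: belirsiz means "uncertain", and sifir through yuz are the numbers 0 through 100 in steps of ten.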
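Before running the split, it can help to confirm that the source directory really matches the layout above. The following is a minimal sketch, not part of the project's script: the function name count_images_per_label and the IMAGE_EXTENSIONS set are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_label(source_dir):
    """Return {label: image_count} for each subdirectory of source_dir/train."""
    train_root = Path(source_dir) / "train"
    if not train_root.is_dir():
        raise FileNotFoundError(f"Expected a 'train' folder inside {source_dir}")
    counts = {}
    for label_dir in sorted(p for p in train_root.iterdir() if p.is_dir()):
        # Count only regular files whose extension looks like an image.
        counts[label_dir.name] = sum(
            1
            for f in label_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling count_images_per_label(source_dir) and printing the result makes empty or misspelled label folders easy to spot before any files are copied.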

Update the script with your paths:

Edit the main.py file and update the source_dir and target_dir variables with your directory paths.

source_dir = r"PATH_TO_YOUR_SOURCE_DIRECTORY"  # source directory
target_dir = r"PATH_TO_YOUR_TARGET_DIRECTORY"  # target directory

Run the script:

Use the following command:

python main.py

This will organize and split your image dataset into training and testing sets, and generate a dataset.yaml file in the target directory.

After running the script, the target directory (named train_test_images in this example) will be structured as follows:

train_test_images/
├── train/
│   ├── belirsiz/
│   ├── sifir/
│   ├── on/
│   ├── yirmi/
│   ├── otuz/
│   ├── kirk/
│   ├── elli/
│   ├── altmis/
│   ├── yetmis/
│   ├── seksen/
│   ├── doksan/
│   └── yuz/
└── test/
    ├── belirsiz/
    ├── sifir/
    ├── on/
    ├── yirmi/
    ├── otuz/
    ├── kirk/
    ├── elli/
    ├── altmis/
    ├── yetmis/
    ├── seksen/
    ├── doksan/
    └── yuz/

A dataset.yaml file will also be created in the train_test_images directory.
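The README does not describe how the split itself is performed, so the following is only a sketch of one common approach: shuffle each label's files and copy a fixed fraction into test. The function name split_dataset, the 20% test ratio, the fixed seed, and the choice to copy rather than move files are all assumptions, not the behavior of the project's script.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, target_dir, test_ratio=0.2, seed=42):
    """Copy source_dir/train/<label> images into target_dir/{train,test}/<label>.

    Returns {label: (train_count, test_count)}.
    """
    source_train = Path(source_dir) / "train"
    target_root = Path(target_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    summary = {}
    for label_dir in sorted(p for p in source_train.iterdir() if p.is_dir()):
        files = sorted(f for f in label_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_test = int(len(files) * test_ratio)  # rounds down
        subsets = {"test": files[:n_test], "train": files[n_test:]}
        for subset, subset_files in subsets.items():
            out_dir = target_root / subset / label_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in subset_files:
                shutil.copy2(f, out_dir / f.name)  # originals stay in place
        summary[label_dir.name] = (len(files) - n_test, n_test)
    return summary
```

Splitting per label rather than over the whole pool keeps the class proportions roughly equal in both subsets, which matters when some labels have far fewer images than others.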

Configuration

source_dir: Directory where your image data is located.
target_dir: Directory where the split data will be saved.
labels: List of classification labels used for organizing images.
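As a rough illustration of how these settings could end up in dataset.yaml, the sketch below renders a small YAML document from target_dir and labels. The exact keys the script writes are not documented here, so path, train, test, and names are assumed key names, and render_dataset_yaml is a hypothetical helper. The project installs PyYAML, which would normally handle serialization; this sketch builds the text by hand only to stay dependency-free.

```python
import json


def render_dataset_yaml(target_dir, labels):
    """Return YAML text for a split dataset (key names are assumptions)."""
    # json.dumps emits double-quoted strings, which are also valid YAML
    # scalars. Quoting matters here: YAML 1.1 parsers such as PyYAML read
    # an unquoted label like on as the boolean true.
    lines = [
        f"path: {json.dumps(str(target_dir))}",
        "train: train",
        "test: test",
        "names:",
    ]
    lines += [f"  - {json.dumps(label)}" for label in labels]
    return "\n".join(lines) + "\n"
```

For example, render_dataset_yaml("train_test_images", ["sifir", "on"]) yields a document whose names list contains the quoted strings "sifir" and "on".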

License

This project is licensed under the MIT License.
