A Python tool that optimizes and shortens Danbooru tags in image captions to save tokens during AI model training. It merges, filters, and consolidates tags while preserving their meaning.
- Smart Tag Merging - Combines related tags (e.g., "short hair, black hair" → "short black hair")
- Hierarchical Filtering - Removes redundant generic tags when a more specific version exists (e.g., "breasts" is dropped when "large breasts" is present)
- Blacklist Support - Filters out unwanted or irrelevant tags such as "commentary request"
- Animal-Specific Processing - Handles animal character tags with specialized logic (e.g., "animal ears" is dropped when "dog ears" is present)
- Color Optimization - Handles multicolored attributes (e.g., "black hair", "white hair", etc. are removed when "multicolored hair" is present)
- Batch Processing - Process multiple text files at once
- Token Estimation - Reports approximate token savings
- YAML Configuration - Easily customizable rules and settings
- pyyaml
Clone the repository:
git clone https://github.com/seedmanc/token-merging-4-training.git
cd token-merging-4-training
Install dependencies:
pip install -r requirements.txt
Process all text files in a directory:
python main.py "path/to/your/txt/captions"
positional arguments:
  captions_path         Required.

optional arguments:
  -h, --help            show this help message and exit
  --dry                 Don't change the files
  --author AUTHOR       Replace author tag with --class-tokens + " style"
  --class-tokens CLASS_TOKENS
                        Replace --author tag with this. Defaults to --author
                        w/o spaces or (...). Use --class-tokens= to remove
                        author entirely.
  --brief               Reduce console spam
  --verbose
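The interaction between `--author` and `--class-tokens` can be sketched roughly as follows. This is an illustration only, not the tool's actual code: the function name `author_to_style` is hypothetical, and the reading of "w/o spaces or (...)" as "strip any parenthesized part and all whitespace" is an assumption.

```python
import re


def author_to_style(tags, author, class_tokens=None):
    """Swap the author tag for '<class tokens> style' (hypothetical helper)."""
    if class_tokens is None:
        # Assumed default: author name without the "(...)" part and without spaces.
        class_tokens = re.sub(r"\(.*?\)|\s", "", author)
    # Drop the original author tag if it is present.
    result = [t for t in tags if t != author]
    # An empty class token (as with --class-tokens=) removes the author entirely.
    if class_tokens:
        result.append(f"{class_tokens} style")
    return result


tags = ["1girl", "poporu (hukuroneko)", "smile"]
print(author_to_style(tags, "poporu (hukuroneko)", "hukuro"))
```

Under these assumptions, the style token is appended even when the author tag was not found in the caption.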
Edit the YAML dictionaries as you see fit. replace.yaml works as follows: a key is replaced by one of the values listed under it, but only if that value is itself present in the tags. For example, "adjusting eyewear, glasses" becomes "adjusting glasses, glasses", with the duplicate removed later in processing. Expand the animals and colors dictionaries to extend the special processing of those categories: avoiding merged entries such as "animal dog ears", dropping animal features when the matching animal girl tag is already present, and removing individual colors when a multicolored tag is present.
python main.py C:\Users\USERNAME\Downloads\hukuro --author="poporu (hukuroneko)" --class-tokens=hukuro
FILE: __yak_kemono_friends_and_1_more_drawn_by_poporu_hukuroneko__da95e66e2af395a6c9c35f2eb732626f.txt
- commentary request
poporu (hukuroneko) => hukuro style b/c --class-tokens
bow => bowtie
brown bow => brown bowtie
- ribbon b/c brown ribbon
- shirt b/c yellow shirt
- bowtie b/c brown bowtie
- horns b/c black horns
- breasts b/c large breasts
- animal ears b/c cow ears
- cow ears,cow horns b/c cow girl
- black horns,grey horns b/c multicolored
- white hair
- long hair
+ long white hair
- kemono friends 3,kemono friends b/c yak (kemono friends)
Saved ~43 tokens or 41%
['1girl', 'blush', 'brown bowtie', 'brown eyes', 'brown ribbon', 'cow girl', 'dress', 'extra ears', 'gloves', 'hair over one eye', 'highres', 'hukuro style', 'large breasts', 'long white hair', 'multicolored horns', 'short sleeves', 'smile', 'solo', 'twintails', 'yak (kemono friends)', 'yellow shirt']
The tool applies transformations in this order:
- Filtering - Remove blacklisted tags and clip series information
- Hierarchy - Remove redundant generic tags when specific ones exist
- Animal and color processing - Handle specific tag logic
- Merging - Combine tags with the same noun but different adjectives
- Artist conversion - Turn the author tag into a ready-to-use "style" class token for style LoRA training, or add the class token if no author tag is present.
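The hierarchy and merging steps above might look roughly like the sketch below. It is a simplified illustration, not the tool's implementation: the function names `drop_generic` and `merge_same_noun` are hypothetical, and matching tags by their trailing word is an assumption.

```python
def drop_generic(tags):
    """Drop a tag when another tag ends with it as whole words,
    e.g. 'breasts' is dropped because 'large breasts' exists."""
    result = []
    for tag in tags:
        is_generic = any(
            other != tag and other.endswith(" " + tag) for other in tags
        )
        if not is_generic:
            result.append(tag)
    return result


def merge_same_noun(tags, noun):
    """Merge all '<adjective> <noun>' tags into a single tag (assumed rule)."""
    suffix = " " + noun
    adjectives = [t[: -len(suffix)] for t in tags if t.endswith(suffix)]
    if len(adjectives) < 2:
        return list(tags)  # nothing to merge
    merged = " ".join(adjectives) + suffix
    kept = [t for t in tags if not t.endswith(suffix)]
    return kept + [merged]


tags = ["breasts", "large breasts", "long hair", "white hair", "smile"]
tags = drop_generic(tags)
tags = merge_same_noun(tags, "hair")
print(tags)
```

The adjectives keep the order in which they appear in the caption; the real tool may order them differently.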
The tool uses YAML configuration files in the config/ directory:
- virtual youtuber
- looking at viewer
- multiple girls
- commentary request
# ... more blacklisted tags

These are removed unconditionally.
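A minimal sketch of unconditional blacklist filtering, assuming the blacklist has already been loaded into a Python list (the function name `apply_blacklist` is hypothetical):

```python
def apply_blacklist(tags, blacklist):
    """Return the tags that are not on the blacklist, preserving order."""
    blocked = set(blacklist)  # set lookup keeps the filter fast for long lists
    return [t for t in tags if t not in blocked]


blacklist = ["virtual youtuber", "looking at viewer", "commentary request"]
print(apply_blacklist(["1girl", "commentary request", "smile"], blacklist))
```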
- red
- blue
- green
# ... more colors

Removes colored tags if a multicolored or two-tone tag for the same subject is present.
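The color rule could be sketched as below. This is an assumption-laden illustration: the function name `drop_colors_if_multicolored` is hypothetical, and treating the first word of a tag as the color and the rest as the subject is a simplification.

```python
MULTI_PREFIXES = ("multicolored", "two-tone")


def drop_colors_if_multicolored(tags, colors):
    """Drop '<color> <subject>' when 'multicolored <subject>' or
    'two-tone <subject>' is also present."""
    # Collect subjects that carry a multicolored/two-tone tag.
    subjects = set()
    for tag in tags:
        prefix, _, subject = tag.partition(" ")
        if prefix in MULTI_PREFIXES and subject:
            subjects.add(subject)
    # Keep every tag except plain colors of those subjects.
    result = []
    for tag in tags:
        color, _, subject = tag.partition(" ")
        if color in colors and subject in subjects:
            continue
        result.append(tag)
    return result


tags = ["black horns", "grey horns", "multicolored horns", "brown eyes"]
print(drop_colors_if_multicolored(tags, {"black", "grey", "brown"}))
```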
- cat
- dog
- wolf
- tiger
# ... more animals

Removes the literal "animal" part tag (e.g., "animal ears") if a specific animal's part is present. Removes a specific animal's parts if that animal girl is present.
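Both animal rules can be illustrated with a short sketch. The function name `simplify_animal_tags` is hypothetical, and the second step assumes that every tag starting with the animal's name describes one of its features, which a real implementation would likely restrict further.

```python
def simplify_animal_tags(tags, animals):
    """Apply the two animal rules described above (simplified sketch)."""
    result = list(tags)
    # Rule 1: drop 'animal <part>' when '<animal> <part>' exists.
    for tag in tags:
        prefix, _, part = tag.partition(" ")
        if prefix == "animal" and any(f"{a} {part}" in tags for a in animals):
            result.remove(tag)
    # Rule 2: drop '<animal> <part>' when '<animal> girl' exists.
    for animal in animals:
        girl = f"{animal} girl"
        if girl in result:
            result = [
                t for t in result
                if t == girl or not t.startswith(animal + " ")
            ]
    return result


tags = ["animal ears", "cow ears", "cow horns", "cow girl", "smile"]
print(simplify_animal_tags(tags, ["cow", "cat"]))
```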
eyewear:
- glasses
- goggles
- sunglasses
- monocle
# ... more generic tags paired with a list of concrete ones

Replaces the generic tag with one of the concrete ones if that concrete tag is also present. If the value is a string instead of a list, as in "one eye closed: wink", the replacement is unconditional.
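A rough sketch of how such replacement rules might be applied, assuming the YAML has been loaded into a dict whose values are either lists or strings. The function name `apply_replacements` and the whole-word regex matching are assumptions, not the tool's confirmed behavior.

```python
import re


def apply_replacements(tags, rules):
    """Replace generic words inside tags according to the rules dict."""
    present = set(tags)
    result = []
    for tag in tags:
        for generic, concrete in rules.items():
            if isinstance(concrete, str):
                replacement = concrete  # string value: unconditional
            else:
                # list value: use the first concrete tag found in the caption
                replacement = next((c for c in concrete if c in present), None)
            if replacement is None:
                continue
            pattern = rf"\b{re.escape(generic)}\b"
            tag = re.sub(pattern, lambda _m: replacement, tag)
        if tag not in result:  # skip exact duplicates created by replacing
            result.append(tag)
    return result


rules = {"eyewear": ["glasses", "goggles"], "one eye closed": "wink"}
print(apply_replacements(["adjusting eyewear", "glasses", "one eye closed"], rules))
```

When none of the concrete tags is present, the generic tag is left unchanged.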
- Issues: GitHub Issues
For questions about usage or contributions, please open an issue on GitHub.
- Inspired by the need to optimize token usage in AI LoRA training
- Built for the *booru tagging community