Glyph is a CLI for classifying printed Aramaic letters from RGB images.
The project is built around real crops and canonical exemplars.
- builds RGB dataset splits from exemplar and real letter images
- keeps real crops in r_train and r_val with light preprocessing only
- trains a classifier against a real validation split
- validates checkpoints
- predicts one image
- scans image folders or runs labeled benchmarks
make glyph
./glyph gen
./glyph train
./glyph val --pt artifacts/best_model.pt
Predict one image:
./glyph pred --pt artifacts/best_model.pt --img path/to/image.png
Glyph uses four dataset splits:
- train/: generated RGB training images
- val/: generated RGB validation images
- r_train/: real cropped letters for training support
- r_val/: real cropped letters for primary validation
When r_val/ exists, it is treated as the main validation split.
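The fallback rule above can be sketched in a few lines. This is a minimal illustration, not Glyph's actual code: the function name `pick_validation_split` is hypothetical, and it assumes the splits are plain subdirectories of a dataset root such as dataset/.

```python
from pathlib import Path


def pick_validation_split(dataset_root):
    """Return the directory to validate against (hypothetical helper).

    Assumption: splits are subdirectories of dataset_root. The real
    split r_val/ wins whenever it exists; otherwise the generated
    val/ split is used.
    """
    root = Path(dataset_root)
    real_val = root / "r_val"
    if real_val.is_dir():
        return real_val
    return root / "val"
```

Calling `pick_validation_split("dataset")` would return dataset/r_val once real validation crops are in place, and dataset/val before that.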
Project assets live in source/:
- source/alphabet/: reference glyph images
- source/exemplars/: exemplar letter variants
- source/real/: real cropped letters
- source/textures/: texture backgrounds for RGB synthetic generation
./glyph gen
./glyph train
./glyph val --pt artifacts/best_model.pt
./glyph pred --pt artifacts/best_model.pt --img path/to/image.png
./glyph scan --pt artifacts/best_model.pt path/to/folder
./glyph bench --pt artifacts/best_model.pt --csv labels.csv path/to/folder
./glyph check
./glyph qa
Small dataset run:
./glyph gen --train 50 --val 10
External folder benchmark:
./glyph bench --pt artifacts/best_model.pt --csv labels.csv path/to/folder
- dataset/: generated synthetic and real split data
- artifacts/best_model.pt: best checkpoint
- artifacts/last_model.pt: last checkpoint
- artifacts/val/: validation reports
- artifacts/scans/: folder prediction runs
- artifacts/benchmarks/: benchmark runs
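To make the idea of a labeled benchmark concrete, here is a rough sketch of scoring predictions against a labels file. The layout of labels.csv is not documented here, so the sketch assumes two columns (filename, label) with no header row. The helper names `load_labels` and `accuracy` are hypothetical and are not part of Glyph.

```python
import csv
import io


def load_labels(csv_file):
    """Read filename -> label pairs from an open CSV file.

    Assumption: two columns (filename, label), no header row.
    """
    labels = {}
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        labels[row[0].strip()] = row[1].strip()
    return labels


def accuracy(labels, predictions):
    """Fraction of labeled files predicted correctly.

    A labeled file with no prediction counts as wrong.
    """
    if not labels:
        return 0.0
    correct = sum(
        1 for name, label in labels.items() if predictions.get(name) == label
    )
    return correct / len(labels)


# Illustrative data only; the letter names are made-up labels.
example_csv = io.StringIO("a.png,alef\nb.png,bet\nc.png,gimel\n")
example_labels = load_labels(example_csv)
example_score = accuracy(example_labels, {"a.png": "alef", "b.png": "gimel"})
```

Counting missing predictions as errors keeps the score honest when a folder contains files the model could not read.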
./glyph lint
./glyph syntax
./glyph smoke
./glyph qa
glyph smoke runs the smoke marker only.
glyph qa runs lint, syntax, and the full pytest suite.
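A combined check like this is usually a short sequence of subprocess calls. The sketch below shows one way to chain such steps, assuming the run stops at the first failing step. That stopping behavior, the helper `run_steps`, and the `QA_STEPS` list are illustrative assumptions, not Glyph's implementation.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv) steps in order and stop at the first failure.

    Returns None when every step exits with code 0, otherwise a
    (name, returncode) tuple for the first failing step.
    """
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return name, result.returncode
    return None


# Hypothetical step list mirroring the commands listed above.
QA_STEPS = [
    ("lint", ["./glyph", "lint"]),
    ("syntax", ["./glyph", "syntax"]),
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
]
```

Running `run_steps(QA_STEPS)` from the repository root after make glyph would report which stage failed first, which is often all that is needed in a local pre-commit check.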
For development and local verification, make glyph is the only bootstrap step:
make glyph
python -m pytest -q
python -m core qa
License: MIT