A small, friendly open-source detector for AI-generated images. It is designed in the
yt-dlp / rembg spirit: install it, run one command, get a probability and a
reproducible report.
AI image detection is probabilistic. Treat the output as one signal, not as proof.
The default backend is UnivFD / UniversalFakeDetect: CLIP ViT-L/14 image features plus a tiny linear fake/real head. This is a strong practical default because the task-specific weight is tiny, the code path is understandable, and the CVPR 2023 paper showed good cross-generator generalization compared with older GAN-trained detectors.
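To make "CLIP image features plus a tiny linear head" concrete, here is a minimal sketch of the head alone: a dot product, a bias, and a sigmoid. The 4-dimensional vector and all numbers are made up for illustration (real CLIP ViT-L/14 embeddings are much larger), and `linear_head_probability` is a hypothetical helper, not a function from this repo.

```python
import math
from typing import Sequence


def linear_head_probability(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Map an image embedding to a fake probability: sigmoid(w . x + b)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    logit = sum(f * w for f, w in zip(features, weights)) + bias
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Hypothetical toy embedding and head parameters (illustration only).
toy_features = [0.5, -1.0, 0.25, 2.0]
toy_weights = [1.0, 0.5, -2.0, 0.25]
toy_bias = -0.5

probability = linear_head_probability(toy_features, toy_weights, toy_bias)
print(f"fake probability: {probability:.3f}")
```

The head has only one weight per feature plus one bias, which is why the task-specific checkpoint is so small compared with the CLIP encoder.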
This repo also ships a hybrid backend that blends UnivFD with a lightweight Hugging Face image classifier. It is useful when you want a stronger practical ensemble without training a new detector from scratch.
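The exact blending rule is not spelled out here. One simple possibility, consistent with labels such as "UnivFD 0.8 + HF 0.2" in the benchmark tables, is a weighted average of the two probabilities. The sketch below assumes that rule; `blend_probabilities` is a hypothetical name, not the repo's API.

```python
def blend_probabilities(
    p_univfd: float, p_hf: float, univfd_weight: float = 0.8
) -> float:
    """Weighted average of two fake probabilities (assumed blending rule)."""
    if not 0.0 <= univfd_weight <= 1.0:
        raise ValueError("univfd_weight must be in [0, 1]")
    for p in (p_univfd, p_hf):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be in [0, 1]")
    return univfd_weight * p_univfd + (1.0 - univfd_weight) * p_hf


# Hypothetical scores for one image from the two backends.
blended = blend_probabilities(0.9, 0.4, univfd_weight=0.8)
print(f"blended fake probability: {blended:.2f}")
```

With a weight of 1.0 this reduces to pure UnivFD, and with 0.0 to the Hugging Face classifier alone, so the weight can be tuned on a calibration split like any other hyperparameter.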
Recent research has moved further. AIDE combines CLIP semantics with low-level frequency/noise features and reports gains on GenImage and AIGCDetectBenchmark. That is a good research target for a future backend, but UnivFD is currently the simplest robust default for an installable open-source tool.
Useful references:
- UniversalFakeDetect paper: https://openaccess.thecvf.com/content/CVPR2023/html/Ojha_Towards_Universal_Fake_Image_Detectors_That_Generalize_Across_Generative_Models_CVPR_2023_paper.html
- UniversalFakeDetect code: https://github.com/WisconsinAIVision/UniversalFakeDetect
- AIDE paper: https://arxiv.org/abs/2406.19435
- GenImage benchmark: https://github.com/GenImage-Dataset/GenImage
- Tiny-GenImage runnable subset: https://huggingface.co/datasets/TheKernel01/Tiny-GenImage
- CIFAKE benchmark paper: https://huggingface.co/papers/2303.14126
Use Python 3.10+.

    python -m venv .venv
    source .venv/bin/activate
    pip install -e .

Optional extras:

    pip install -e '.[eval]'  # Hugging Face dataset benchmarks
    pip install -e '.[hf]'    # generic Hugging Face image-classification backend
    pip install -e '.[api]'   # FastAPI server
    pip install -e '.[web]'   # Gradio UI
    pip install -e '.[dev]'   # tests and linting

Detect one image:

    aidetect detect image.jpg

Detect a folder recursively:

    aidetect detect ./images --csv report.csv

JSON lines output:

    aidetect detect ./images --json

Use a Hugging Face image-classification model instead of UnivFD:

    aidetect detect image.jpg --backend hf --hf-model capcheck/ai-image-detection

Use the hybrid backend:

    aidetect detect image.jpg --backend hybrid --hybrid-univfd-weight 0.8

Use the Python API:

    from aidetector import create_detector
    detector = create_detector("univfd", device="auto")
    result = detector.predict_path("image.jpg")
    print(result.as_dict())

Launch the Gradio web UI:

    pip install -e '.[web]'
    aidetect serve

Run the FastAPI server:

    pip install -e '.[api]'
    aidetect api --host 127.0.0.1 --port 8000

Then call:

    curl -F "file=@image.jpg" http://127.0.0.1:8000/detect

Evaluate a GenImage-style folder where nature/ contains real images and ai/ contains generated images:

    aidetect benchmark-folder /path/to/GenImage/Midjourney/val \
      --real-dir nature \
      --fake-dir ai \
      --output benchmarks/midjourney-val.json

Evaluate a Hugging Face dataset such as Tiny-GenImage:

    pip install -e '.[eval]'
    aidetect benchmark-hf TheKernel01/Tiny-GenImage \
      --split validation \
      --image-field image \
      --label-field label \
      --fake-label 1 \
      --max-samples 200 \
      --output benchmarks/tiny-genimage-univfd-200.json

The JSON report includes accuracy, balanced accuracy, precision, recall, F1, ROC AUC, confusion counts, a diagnostic threshold sweep, model metadata, dataset metadata, and per-image predictions.
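As a reference for reading the report, the thresholded metrics all derive from the four confusion counts. The sketch below uses the standard definitions with "fake" as the positive class; the counts are hypothetical, the names are illustrative rather than the report's actual JSON keys, and returning 0.0 on an empty denominator is an assumed convention.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # fake images flagged as fake
    fp: int  # real images flagged as fake
    tn: int  # real images passed as real
    fn: int  # fake images passed as real


def _ratio(numerator: float, denominator: float) -> float:
    # Assumed convention: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def summarize(c: ConfusionCounts) -> Dict[str, float]:
    recall = _ratio(c.tp, c.tp + c.fn)       # true positive rate
    specificity = _ratio(c.tn, c.tn + c.fp)  # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "balanced_accuracy": (recall + specificity) / 2.0,
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Hypothetical counts for a 50 real + 50 fake evaluation.
metrics = summarize(ConfusionCounts(tp=30, fp=5, tn=45, fn=20))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this convention, a detector that never flags anything as fake gets recall, precision, and F1 of 0.0 while accuracy on a balanced set stays at 0.5.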
For more defensible evaluation, calibrate a threshold on one split and evaluate on another:

    aidetect benchmark-calibrated-folder /path/to/exported-folder \
      --backend univfd \
      --output benchmarks/univfd-calibrated.json

For multi-shard Tiny-GenImage evaluation with per-generator slices:

    aidetect benchmark-tiny-genimage-local \
      /path/to/validation-00000-of-00004.parquet \
      /path/to/validation-00001-of-00004.parquet \
      /path/to/validation-00002-of-00004.parquet \
      /path/to/validation-00003-of-00004.parquet \
      --backend hybrid \
      --optimize-metric f1 \
      --max-per-class-per-shard 100 \
      --output benchmarks/tiny-genimage-hybrid-multishard-800-f1.json

If Hugging Face dataset metadata requests are flaky, you can work from a local Tiny-GenImage parquet shard:

    aidetect prepare-tiny-genimage .cache/tiny-genimage-validation-200 \
      --local-parquet /path/to/validation-00000-of-00004.parquet \
      --max-per-class 100
    aidetect benchmark-calibrated-folder .cache/tiny-genimage-validation-200 \
      --backend univfd \
      --real-dir real \
      --fake-dir ai \
      --output benchmarks/tiny-genimage-univfd-calibrated-200.json

Current local benchmark evidence is split into two levels.
Smoke benchmark on Tiny-GenImage validation shard
data/validation-00000-of-00004.parquet, 20 real + 20 fake images:
| Backend | Threshold | Accuracy | Balanced Acc | F1 | ROC AUC | Images/s |
|---|---|---|---|---|---|---|
| UnivFD / CLIP ViT-L/14 | 0.5 | 0.500 | 0.500 | 0.000 | 0.715 | 2.31 |
| capcheck/ai-image-detection | 0.5 | 0.600 | 0.600 | 0.692 | 0.743 | 32.03 |
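A row with accuracy 0.500 and F1 0.000 alongside a ROC AUC well above 0.5 can arise when the scores rank fakes above reals reasonably well but almost never cross the 0.5 threshold. The sketch below illustrates that effect on made-up scores, computing ROC AUC as the fraction of (fake, real) pairs in which the fake scores higher. The helper names and data are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random fake outscores a random real (ties count half)."""
    fake_scores = [s for y, s in zip(labels, scores) if y == 1]
    real_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not fake_scores or not real_scores:
        raise ValueError("need at least one real and one fake example")
    wins = 0.0
    for f in fake_scores:
        for r in real_scores:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake_scores) * len(real_scores))


def accuracy_at(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """Accuracy when an image is called fake if score >= threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical scores: fakes tend to rank higher, but nothing reaches 0.5.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [0.05, 0.10, 0.30, 0.20, 0.35, 0.40]

auc = roc_auc(toy_labels, toy_scores)
acc = accuracy_at(toy_labels, toy_scores, threshold=0.5)
print(f"ROC AUC: {auc:.3f}, accuracy at 0.5: {acc:.3f}")
```

In a case like this the ranking carries usable signal and the fixed 0.5 cut-off is the weak point, which is the motivation for calibrating the threshold.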
Calibrated hold-out benchmark on the same shard family, exported as 100 real + 100 fake images and split deterministically into calibration/test sets:
| Backend | Calibration | Test Accuracy | Test Balanced Acc | Test F1 | Test ROC AUC |
|---|---|---|---|---|---|
| UnivFD / CLIP ViT-L/14 | threshold-only | 0.760 | 0.760 | 0.721 | 0.811 |
| Hybrid (UnivFD 0.8 + HF 0.2) | threshold + blend weight | 0.670 | 0.670 | 0.629 | 0.752 |
| capcheck/ai-image-detection | threshold-only | 0.580 | 0.580 | 0.580 | 0.610 |
Interpretation:
- The 40-image run is only a smoke test.
- The 200-image calibrated split is a stronger local benchmark because threshold selection happens on a separate calibration split before the test split is scored.
- It is still not a publication-grade claim. It is one shard, one deterministic split, and one local environment.
- These calibrated runs were executed on CPU in this workspace.
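The calibrate-then-test procedure described above can be sketched as follows: pick the threshold that maximizes F1 on the calibration split, then report F1 on the test split at that frozen threshold. This is an illustration under assumptions, not the repo's implementation; the candidate thresholds, tie-breaking, helper names, and toy scores are all hypothetical.

```python
from typing import Sequence


def f1_at(labels: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """F1 for the fake class when score >= threshold means 'fake'."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def calibrate_threshold(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the observed score that maximizes F1 (lowest one wins ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(labels, scores, t))


# Hypothetical calibration and test splits (disjoint images).
cal_labels = [0, 0, 0, 1, 1, 1]
cal_scores = [0.10, 0.20, 0.35, 0.30, 0.40, 0.45]
test_labels = [0, 0, 1, 1]
test_scores = [0.15, 0.32, 0.28, 0.50]

threshold = calibrate_threshold(cal_labels, cal_scores)
test_f1 = f1_at(test_labels, test_scores, threshold)
print(f"calibrated threshold: {threshold:.2f}, held-out F1: {test_f1:.3f}")
```

The test split never influences the threshold, so the held-out score is not inflated by the selection step. Swapping `f1_at` for a balanced-accuracy function would change the objective in the same way the `--optimize-metric` option is described as doing.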
Current strongest local benchmark, calibrated on 4 Tiny-GenImage validation shards with up to 100 real + 100 fake images sampled per shard:
| Backend | Test N | Test Accuracy | Test Balanced Acc | Precision | Recall | Test F1 | Test ROC AUC |
|---|---|---|---|---|---|---|---|
| Hybrid (UnivFD 0.85 + HF 0.15), optimize=f1 | 400 | 0.773 | 0.773 | 0.779 | 0.760 | 0.770 | 0.843 |
| Hybrid (UnivFD 0.85 + HF 0.15), optimize=balanced_accuracy | 400 | 0.745 | 0.745 | 0.802 | 0.650 | 0.718 | 0.843 |
| UnivFD / CLIP ViT-L/14 | 300 | 0.690 | 0.690 | 0.806 | 0.500 | 0.617 | 0.784 |
Selected generator-vs-real slices from that same held-out split:
| Generator | N | Accuracy | Balanced Acc | F1 | ROC AUC |
|---|---|---|---|---|---|
| BigGAN vs Real | 231 | 0.810 | 0.876 | 0.577 | 0.962 |
| ADM vs Real | 232 | 0.810 | 0.877 | 0.585 | 0.974 |
| VQDM vs Real | 224 | 0.804 | 0.872 | 0.511 | 0.941 |
| GLIDE vs Real | 227 | 0.775 | 0.744 | 0.427 | 0.805 |
| Wukong vs Real | 228 | 0.776 | 0.750 | 0.440 | 0.815 |
| SD15 vs Real | 228 | 0.768 | 0.714 | 0.404 | 0.763 |
| Midjourney vs Real | 230 | 0.730 | 0.576 | 0.262 | 0.638 |
This is the honest picture: switching the calibration objective to f1 gives us
the strongest thresholded result so far, with a materially better precision /
recall balance than pure UnivFD. It also lifts weaker generators such as
Midjourney and SD15, though they remain much harder than ADM, BigGAN, or VQDM.
This is still not a universal detector guarantee.
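One way to build generator-vs-real slices like those in the table is to pair every held-out real image with the fakes from a single generator. That construction is an assumption here (it would be consistent with slice sizes of roughly 230 images), and the record layout and helper names below are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]  # assumed keys: "label" (1 = fake), "generator", "score"


def generator_slices(records: List[Record]) -> Dict[str, List[Record]]:
    """Group records into '<generator> vs Real' slices sharing the same reals."""
    real = [r for r in records if r["label"] == 0]
    fakes_by_generator: Dict[str, List[Record]] = defaultdict(list)
    for r in records:
        if r["label"] == 1:
            fakes_by_generator[r["generator"]].append(r)
    return {
        f"{name} vs Real": real + fakes
        for name, fakes in sorted(fakes_by_generator.items())
    }


def balanced_accuracy(records: List[Record], threshold: float = 0.5) -> float:
    """Mean of true positive rate and true negative rate at the threshold."""
    real = [r for r in records if r["label"] == 0]
    fake = [r for r in records if r["label"] == 1]
    tnr = sum(r["score"] < threshold for r in real) / len(real)
    tpr = sum(r["score"] >= threshold for r in fake) / len(fake)
    return (tpr + tnr) / 2.0


# Hypothetical per-image predictions.
toy_records = [
    {"label": 0, "generator": None, "score": 0.1},
    {"label": 0, "generator": None, "score": 0.2},
    {"label": 0, "generator": None, "score": 0.3},
    {"label": 0, "generator": None, "score": 0.6},
    {"label": 1, "generator": "ADM", "score": 0.9},
    {"label": 1, "generator": "ADM", "score": 0.8},
    {"label": 1, "generator": "Midjourney", "score": 0.4},
    {"label": 1, "generator": "Midjourney", "score": 0.7},
]

slices = generator_slices(toy_records)
for name, subset in slices.items():
    print(f"{name}: N={len(subset)}, balanced acc={balanced_accuracy(subset):.3f}")
```

Because each slice built this way contains far more real than fake images, accuracy and F1 can diverge sharply, so balanced accuracy and ROC AUC are usually the more informative columns to compare across generators.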
On first use, the UnivFD backend downloads:
- CLIP ViT-L/14 OpenAI weights through open_clip_torch
- UniversalFakeDetect linear head from siddharthksah/deepsafe-weights/universalfakedetect/fc_weights.pth
You can also pass a local head checkpoint:

    aidetect detect image.jpg --weight-path ./fc_weights.pth

For development, install the extras, then run the tests and the linter:

    pip install -e '.[dev,eval,hf,api]'
    pytest
    ruff check .

Known limitations:

- No detector is universal. New generators, heavy recompression, screenshots, crops, edits, upscaling, and adversarial post-processing can change results.
- Benchmarks can overstate real-world reliability if the deployment data differs from the benchmark distribution.
- The tool currently detects whole-image synthetic likelihood. It does not localize edited regions.
If this helps your work, cite the original UnivFD paper:

    @InProceedings{Ojha_2023_CVPR,
        author    = {Ojha, Utkarsh and Li, Yuheng and Lee, Yong Jae},
        title     = {Towards Universal Fake Image Detectors That Generalize Across Generative Models},
        booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
        month     = {June},
        year      = {2023},
        pages     = {24480-24489}
    }