Command-Line Interface (CLI)

Use the `asr-evalkit` command to evaluate models directly from the terminal.

Usage

asr-evalkit [arguments]

Argument	Description
`--list-evaluators`	List all registered evaluators (e.g., `whisper`, `canary`) and exit.

Argument	Provider	Description	Default
`--evaluator NAME`	All	Name of the evaluator to run. Required.	-
`--model ID`	All	HuggingFace model ID or local path. Required.	-
`--device STR`	All	Device to load the model on (`cuda` or `cpu`).	`cuda` if available
`--use-fp16`	HuggingFace	Load model weights in half precision.	False
`--use-flash-attention`	HuggingFace	Enable Flash Attention 2 (model must support it).	False
`--gpu-memory-utilization FLOAT`	vLLM	Fraction of GPU memory vLLM may use.	`0.8`
`--max-model-len INT`	vLLM	Maximum sequence length for the vLLM engine.	Model default
`--max-new-tokens INT`	vLLM	Maximum tokens to generate per sample.	`4096`
`--dtype STR`	vLLM	Weight dtype (`bfloat16`, `float16`, `float32`).	`bfloat16`

Argument	Description	Default
`--dataset-source STR`	Format of the dataset: `huggingface` or `nemo`.	`huggingface`
`--dataset PATH`	HuggingFace dataset ID or path to NeMo manifest. Required.	-
`--dataset-config STR`	Dataset configuration/subset name (e.g., `clean`).	None
`--dataset-split STR`	Dataset split to evaluate on.	`test`
`--audio-column STR`	Name of the audio data column/key.	`audio`
`--text-column STR`	Name of the reference text column/key.	`text`
`--streaming`	Use HuggingFace streaming mode (highly recommended for large datasets).	False

Argument	Description	Default
`--batch-size INT`	Inference batch size.	`8`
`--max-samples INT`	Maximum number of samples to evaluate; the dataset is truncated to this size.	All
`--language STR`	BCP-47 language code (e.g., `en`, `fr`).	Auto-detect
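
The flags listed so far are enough to assemble a complete run. The sketch below builds an argument list from Python, which can be handy when scripting several evaluations. It uses only flags documented above; the helper name `build_command`, the model ID `openai/whisper-small`, the dataset ID `librispeech_asr`, and the output path are illustrative assumptions, not values the tool prescribes.

```python
import shlex


def build_command(evaluator, model, dataset, *, dataset_config=None,
                  dataset_split="test", batch_size=8, max_samples=None,
                  language=None, streaming=False, output_file=None):
    """Assemble an asr-evalkit argument list (hypothetical helper)."""
    cmd = [
        "asr-evalkit",
        "--evaluator", evaluator,          # required
        "--model", model,                  # required
        "--dataset", dataset,              # required
        "--dataset-split", dataset_split,
        "--batch-size", str(batch_size),
    ]
    # Optional flags are appended only when a value was supplied.
    if dataset_config is not None:
        cmd += ["--dataset-config", dataset_config]
    if max_samples is not None:
        cmd += ["--max-samples", str(max_samples)]
    if language is not None:
        cmd += ["--language", language]
    if streaming:
        cmd.append("--streaming")
    if output_file is not None:
        cmd += ["--output-file", output_file]
    return cmd


# Illustrative values only; substitute your own evaluator, model, and dataset.
cmd = build_command(
    "whisper",
    "openai/whisper-small",
    "librispeech_asr",
    dataset_config="clean",
    max_samples=100,
    language="en",
    output_file="results.json",
)

# Print a shell-safe version of the command for copy/paste.
print(shlex.join(cmd))
```

To execute the command rather than print it, pass the list to `subprocess.run(cmd, check=True)`. Building a list instead of a single string avoids shell-quoting problems with paths that contain spaces.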

Argument	Description	Default
`--lowercase`	Lowercase all text before scoring. Use `--no-lowercase` to disable.	True
`--normalize-unicode`	Apply standard Unicode NFC normalization. Use `--no-normalize-unicode` to disable.	True
`--remove-punctuation`	Remove all punctuation from predictions and references. Use `--no-remove-punctuation` to disable.	True
`--remove-diacritics`	Remove diacritical marks/accents. Use `--no-remove-diacritics` to disable.	False
`--remove-extra-whitespace`	Strip extra spaces and normalize whitespace. Use `--no-remove-extra-whitespace` to disable.	True
`--output-file PATH`	Path to save JSON results.	None (stdout only)
`--save-predictions`	Include per-sample prediction strings in the JSON output. Use `--no-save-predictions` to disable.	True
`--prediction-format STR`	Format for predictions: `paired` (list of dicts) or `parallel` (two lists).	`paired`
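
To make the effect of the text-normalization flags concrete, here is a minimal Python sketch of what each option could do to a string before scoring. It is an approximation under stated assumptions: the function name `normalize_text`, the order of the steps, and the definition of "punctuation" (Unicode category `P*`) are all guesses, and the tool's actual implementation may differ.

```python
import unicodedata


def normalize_text(text, *, lowercase=True, normalize_unicode=True,
                   remove_punctuation=True, remove_diacritics=False,
                   remove_extra_whitespace=True):
    """Approximate the documented normalization flags (hypothetical helper).

    Keyword defaults mirror the defaults listed in the table above.
    The step order is an assumption.
    """
    if normalize_unicode:
        # Compose characters into their canonical NFC form.
        text = unicodedata.normalize("NFC", text)
    if lowercase:
        text = text.lower()
    if remove_diacritics:
        # Decompose, drop combining marks, then recompose.
        decomposed = unicodedata.normalize("NFD", text)
        stripped = "".join(ch for ch in decomposed
                           if not unicodedata.combining(ch))
        text = unicodedata.normalize("NFC", stripped)
    if remove_punctuation:
        # Assumption: "punctuation" means Unicode general category P*.
        # Symbols such as "$" (category Sc) are left in place.
        text = "".join(ch for ch in text
                       if not unicodedata.category(ch).startswith("P"))
    if remove_extra_whitespace:
        # Collapse whitespace runs and trim both ends.
        text = " ".join(text.split())
    return text


sample = "  Héllo,   World! "
print(normalize_text(sample))                          # default flags
print(normalize_text(sample, remove_diacritics=True))  # --remove-diacritics
print(normalize_text(sample, lowercase=False))         # --no-lowercase
```

With the defaults, the sample becomes `héllo world`; adding diacritic removal yields `hello world`. Applying the same normalization to both predictions and references keeps formatting differences from being counted as recognition errors.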