Command-Line Interface (CLI)

Use the asr-evalkit command to evaluate models directly from the terminal.

Usage

asr-evalkit [arguments]

Discovery

| Argument | Description |
| --- | --- |
| --list-evaluators | List all registered evaluators (e.g., whisper, canary) and exit. |

Model Configuration

| Argument | Provider | Description | Default |
| --- | --- | --- | --- |
| --evaluator NAME | All | Name of the evaluator to run. Required. | - |
| --model ID | All | HuggingFace model ID or local path. Required. | - |
| --device STR | All | Device to load the model on (cuda or cpu). | cuda if available |
| --use-fp16 | HuggingFace | Load model weights in half precision. | False |
| --use-flash-attention | HuggingFace | Enable Flash Attention 2 (model must support it). | False |
| --gpu-memory-utilization FLOAT | vLLM | Fraction of GPU memory vLLM may use. | 0.8 |
| --max-model-len INT | vLLM | Maximum sequence length for the vLLM engine. | Model default |
| --max-new-tokens INT | vLLM | Maximum tokens to generate per sample. | 4096 |
| --dtype STR | vLLM | Weight dtype (bfloat16, float16, float32). | bfloat16 |

Dataset Configuration

| Argument | Description | Default |
| --- | --- | --- |
| --dataset-source STR | Format of the dataset: huggingface or nemo. | huggingface |
| --dataset PATH | HuggingFace dataset ID or path to NeMo manifest. Required. | - |
| --dataset-config STR | Dataset configuration/subset name (e.g. clean). | None |
| --dataset-split STR | Dataset split to evaluate on. | test |
| --audio-column STR | Name of the audio data column/key. | audio |
| --text-column STR | Name of the reference text column/key. | text |
| --streaming | Use HuggingFace streaming mode (highly recommended for large datasets). | False |

Evaluation Parameters

| Argument | Description | Default |
| --- | --- | --- |
| --batch-size INT | Inference batch size. | 8 |
| --max-samples INT | Maximum number of samples to evaluate (truncate dataset). | All |
| --language STR | BCP-47 language code (e.g., en, fr). | Auto-detect |
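
A full invocation combines flags from the tables above. The Python sketch below assembles one as an argument list and prints it as a shell command. It assumes `asr-evalkit` is on `PATH`; the helper name `build_command` and the model and dataset IDs are illustrative placeholders, not values taken from the tool itself.

```python
import shlex


def build_command(evaluator, model, dataset, *, config=None, split="test",
                  batch_size=8, max_samples=None, language=None, streaming=False):
    """Assemble an asr-evalkit argument list from the documented flags.

    Hypothetical helper for illustration; it is not part of asr-evalkit.
    """
    cmd = ["asr-evalkit", "--evaluator", evaluator, "--model", model,
           "--dataset", dataset, "--dataset-split", split,
           "--batch-size", str(batch_size)]
    # Optional flags are appended only when the caller sets them,
    # so the tool's own defaults apply otherwise.
    if config is not None:
        cmd += ["--dataset-config", config]
    if max_samples is not None:
        cmd += ["--max-samples", str(max_samples)]
    if language is not None:
        cmd += ["--language", language]
    if streaming:
        cmd.append("--streaming")
    return cmd


# Placeholder model and dataset IDs; substitute your own.
cmd = build_command("whisper", "openai/whisper-small", "librispeech_asr",
                    config="clean", max_samples=100, language="en",
                    streaming=True)
print(shlex.join(cmd))
```

To run the evaluation from Python rather than copy the printed command, the same list can be passed to `subprocess.run(cmd, check=True)`.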

Normalisation & Output

| Argument | Description | Default |
| --- | --- | --- |
| --lowercase | Lowercase all text before scoring. Use --no-lowercase to disable. | True |
| --normalize-unicode | Apply standard Unicode NFC normalization. Use --no-normalize-unicode to disable. | True |
| --remove-punctuation | Remove all punctuation from predictions and references. Use --no-remove-punctuation to disable. | True |
| --remove-diacritics | Remove diacritical marks/accents. Use --no-remove-diacritics to disable. | False |
| --remove-extra-whitespace | Strip extra spaces and normalize whitespace. Use --no-remove-extra-whitespace to disable. | True |
| --output-file PATH | Path to save JSON results. | None (stdout only) |
| --save-predictions | Include per-sample prediction text strings in the JSON output. Use --no-save-predictions to disable. | True |
| --prediction-format STR | Format for predictions: paired (list of dicts) or parallel (two lists). | paired |
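
To make the normalisation flags concrete, here is a minimal Python sketch of what each step might do to a string, followed by the two prediction layouts. It is an approximation under stated assumptions: the order of the steps, the ASCII-only punctuation set, and the JSON key names (`reference`, `prediction`, `references`, `predictions`) are choices made for this example and may not match what asr-evalkit does internally.

```python
import string
import unicodedata


def normalize(text, *, lowercase=True, normalize_unicode=True,
              remove_punctuation=True, remove_diacritics=False,
              remove_extra_whitespace=True):
    """Apply the documented normalisation steps; defaults mirror the table.

    Assumption: step order and the punctuation set are illustrative only.
    """
    if normalize_unicode:
        text = unicodedata.normalize("NFC", text)
    if lowercase:
        text = text.lower()
    if remove_diacritics:
        # Decompose, drop combining marks, then recompose.
        decomposed = unicodedata.normalize("NFD", text)
        stripped = "".join(c for c in decomposed
                           if not unicodedata.combining(c))
        text = unicodedata.normalize("NFC", stripped)
    if remove_punctuation:
        # string.punctuation covers ASCII punctuation only.
        text = text.translate(str.maketrans("", "", string.punctuation))
    if remove_extra_whitespace:
        text = " ".join(text.split())
    return text


sample = "Hello,   World! Café."
print(normalize(sample))                          # hello world café
print(normalize(sample, remove_diacritics=True))  # hello world cafe

refs = ["hello world", "good morning"]
hyps = ["hello word", "good morning"]

# paired: one dict per sample (hypothetical key names).
paired = [{"reference": r, "prediction": h} for r, h in zip(refs, hyps)]
# parallel: two index-aligned lists (hypothetical key names).
parallel = {"references": refs, "predictions": hyps}
```

The paired layout keeps each reference next to its prediction, which is convenient for inspecting individual errors; the parallel layout is easier to feed straight into metric functions that take two lists.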