A suite of reliability tests for NLP models.
Install the package:
pip install git+https://github.com/Maitreyapatel/reliability-score
Evaluate the example model/data with the default configuration:
# eval on CPU
rs
# eval on GPU
rs trainer=gpu
Evaluate a model with a chosen dataset-specific experiment configuration from reliability_score/configs/experiment/:
rs experiment=<experiment_name>
Specify a custom model_name as shown in the following MNLI example:
# if model_name is used for the tokenizer as well
rs experiment=mnli custom_model="bert-base-uncased-mnli"
# if the tokenizer uses a different model_name
rs experiment=mnli custom_model="bert-base-uncased-mnli" custom_model.tokenizer.model_name="ishan/bert-base-uncased-mnli"
# create a config folder structure similar to reliability_score/configs/
mkdir -p ./configs/custom_model/
# run the following command after creating a new config file at ./configs/custom_model/<your-config>.yaml
rs experiment=mnli custom_model=<your-config>
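As a sketch of the step above, the snippet below creates a minimal custom_model config file via a heredoc. The file name my-model.yaml and the keys model_name and tokenizer.model_name are assumptions inferred from the command-line overrides shown earlier, not the library's documented schema; compare against the reference configs under reliability_score/configs/ before relying on them.

```shell
# create the expected folder layout
mkdir -p ./configs/custom_model

# write a minimal custom model config;
# the keys below are assumptions inferred from the
# `custom_model.tokenizer.model_name` override above
cat > ./configs/custom_model/my-model.yaml <<'EOF'
model_name: "bert-base-uncased-mnli"
tokenizer:
  model_name: "ishan/bert-base-uncased-mnli"
EOF
```

With the file in place, running from the directory that contains ./configs/, the command would be `rs experiment=mnli custom_model=my-model`.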
The locally hosted documentation can be found at: LINK