Repositories list
17 repositories
JUDGE-BENCH (Public)
RC-analysis (Public)
label-variation-nli (Public)
el_esco (Public)
nnose (Public)
Eevee (Public)
subspace-chronicles (Public)
conllueditor (Public)
SkillSpan (Public)
- Code accompanying the EMNLP 2022 paper "Stop Measuring Calibration When Humans Disagree" in which we show problems with popular calibration metrics like ECE in settings where more than one answer is acceptable, and argue for several metrics that take into account the full human judgement distribution.
spectral-probing (Public)
depprobe (Public)
scientific-re (Public)