Describe the bug
Recent versions of lighteval introduced a breaking change for some tasks, such as Hellaswag and Piqa.
The default behaviour changed from something similar to what lm_eval does (e.g. using loglikelihood_acc as the metric) to using exact_match instead.
Until version 0.9.2, this was not the case.
Furthermore, in version 0.13.0 the prompt functions and utilities needed to reproduce the lm_eval-like behaviour have been removed.
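To make the difference concrete, here is a minimal standalone sketch of the two scoring styles. It does not use lighteval or lm_eval code; the function names, the scores and the generated string are all hypothetical and only illustrate why results are not comparable between the two setups.

```python
def loglikelihood_accuracy(choice_logprobs, gold_index):
    """Score 1 if the gold choice has the highest log-likelihood, else 0."""
    predicted = max(range(len(choice_logprobs)), key=lambda i: choice_logprobs[i])
    return int(predicted == gold_index)


def exact_match(generated_text, gold_text):
    """Score 1 only if the generated string equals the gold string."""
    return int(generated_text.strip() == gold_text.strip())


# Hypothetical per-choice log-likelihoods for one multiple-choice sample.
choice_logprobs = [-12.4, -9.1, -15.0, -11.7]
gold_index = 1

# Log-likelihood style: the model is only asked to rank the given choices.
ll_score = loglikelihood_accuracy(choice_logprobs, gold_index)

# Generative style: the model has to produce the answer text itself.
# Hypothetical generation that is right in spirit but not an exact match.
em_score = exact_match("The answer is B", "B")

print(ll_score, em_score)
```

In this made-up sample the same model is counted as correct under the log-likelihood metric and as wrong under exact match, which is the kind of discrepancy I see between versions.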
To Reproduce
Run lighteval with any model on the hellaswag or piqa tasks.
Expected behavior
It would be nice to be able to run these tasks with a setup similar to the one used in lm_eval, as was possible in previous versions of lighteval.
Version info
lighteval==0.13.0