[BUG] Breaking change for Hellaswag/Piqa #1090

@Giuseppe5

Describe the bug

Recent versions of lighteval introduced a breaking change for some tasks, such as Hellaswag and Piqa.
The default behaviour changed from something similar to what lm_eval does (e.g. using loglikelihood_acc as the metric) to using exact_match instead.

Until version 0.9.2, this was not the case.
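
To make the difference concrete, here is a rough sketch of the two scoring styles. It is not lighteval's or lm_eval's actual code: the function names, the example choices, and the log-probabilities are all made up for illustration, and I am assuming exact_match boils down to comparing the generated text against the gold string.

```python
def loglikelihood_acc(choice_logprobs, gold_index):
    """Return 1 if the gold continuation has the highest log-likelihood, else 0.

    Hypothetical helper: the model never generates text here, it only
    scores each candidate continuation.
    """
    best = max(range(len(choice_logprobs)), key=lambda i: choice_logprobs[i])
    return int(best == gold_index)


def exact_match(generated, gold_text):
    """Return 1 only if the generated text equals the gold text (whitespace-trimmed).

    Hypothetical helper: an assumption about what a generative
    exact-match metric checks.
    """
    return int(generated.strip() == gold_text.strip())


# Made-up example in the style of a Hellaswag item.
choices = ["sits down.", "flies away.", "eats the ball.", "runs after the ball."]
gold_index = 3

# Made-up per-choice log-likelihoods: the gold choice scores highest.
choice_logprobs = [-4.1, -9.3, -7.8, -2.6]

# Made-up free-form generation: plausible, but not word-for-word the gold text.
generated = "chases the ball across the yard."

ll_score = loglikelihood_acc(choice_logprobs, gold_index)
em_score = exact_match(generated, choices[gold_index])
print(ll_score, em_score)
```

With these made-up numbers the same model counts as correct under the loglikelihood-style metric and as wrong under exact match, which is why scores from the two setups are not comparable.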

Furthermore, in version 0.13.0 the prompt functions and utilities needed to reproduce the lm_eval-like behaviour have been removed.

To Reproduce

Run lighteval with any model on the hellaswag or piqa task.

Expected behavior

It would be nice to be able to run these tasks with a setup similar to the one used in lm_eval, as was possible in previous versions of lighteval.

Version info

lighteval==0.13.0
