
Fix non-existent evaluation splits in lextreme #1150

@pjavanrood

Description


Describe the bug

The lextreme benchmark fails with a `KeyError` (or other configuration errors) because the `evaluation_splits` defined in the configuration (`["validation", "test"]`) do not match the splits actually available in the dataset for several subsets.

To Reproduce

# pipeline_params, evaluation_tracker and model_config must be constructed
# beforehand (e.g. lighteval's pipeline parameters, an evaluation tracker and
# a model config); their definitions are elided in this report.
task = "lextreme:multi_eurlex_level_1|5"

pipeline = Pipeline(
    tasks=task,
    pipeline_parameters=pipeline_params,
    evaluation_tracker=evaluation_tracker,
    model_config=model_config,
)

pipeline.evaluate()
pipeline.save_and_push_results()
pipeline.show_results()

Abridged traceback:
    141 self._init_random_seeds()
--> 142 self._init_tasks_and_requests(tasks=tasks)
    144 self.model_config = model_config
    145 self.accelerator, self.parallel_context = self._init_parallelism_manager()
...
     88         available_suggested_splits = [
     89             split for split in (Split.TRAIN, Split.TEST, Split.VALIDATION) if split in self
     90         ]

KeyError: 'validation'
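The frames above show a split-membership probe inside the `datasets` library followed by a lookup of a split the subset does not ship. A minimal sketch of that failure mode (illustrative names only, not the actual library code):

```python
# Minimal sketch of the membership probe shown in the traceback; the real
# logic lives inside the `datasets` library, names here are illustrative.
SUGGESTED_SPLITS = ("train", "test", "validation")

def available_suggested_splits(dataset_splits):
    """Mirror the `split in self` probe from the traceback: keep only the
    suggested splits the dataset actually contains."""
    return [s for s in SUGGESTED_SPLITS if s in dataset_splits]

# A subset that ships only train/test, like the failing lextreme subsets:
splits = {"train": ["..."], "test": ["..."]}
print(available_suggested_splits(splits))  # ['train', 'test']

# Looking up a configured-but-missing split is exactly the reported failure:
try:
    splits["validation"]
except KeyError as e:
    print("KeyError:", e)
```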

Expected behavior

The configuration should only reference splits that are actually available on the Hugging Face Hub for each subset.

Version info

  • OS: macOS
  • Lighteval version: main (local development)
