Describe the bug
The lextreme benchmark fails with a KeyError (or other configuration errors) because the evaluation_splits defined in the task configuration (["validation", "test"]) do not match the splits actually available in the dataset for several subsets.
To Reproduce

```python
task = "lextreme:multi_eurlex_level_1|5"
pipeline = Pipeline(
    tasks=task,
    pipeline_parameters=pipeline_params,
    evaluation_tracker=evaluation_tracker,
    model_config=model_config,
)
pipeline.evaluate()
pipeline.save_and_push_results()
pipeline.show_results()
```

Traceback (abridged):

```
    141 self._init_random_seeds()
--> 142 self._init_tasks_and_requests(tasks=tasks)
    144 self.model_config = model_config
    145 self.accelerator, self.parallel_context = self._init_parallelism_manager()
...
     88 available_suggested_splits = [
     89     split for split in (Split.TRAIN, Split.TEST, Split.VALIDATION) if split in self
     90 ]

KeyError: 'validation'
```

Expected behavior
The configuration should only reference splits that are actually available on the Hugging Face Hub for each subset.
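A minimal sketch of what a defensive fix could look like (a hypothetical helper, not lighteval's actual API): filter the configured evaluation_splits down to the splits the dataset actually provides, and raise a descriptive error instead of a bare KeyError when none remain.

```python
def resolve_evaluation_splits(configured_splits, available_splits):
    """Keep only the configured splits that actually exist in the dataset.

    Raises a descriptive ValueError instead of a bare KeyError when none
    of the configured splits are available (e.g. a subset that ships no
    'validation' split).
    """
    resolved = [s for s in configured_splits if s in available_splits]
    if not resolved:
        raise ValueError(
            f"None of the configured splits {configured_splits} exist; "
            f"available splits are {sorted(available_splits)}"
        )
    return resolved


# Example: a subset that only ships 'train' and 'test'
print(resolve_evaluation_splits(["validation", "test"], {"train", "test"}))  # → ['test']
```

The actual available splits per subset can be checked on the Hugging Face Hub (or with `datasets.get_dataset_split_names`) before writing them into the task configuration.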
Version info
- OS: mac
- Lighteval version: main (local development)