The croissant for the lmms-lab/LMMs-Eval-Lite dataset is incorrect.
More specifically, the problem is in the "gqa/semantic/dependencies" field:
{
  "@type": "cr:Field",
  "@id": "gqa/semantic/dependencies",
  "name": "gqa/semantic/dependencies",
  "description": "Column 'semantic' from the Hugging Face parquet file.",
  "dataType": "sc:Integer",
  "source": {
    "fileSet": {
      "@id": "parquet-files-for-config-gqa"
    },
    "extract": {
      "column": "semantic"
    }
    // <==== Here the transform is missing!
  }
}
I don't know why "transform": { "jsonPath": "dependencies" } is missing here, given that it has been correctly added to all other subfields.
For comparison, the croissant on mlcroissant repo for the same dataset, with the correct gqa/semantic/dependencies subfield: https://github.com/mlcommons/croissant/blob/0b5cdfdcea72025fa4f7eaf8384e92eda291a118/datasets/1.0/huggingface-lmms-eval-lite/metadata.json
That version can be loaded both with and without Beam without any issue.
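To check a croissant file for other fields with the same problem, here is a rough sketch in plain Python. It uses only the standard library, not the mlcroissant API. The function name `find_fields_missing_transform` is made up for this example, and the heuristic is an assumption on my side: a field whose "@id" ends in something other than the extracted column name is treated as a subfield that needs a "transform". The sibling field "gqa/semantic/other_key" is a hypothetical placeholder, not a real field of the dataset.

```python
import json

def find_fields_missing_transform(node, found=None):
    """Recursively collect the @id of every cr:Field that extracts a column,
    looks like a subfield of that column, and has no "transform" in its source.

    Assumed heuristic: the last "/" segment of the @id differs from the column name.
    """
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("@type") == "cr:Field":
            source = node.get("source")
            extract = source.get("extract") if isinstance(source, dict) else None
            column = extract.get("column") if isinstance(extract, dict) else None
            field_id = node.get("@id", "")
            leaf = field_id.rsplit("/", 1)[-1]
            if column is not None and leaf != column and "transform" not in source:
                found.append(field_id)
        # Keep walking: fields can be nested inside record sets or other fields.
        for value in node.values():
            find_fields_missing_transform(value, found)
    elif isinstance(node, list):
        for item in node:
            find_fields_missing_transform(item, found)
    return found

# Minimal stand-in for the metadata: the broken field from this issue plus
# a hypothetical sibling that does carry a transform.
example = json.loads("""
{
  "field": [
    {
      "@type": "cr:Field",
      "@id": "gqa/semantic/dependencies",
      "source": {
        "fileSet": {"@id": "parquet-files-for-config-gqa"},
        "extract": {"column": "semantic"}
      }
    },
    {
      "@type": "cr:Field",
      "@id": "gqa/semantic/other_key",
      "source": {
        "fileSet": {"@id": "parquet-files-for-config-gqa"},
        "extract": {"column": "semantic"},
        "transform": {"jsonPath": "other_key"}
      }
    }
  ]
}
""")

missing = find_fields_missing_transform(example)
print(missing)
```

On a real file, `json.load` the downloaded croissant and pass the resulting dict to the function. Fields that map directly to a column (last "@id" segment equal to the column name) are not reported.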