Remove script datasets in tests #38940
This reverts commit 31d30b7.
e7b9c4d to 3dfebf2
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
cc @ydshieh !
Should I trigger CI runs for these touched test files?
I prefer to see failures (if any) today instead of tomorrow 😅
Sure! Btw I also found some tests failing because of
(I just saw your comment about jiwer, thanks!) There is
but on main it fails with
but I believe it has been passing over the past few days. @lhoestq Could you take a look next week? The 3 failing tests seem unrelated to this PR, and I checked that they pass on the CI runners. I will merge so we can see the effect of this PR and whether there is other stuff to fix next week. Thank you!
ok! And I just released
and I also fixed test_LayoutLMv3_integration_test! Feel free to merge
ok!
run-slow: beit, dpt, granite_speech, layoutlmv2, layoutlmv3, layoutxlm, mobilevit, nougat, segformer, udop, upernet
This comment contains run-slow, running the specified jobs: models: ['models/beit', 'models/dpt', 'models/granite_speech', 'models/layoutlmv2', 'models/layoutlmv3', 'models/layoutxlm', 'models/mobilevit', 'models/nougat', 'models/segformer', 'models/udop', 'models/upernet']
There are still some failing tests that I believe are relevant. Examples:
FAILED tests/models/layoutlmv2/test_processor_layoutlmv2.py::LayoutLMv2ProcessorTest::test_overflowing_tokens - FileNotFoundError: [Errno 2] No such file or directory: '/Users/quentinlhoest/.cache/huggingface/datasets/downloads/extracted/35aecbb2e1ba08d57652cba29ac16f2f4b257260d21662922d6eb772ff4b9be1/dataset/training_data/images/0000971160.png'
FAILED tests/models/layoutlmv2/test_processor_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_1 - ValueError: Unsupported number of image dimensions: 2
FAILED tests/models/beit/test_modeling_beit.py::BeitModelIntegrationTest::test_inference_semantic_segmentation - KeyError: 'file'
FAILED tests/models/beit/test_modeling_beit.py::BeitModelIntegrationTest::test_post_processing_semantic_segmentation - KeyError: 'file'
Let's not merge yet and work on this next week.
@lhoestq Here is the list of issues I collected on the daily CI that I believe are relevant. Could you take a look please 🙏? For some tests, maybe the dataset order or the format has changed?
single
tests/models/beit/test_modeling_beit.py::BeitModelIntegrationTest::test_inference_semantic_segmentation
(line 508) KeyError: 'file'
--------------------------------------------------------------------------------
single
tests/models/beit/test_modeling_beit.py::BeitModelIntegrationTest::test_post_processing_semantic_segmentation
(line 551) KeyError: 'file'
--------------------------------------------------------------------------------
single
tests/models/granite_speech/test_modeling_granite_speech.py::GraniteSpeechForConditionalGenerationIntegrationTest::test_small_model_integration_test_batch
(line 675) AssertionError: Lists differ: ["sys[512 chars]r is mister quilter's manner less interesting than his matter"] != ["sys[512 chars]r is mister quilp's manner less interesting than his matter"]
--------------------------------------------------------------------------------
single
tests/models/layoutlmv2/test_processor_layoutlmv2.py::LayoutLMv2ProcessorTest::test_overflowing_tokens
(line 3505) FileNotFoundError: [Errno 2] No such file or directory: '/Users/quentinlhoest/.cache/huggingface/datasets/downloads/extracted/35aecbb2e1ba08d57652cba29ac16f2f4b257260d21662922d6eb772ff4b9be1/dataset/training_data/images/0000971160.png'
--------------------------------------------------------------------------------
single
tests/models/layoutlmv2/test_processor_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_2
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutlmv2/test_processor_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_3
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutlmv3/test_processor_layoutlmv3.py::LayoutLMv3ProcessorIntegrationTests::test_processor_case_1
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutlmv3/test_processor_layoutlmv3.py::LayoutLMv3ProcessorIntegrationTests::test_processor_case_2
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutlmv3/test_processor_layoutlmv3.py::LayoutLMv3ProcessorIntegrationTests::test_processor_case_3
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutlmv3/test_processor_layoutlmv3.py::LayoutLMv3ProcessorIntegrationTests::test_processor_case_4
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutlmv3/test_processor_layoutlmv3.py::LayoutLMv3ProcessorIntegrationTests::test_processor_case_5
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutxlm/test_processor_layoutxlm.py::LayoutXLMProcessorTest::test_overflowing_tokens
(line 3505) FileNotFoundError: [Errno 2] No such file or directory: '/Users/quentinlhoest/.cache/huggingface/datasets/downloads/extracted/35aecbb2e1ba08d57652cba29ac16f2f4b257260d21662922d6eb772ff4b9be1/dataset/training_data/images/0000971160.png'
--------------------------------------------------------------------------------
single
tests/models/layoutxlm/test_processor_layoutxlm.py::LayoutXLMProcessorIntegrationTests::test_processor_case_1
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutxlm/test_processor_layoutxlm.py::LayoutXLMProcessorIntegrationTests::test_processor_case_2
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutxlm/test_processor_layoutxlm.py::LayoutXLMProcessorIntegrationTests::test_processor_case_3
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutxlm/test_processor_layoutxlm.py::LayoutXLMProcessorIntegrationTests::test_processor_case_4
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/layoutxlm/test_processor_layoutxlm.py::LayoutXLMProcessorIntegrationTests::test_processor_case_5
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/udop/test_modeling_udop.py::UdopModelIntegrationTests::test_conditional_generation
(line 420) huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-6855b86a-62ada26a5deba74073ee7878;ac013a8a-3c75-405f-aec4-c0e3a8c22767)
--------------------------------------------------------------------------------
single
tests/models/udop/test_processor_udop.py::UdopProcessorTest::test_overflowing_tokens
(line 3505) FileNotFoundError: [Errno 2] No such file or directory: '/Users/quentinlhoest/.cache/huggingface/datasets/downloads/extracted/35aecbb2e1ba08d57652cba29ac16f2f4b257260d21662922d6eb772ff4b9be1/dataset/training_data/images/0000971160.png'
--------------------------------------------------------------------------------
single
tests/models/udop/test_processor_udop.py::UdopProcessorIntegrationTests::test_processor_case_1
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/udop/test_processor_udop.py::UdopProcessorIntegrationTests::test_processor_case_2
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/udop/test_processor_udop.py::UdopProcessorIntegrationTests::test_processor_case_3
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/udop/test_processor_udop.py::UdopProcessorIntegrationTests::test_processor_case_4
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/udop/test_processor_udop.py::UdopProcessorIntegrationTests::test_processor_case_5
(line 319) ValueError: Unsupported number of image dimensions: 2
--------------------------------------------------------------------------------
single
tests/models/upernet/test_modeling_upernet.py::UperNetModelIntegrationTest::test_inference_convnext_backbone
(line 420) huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-6855b8d7-1b8bdf0c2a7a598823cdce04;f28ca804-7b6e-41f0-bcb5-78a25849d239)
--------------------------------------------------------------------------------
single
tests/models/upernet/test_modeling_upernet.py::UperNetModelIntegrationTest::test_inference_swin_backbone
(line 420) huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-6855b8d9-6c0cbe2e2b685a0f63b8f2f1;d88ac8fe-e7e5-42ea-ae76-87b382f45534)
--------------------------------------------------------------------------------
single
tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::TrOCRModelIntegrationTest::test_inference_handwritten
(line 1171) AssertionError: Tensor-likes are not close!
--------------------------------------------------------------------------------
single
tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::TrOCRModelIntegrationTest::test_inference_printed
(line 437) ValueError: Unknown split "test". Should be one of ['train'].
--------------------------------------------------------------------------------
single
tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation_multilingual
(line 675) AssertionError: Lists differ: ['夏の時期の時期でした', ' It was the time of day and[37 chars]er.'] != ['木村さんに電話を貸してもらいました', ' Kimura-san called me.']
--------------------------------------------------------------------------------
single
tests/pipelines/test_pipelines_image_segmentation.py::ImageSegmentationPipelineTests::test_maskformer
(line 605) KeyError: 'file'
--------------------------------------------------------------------------------
single
tests/pipelines/test_pipelines_image_segmentation.py::ImageSegmentationPipelineTests::test_oneformer
(line 659) KeyError: 'file'
--------------------------------------------------------------------------------
single
tests/trainer/test_trainer_utils.py::TrainerUtilsTest::test_label_smoothing
(line 529) ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)
--------------------------------------------------------------------------------
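Since the list above is long, a quick way to see which root causes dominate is to tally it by exception type. Below is a minimal sketch, not part of this PR: the `count_error_types` helper and the `REPORT` excerpt (four entries copied from the list, with the 404 request ID trimmed) are illustrative, and the regex assumes the `(line N) ErrorType: message` shape shown above.

```python
import re
from collections import Counter

# Illustrative excerpt in the same shape as the daily CI report:
# a test ID line followed by "(line N) ErrorType: message".
REPORT = """\
tests/models/beit/test_modeling_beit.py::BeitModelIntegrationTest::test_inference_semantic_segmentation
(line 508) KeyError: 'file'
tests/models/layoutlmv3/test_processor_layoutlmv3.py::LayoutLMv3ProcessorIntegrationTests::test_processor_case_1
(line 319) ValueError: Unsupported number of image dimensions: 2
tests/models/udop/test_processor_udop.py::UdopProcessorIntegrationTests::test_processor_case_1
(line 319) ValueError: Unsupported number of image dimensions: 2
tests/models/upernet/test_modeling_upernet.py::UperNetModelIntegrationTest::test_inference_swin_backbone
(line 420) huggingface_hub.errors.EntryNotFoundError: 404 Client Error.
"""

# Matches "(line 319) ValueError:" and captures the (possibly dotted) error name.
ERROR_LINE = re.compile(r"^\(line \d+\) ([\w.]+):")


def count_error_types(report: str) -> Counter:
    """Count failures per exception class name (hypothetical helper)."""
    counts = Counter()
    for line in report.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            # Drop the module prefix, e.g. "huggingface_hub.errors.".
            counts[match.group(1).rsplit(".", 1)[-1]] += 1
    return counts


summary = count_error_types(REPORT)
for error_type, n in summary.most_common():
    print(f"{n:>3}  {error_type}")
```

Run on the full list, a tally like this makes it easier to see that most entries share a few error types, which is consistent with the guess that the dataset format or order changed.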
Alright I fixed them all except
EXPECTED_TRANSCRIPTS = ["木村さんに電話を貸してもらいました", " Kimura-san called me."]
EXPECTED_TRANSCRIPTS = [
"夏の時期の時期でした",
" It was the time of day and all of the pens left during the summer.",
I updated this with samples from the same dataset (but for some reason couldn't get the same sample here)
Anyway FYI the ground truth is "It was the time of day when all of Spain slept during the summer.", from here: https://huggingface.co/datasets/hf-internal-testing/fixtures_common_voice/viewer/ja/test?row=0&views%5B%5D=test
So for inputs, we get different samples than before, so we need to update the expected outputs, right?
If so, that works for me
yes, the EXPECTED_TRANSCRIPTS updated here are the new outputs :)
the input update is a few lines above, in load_dataset
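For context on why the expected outputs had to change: a test that takes samples by position gets different inputs as soon as the dataset rows come back in a different order. The sketch below illustrates that with a hypothetical in-memory fixture. The `id` and `text` fields and the `select_by_id` helper are assumptions for illustration only, and this is not what the PR does (it updates the expected outputs instead).

```python
def select_by_id(rows, sample_id):
    """Return the row whose "id" matches, independent of row order (hypothetical helper)."""
    for row in rows:
        if row["id"] == sample_id:
            return row
    raise KeyError(f"sample {sample_id!r} not found in fixture")


# Hypothetical stand-in for a fixture dataset before and after a re-export
# that changed the row order.
rows_before = [
    {"id": "clip_a", "text": "alpha"},
    {"id": "clip_b", "text": "beta"},
]
rows_after = list(reversed(rows_before))

# Positional access picks a different sample after the reorder...
positional_changed = rows_before[0] != rows_after[0]
# ...while lookup by a stable key returns the same sample.
keyed_stable = select_by_id(rows_before, "clip_a") == select_by_id(rows_after, "clip_a")

print(positional_changed, keyed_stable)
```

Whether a stable key is available depends on the dataset; when it is not, regenerating the expected outputs as done here is the straightforward option.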
(just fixed the 2 remaining ones in the CI)
Hi, I am running on T4 CI runners and get the following 9 failures. Let me know if you have any doubts about them.
this one is passing now (so likely unrelated to your changes indeed).
For
it uses
and I think you don't touch that one either. So OK, this one is unrelated.
I fixed all the ones you mentioned, including
Thanks a lot for the improvement, and for your patience with the tests. On T4, everything is passing (after my last commit). I will update 2 (or 4) expected outputs for A10 tomorrow, then merge.
...and remove the tests that were skipped in #38931