
Conversation

@molbap (Contributor) commented Oct 29, 2025

What does this PR do?

Should fix #41929. The check_model_inputs / can_record_outputs interaction is not always trivial, and models with several entry points, such as VisionModel vs. VisionTransformer, are missing some of the flags; this PR adds them. It also adds a modification in generic to make sure the flag is captured, though I'm not 100% sure that part is needed.
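For readers unfamiliar with the mechanism, here is a minimal, self-contained sketch of the pattern this PR touches. This is a toy reimplementation, not the actual transformers code: the decorator, class, and module names below are invented, and only the idea is the same. A class-level `_can_record_outputs` mapping tells a decorator on `forward` which submodule outputs to capture; any public entry point (e.g. a wrapper model around an inner transformer) needs both the mapping and the decorator, otherwise `output_attentions=True` is silently ignored.

```python
from functools import wraps

import torch
from torch import nn


def check_model_inputs_toy(fn):
    """Toy stand-in for the real decorator: when `output_attentions=True`,
    hook the submodule classes listed in `_can_record_outputs` and collect
    what they return."""

    @wraps(fn)
    def wrapper(self, *args, output_attentions=False, **kwargs):
        flags = getattr(self, "_can_record_outputs", None) or {}
        records = {name: [] for name in flags}
        handles = []
        if output_attentions:
            for name, target_cls in flags.items():
                for module in self.modules():
                    if isinstance(module, target_cls):
                        handles.append(
                            module.register_forward_hook(
                                lambda mod, inp, out, n=name: records[n].append(out)
                            )
                        )
        try:
            hidden = fn(self, *args, **kwargs)
        finally:
            for handle in handles:
                handle.remove()
        return hidden, records

    return wrapper


class ToyAttention(nn.Module):
    def forward(self, hidden_states):
        # A real attention module would also return attention weights.
        return hidden_states


class ToyVisionModel(nn.Module):
    # If a public entry point (think: a wrapper like SiglipVisionModel) is
    # missing this mapping or the decorator on its own forward,
    # `output_attentions=True` is silently dropped, which matches the symptom
    # reported in #41929.
    _can_record_outputs = {"attentions": ToyAttention}

    def __init__(self):
        super().__init__()
        self.attn = ToyAttention()

    @check_model_inputs_toy
    def forward(self, hidden_states):
        return self.attn(hidden_states)


hidden, recorded = ToyVisionModel()(torch.randn(1, 4), output_attentions=True)
print(len(recorded["attentions"]))  # 1
```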

@molbap requested a review from vasqu, October 29, 2025 09:51

@vasqu (Contributor) left a comment

lgtm, I would just revert the registry default change. It shouldn't be needed, and I don't wanna risk breaking executorch (or similar trace ops).


```diff
  # _can_record_outputs is None by default
- capture_flags = _CAN_RECORD_REGISTRY.get(str(self.__class__)) or {}  # there is a weak ref for executorch
+ capture_flags = _CAN_RECORD_REGISTRY.get(str(self.__class__)) or getattr(self, "_can_record_outputs", {})
```
@vasqu (Contributor):

I think we can revert this; the registry should already be applied to the class, or else we get the default {}. See

```python
_CAN_RECORD_REGISTRY[str(self.__class__)] = self._can_record_outputs # added for executorch support only
```

And I honestly don't wanna risk breaking executorch.

@molbap (Author):

yeah the latter reason is compelling 😬
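For reference, a minimal sketch of the behavioural difference between the two fallbacks discussed above. The registry and class below are toy stand-ins, not the actual transformers objects:

```python
# Toy illustration of the two fallbacks: if a class was never registered
# (e.g. the path that fills the registry never ran), `registry.get(...)`
# returns None and the `or ...` branch decides what, if anything, can be
# captured. Names are stand-ins, not transformers code.
_CAN_RECORD_REGISTRY = {}  # pretend this class was never registered


class ToyModel:
    _can_record_outputs = {"attentions": "ToyAttention"}


model = ToyModel()
key = str(model.__class__)

# Pre-existing behaviour: fall back to an empty dict, nothing is captured.
flags_old = _CAN_RECORD_REGISTRY.get(key) or {}
# Behaviour proposed in this PR (reverted per the review): fall back to the
# class attribute instead.
flags_new = _CAN_RECORD_REGISTRY.get(key) or getattr(model, "_can_record_outputs", {})

print(flags_old)  # {}
print(flags_new)  # {'attentions': 'ToyAttention'}
```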

```python
        return self.vision_model.embeddings.patch_embedding

    @check_model_inputs(tie_last_hidden_states=False)
    @auto_docstring
```
@vasqu (Contributor):

Nit: I guess this was already inherited?

@molbap (Author):

Uh, interesting. Yes, it should have been, entirely unrelated to this lol
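To spell out the nit (plain Python behaviour, nothing transformers-specific): a decorator applied to a parent class's forward is only "inherited" as long as the subclass does not redefine forward; an override is a new function object and loses the decoration. The decorator below is a made-up stand-in:

```python
from functools import wraps


def record(fn):
    # Made-up stand-in for a decorator such as check_model_inputs.
    @wraps(fn)
    def wrapper(self, *args, **kwargs):
        print(f"captured around {fn.__qualname__}")
        return fn(self, *args, **kwargs)

    return wrapper


class Parent:
    @record
    def forward(self):
        return "parent"


class InheritsForward(Parent):
    pass  # reuses the decorated parent forward -> still captured


class OverridesForward(Parent):
    def forward(self):  # new, undecorated function object -> capture is lost
        return "child"


InheritsForward().forward()   # prints "captured around Parent.forward"
OverridesForward().forward()  # prints nothing
```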

@molbap (Author) commented Oct 29, 2025

run-slow: siglip, siglip2

@github-actions

This comment contains run-slow, running the specified jobs:

models: ['models/siglip', 'models/siglip2']
quantizations: [] ...


@molbap (Author) commented Oct 29, 2025

ended up down the rabbit hole of wrong PreTrainedModel inheritances hehe, @yonigozlan if you want to take a look

@molbap (Author) commented Oct 30, 2025

run-slow: siglip, siglip2

@github-actions

This comment contains run-slow, running the specified jobs:

models: ['models/siglip', 'models/siglip2']
quantizations: [] ...


@molbap requested a review from ArthurZucker, October 30, 2025 14:11
@molbap (Author) commented Oct 31, 2025

run-slow: siglip, siglip2

@github-actions

This comment contains run-slow, running the specified jobs:

models: ['models/siglip', 'models/siglip2']
quantizations: [] ...

@github-actions

[For maintainers] Suggested jobs to run (before merge)

run-slow: siglip, siglip2


@vasqu (Contributor) left a comment

Awkward situation with the auto model. Can you also check out clip? We should face very similar issues there as well. And do we need to adjust tests, maybe?

Comment on lines +169 to +171:

```python
    @unittest.skip(reason="This test is broken on A10 multi runners for now")
    def test_multi_gpu_data_parallel_forward(self):
        pass
```
@vasqu (Contributor):

We shouldn't skip these; it will be hard to revert because everyone will forget, imo.

```diff
             nn.init.normal_(module.fc1.bias, std=1e-6)
             nn.init.normal_(module.fc2.bias, std=1e-6)
-        elif isinstance(module, SiglipMultiheadAttentionPoolingHead):
+        elif "MultiheadAttentionPoolingHead" in module.__class__.__name__:
```
@vasqu (Contributor):

Seems unrelated, no?
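For context, a toy illustration of how the two checks differ. The two classes below are hypothetical stand-ins defined on the spot, not the generated Siglip/Siglip2 modules: an isinstance check only matches the exact class (and its subclasses), while the name check also catches same-named classes that are generated per model, assuming that is the intent of the change.

```python
class SiglipMultiheadAttentionPoolingHead:  # hypothetical stand-in
    pass


class Siglip2MultiheadAttentionPoolingHead:  # hypothetical stand-in; not a subclass of the Siglip one
    pass


for cls in (SiglipMultiheadAttentionPoolingHead, Siglip2MultiheadAttentionPoolingHead):
    module = cls()
    print(
        cls.__name__,
        isinstance(module, SiglipMultiheadAttentionPoolingHead),       # matches the exact hierarchy only
        "MultiheadAttentionPoolingHead" in module.__class__.__name__,  # matches both names
    )
# SiglipMultiheadAttentionPoolingHead True True
# Siglip2MultiheadAttentionPoolingHead False True
```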



Development

Successfully merging this pull request may close these issues.

ViT model's output_attention is not work
