08 PyTorch Paper Replicating - question about the visualization of flattened feature map #283

dzy212 · 2023-01-29T10:02:30Z

dzy212
Jan 29, 2023

Hi,

In section 4.4, "Flattening the patch embedding with torch.nn.Flatten()", the flattened feature map that is visualized was created by indexing on the embedding dimension:
single_flattened_feature_map = image_out_of_conv_flattened_reshaped[ : , : , 0]

This results in the shape (1, 196).

I thought that a single feature map should have a shape of (1, 768), since each patch is flattened to P x P x C values and there are 196 patches.

Is this correct? Is what is being visualized a single pixel across all the patches?
Many thanks

mrdbourke · 2023-02-01T06:32:56Z

mrdbourke
Feb 1, 2023
Maintainer

Hi @dzy212,

Each patch is embedded into a vector of size 768.

The image is split into a 14x14 grid of patches, which is then flattened to a sequence of 196 patches -> each patch is embedded to 768.
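To make the shapes concrete, here is a minimal sketch of the steps in question. It assumes a 224x224 RGB input, a patch size of 16 and an embedding dimension of 768 (which gives the 196 patches mentioned above), and it uses a random tensor in place of a real image. The names `image_out_of_conv` and `single_patch_embedding` are illustrative and may not match the notebook exactly.

```python
import torch
from torch import nn

# Assumed setup: 224x224 RGB image, patch size 16, embedding dimension 768
image = torch.randn(1, 3, 224, 224)  # random stand-in for a real image
patch_size, embedding_dim = 16, 768

# kernel_size == stride == patch_size, so each patch produces one embedding vector
conv2d = nn.Conv2d(in_channels=3,
                   out_channels=embedding_dim,
                   kernel_size=patch_size,
                   stride=patch_size)
flatten = nn.Flatten(start_dim=2, end_dim=3)

image_out_of_conv = conv2d(image)                         # (1, 768, 14, 14)
image_out_of_conv_flattened = flatten(image_out_of_conv)  # (1, 768, 196)
image_out_of_conv_flattened_reshaped = image_out_of_conv_flattened.permute(0, 2, 1)  # (1, 196, 768)

# Indexing the last (embedding) dimension: one learned feature across all 196 patches
single_flattened_feature_map = image_out_of_conv_flattened_reshaped[:, :, 0]  # (1, 196)

# Indexing the patch dimension instead: the full 768-value embedding of one patch
single_patch_embedding = image_out_of_conv_flattened_reshaped[:, 0, :]  # (1, 768)

print(single_flattened_feature_map.shape, single_patch_embedding.shape)
```

Under these assumptions, `[:, :, 0]` selects one of the 768 convolutional feature maps (14x14, flattened to 196 values), not a single pixel across all the patches. The (1, 768) shape expected in the question comes from indexing the patch dimension instead.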

My interpretation is that the feature map they talk about is the set of features from the image patches (rather than the embedding itself).

I'm getting this from the green box below, which is taken from the paper:
