Hi, thanks @mrdbourke for the course. I had no idea what deep learning even was before it. Now I feel like I could code Skynet. As I understand it, the CNN used to turn the images into patch embeddings in the paper replicating chapter returns multiple feature maps of the same image. If that's the case, why do we add the position embeddings when all of the feature maps are of the same image? Why is the position of these feature maps in the tensor important?
Replies: 1 comment
Hi @hasorez,
Glad you're enjoying the course!
You're right that the CNN layer turns the image into patch embeddings. One detail: once its output is flattened, each item in the sequence is a patch of the image (one embedding vector per patch), and the feature maps become the values inside each vector.
The position embeddings are added so that the model knows where each patch sits in the sequence.
As in, patch 1 will be somewhat more related to patch 2 than to patch 16 (in most cases).
Without them, the model has no way of telling what order the patches came in (it would still likely learn something, because neural networks are pretty robust).
From a high level, this means the model would see an image as a collection of random patches of colour, not really knowing that a dog's head is most often connected to the rest of its body (the patch with the dog's head is close to the patch with the body).
The authors of the paper found that adding position embeddings improved performance.
Source: https://arxiv.org/abs/2010.11929
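To make this a bit more concrete, here is a minimal PyTorch sketch (not the course's exact code). The sizes are assumed ViT-Base style values (224x224 image, 16x16 patches, 768-dim embeddings, 12 heads), the image is random noise, the class token is left out, and names like `patcher` and `attend` are just illustrative. It shows two things: the sequence the transformer sees is made of patches rather than feature maps, and self-attention on its own cannot tell what order those patches came in.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Assumed ViT-Base style settings (illustrative, not taken from this thread).
img_size, patch_size, embed_dim, num_heads = 224, 16, 768, 12
num_patches = (img_size // patch_size) ** 2  # 14 * 14 = 196

# kernel_size == stride == patch_size, so each output location sees exactly one patch.
patcher = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)
# One learnable vector per patch position (class token left out for brevity).
position_embedding = nn.Parameter(torch.randn(1, num_patches, embed_dim))
attention = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True).eval()


def attend(x):
    """Plain self-attention over a (batch, num_patches, embed_dim) sequence."""
    out, _ = attention(x, x, x, need_weights=False)
    return out


with torch.no_grad():
    image = torch.randn(1, 3, img_size, img_size)  # random stand-in for a real image
    feature_maps = patcher(image)                  # (1, 768, 14, 14)
    # Flatten the 14x14 grid into a sequence: one 768-dim vector per patch.
    patch_embeddings = feature_maps.flatten(start_dim=2).permute(0, 2, 1)  # (1, 196, 768)

    shuffle = torch.randperm(num_patches)          # a random re-ordering of the patches
    shuffled = patch_embeddings[:, shuffle]

    # Without position embeddings: shuffling the input only shuffles the output.
    order_ignored = torch.allclose(
        attend(shuffled), attend(patch_embeddings)[:, shuffle], atol=1e-4
    )

    # With position embeddings: the same shuffle now changes the result.
    order_ignored_with_pos = torch.allclose(
        attend(shuffled + position_embedding),
        attend(patch_embeddings + position_embedding)[:, shuffle],
        atol=1e-4,
    )

print(feature_maps.shape, patch_embeddings.shape)
print(order_ignored, order_ignored_with_pos)
```

The shapes are the key part: the 768 feature maps end up as the 768 values inside each patch's vector, while the sequence dimension holds the 196 patches. The first check should come out `True` (a scrambled image looks the same to attention, just re-ordered) and the second `False`, since adding a vector per position is what lets the model tell patch 1 from patch 16.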
