Conversation

@simonlevine

Hello,

It seems more intuitive to use a 1D convolution here over the embedding, with the number of input channels equal to the word-embedding dimension, rather than the edge case of a 2D convolution as currently implemented. I would make this change to match other networks that apply similar convolutions over nn.Embedding outputs. I believe it has no effect on performance; it is proposed purely for clarity. Thank you.
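To illustrate the equivalence being claimed, here is a minimal sketch. It assumes the current 2D convolution treats the embedding dimension as channels and adds a dummy spatial axis of size 1 with a (K, 1) kernel; all shapes and sizes below are hypothetical, not taken from the repository:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

B, L, D, K = 2, 16, 64, 3  # batch, sequence length, embedding dim, kernel size (illustrative)

emb = torch.randn(B, L, D)  # output of nn.Embedding: (batch, seq, dim)

# Edge-case 2D conv: embedding dim as channels, dummy width-1 axis, (K, 1) kernel.
conv2d = nn.Conv2d(D, D, kernel_size=(K, 1), padding=(K // 2, 0))

# Proposed 1D conv: in_channels equal to the embedding dim, kernel K over the sequence.
conv1d = nn.Conv1d(D, D, kernel_size=K, padding=K // 2)

# Copy weights so both layers compute the same function.
with torch.no_grad():
    conv1d.weight.copy_(conv2d.weight.squeeze(-1))  # (D, D, K, 1) -> (D, D, K)
    conv1d.bias.copy_(conv2d.bias)

x2d = conv2d(emb.transpose(1, 2).unsqueeze(-1)).squeeze(-1)  # (B, D, L)
x1d = conv1d(emb.transpose(1, 2))                            # (B, D, L)

print(torch.allclose(x2d, x1d, atol=1e-5))
```

With matched weights the outputs agree up to floating-point tolerance, supporting the point that the change is presentational rather than a change in computation.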

@alihassanijr
Collaborator

Hello,

Thank you. Yes, it would not result in any significant difference; at least, that is what we observed in our experiments.
I'll review this PR and create separate models with 1D convs, because I think replacing the existing models would cause a state-dict key mismatch with our old checkpoints.
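One alternative to keeping separate models would be converting old checkpoints when they are loaded. The sketch below is purely illustrative, with hypothetical key names and shapes (the real keys depend on the module paths in the actual models); it squeezes the trailing size-1 kernel axis so Conv2d weights fit a Conv1d layer:

```python
import torch

# Hypothetical old state dict: a Conv2d weight of shape (out, in, K, 1) plus its bias.
old_sd = {
    "tokenizer.conv.weight": torch.randn(64, 64, 3, 1),
    "tokenizer.conv.bias": torch.randn(64),
}

new_sd = {}
for key, value in old_sd.items():
    # Conv2d kernels used as 1D convs carry a dummy trailing axis of size 1;
    # dropping it yields the (out, in, K) shape nn.Conv1d expects.
    if key.endswith("weight") and value.dim() == 4 and value.shape[-1] == 1:
        value = value.squeeze(-1)
    new_sd[key] = value

print(new_sd["tokenizer.conv.weight"].shape)  # torch.Size([64, 64, 3])
```

Whether this is preferable to shipping separate 1D-conv models depends on how the project distributes and versions its checkpoints.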
