
Add architectures where the transformer is used only for embeddings #163

Draft
lfoppiano wants to merge 26 commits into master from bert-bidlstm-models

Conversation

Collaborator

lfoppiano commented Aug 21, 2023

This PR is still a work in progress.

The goal of this PR is to add architectures in which the Transformer layer is frozen and used only to generate embedding representations.
At the moment the last four hidden layers are concatenated and passed to a bidirectional LSTM with a dense layer on top. Once the implementation is confirmed, we can experiment with different options.
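The wiring described above can be sketched roughly as follows. This is only an illustrative sketch in PyTorch (the PR itself targets this project's Keras models); the class name `FrozenTransformerBiLSTM` and the parameters `hidden_size`, `lstm_units`, and `num_labels` are hypothetical, and the frozen transformer's hidden states are assumed to be computed upstream:

```python
import torch
import torch.nn as nn

class FrozenTransformerBiLSTM(nn.Module):
    """Hypothetical sketch: concatenate the last four hidden layers of a
    frozen transformer and feed them to a BiLSTM with a dense layer on top."""

    def __init__(self, hidden_size=768, lstm_units=128, num_labels=5):
        super().__init__()
        # Input is the concatenation of 4 layers, hence 4 * hidden_size.
        self.lstm = nn.LSTM(hidden_size * 4, lstm_units,
                            batch_first=True, bidirectional=True)
        # Bidirectional output doubles the feature dimension.
        self.dense = nn.Linear(lstm_units * 2, num_labels)

    def forward(self, last_four_layers):
        # last_four_layers: list of 4 tensors, each (batch, seq_len, hidden_size),
        # taken from the frozen transformer's hidden states (no gradient needed).
        x = torch.cat(last_four_layers, dim=-1)   # (batch, seq_len, 4 * hidden_size)
        out, _ = self.lstm(x)                     # (batch, seq_len, 2 * lstm_units)
        return self.dense(out)                    # (batch, seq_len, num_labels)
```

Because the transformer is frozen, its hidden states could also be precomputed once per sentence and cached, which keeps training cost close to that of a plain BiLSTM tagger.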

Currently:

  • when the character channel is enabled, the ChainCRF architecture does not work
  • the results are much lower than expected, and some behaviours need review

@lfoppiano lfoppiano requested review from kermitt2 and pjox August 21, 2023 05:26
@lfoppiano lfoppiano added the enhancement New feature or request label Aug 21, 2023

