Commit
Merge pull request #132 from liyz15/eot_after_truncate
Change the last token to eot_token when truncating
Zasder3 authored Jul 26, 2022
2 parents 515d448 + 39d3a94 commit 46dc933
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions src/open_clip/tokenizer.py
@@ -175,6 +175,7 @@ def tokenize(texts: Union[str, List[str]], context_length: int = 77) -> torch.Lo
     for i, tokens in enumerate(all_tokens):
         if len(tokens) > context_length:
             tokens = tokens[:context_length]  # Truncate
+            tokens[-1] = eot_token
         result[i, :len(tokens)] = torch.tensor(tokens)

     return result
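
The effect of the added line can be illustrated with a minimal, torch-free sketch. It is not the code in src/open_clip/tokenizer.py: the helper name pad_and_truncate and the small token IDs (1 for start-of-text, 2 for end-of-text, 0 for padding) are hypothetical stand-ins chosen for readability. The sketch only shows that a sequence longer than context_length still ends in the end-of-text token after truncation.

```python
from typing import List

# Hypothetical small IDs for illustration; a real tokenizer uses its own values.
SOT, EOT, PAD = 1, 2, 0


def pad_and_truncate(all_tokens: List[List[int]],
                     context_length: int,
                     eot_token: int = EOT) -> List[List[int]]:
    """Pad each sequence to context_length, truncating long ones.

    Mirrors the loop in the diff, using nested lists instead of a tensor.
    """
    result = [[PAD] * context_length for _ in all_tokens]
    for i, tokens in enumerate(all_tokens):
        if len(tokens) > context_length:
            tokens = tokens[:context_length]  # Truncate (slicing makes a copy)
            tokens[-1] = eot_token            # Keep end-of-text as the last token
        result[i][:len(tokens)] = tokens
    return result


rows = pad_and_truncate(
    [[SOT, 5, 6, EOT],              # short: padded, already ends in EOT
     [SOT, 5, 6, 7, 8, 9, EOT]],    # long: truncated to 5 tokens
    context_length=5,
)
print(rows)
```

Without the `tokens[-1] = eot_token` line, the second row would end in 8 and contain no end-of-text token at all. That matters for any downstream code that locates the end-of-text position in each row, which is presumably the motivation for the change, although the commit itself does not state it.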
