The generate() function currently reads only the logits at the last position of the sequence, then shifts the entire input window forward by one position to generate the next token, again reading only the last position. https://github.com/karpathy/ng-video-lecture/blob/master/gpt.py#L189
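For context, the loop in question looks roughly like this (paraphrased from gpt.py; I'm assuming a model whose forward pass returns `(logits, loss)` with logits of shape `(B, T, vocab_size)`):

```python
import torch

def generate(model, idx, max_new_tokens, block_size):
    # idx: (B, T) tensor of token indices making up the current context
    for _ in range(max_new_tokens):
        # crop the context to at most the last block_size tokens
        idx_cond = idx[:, -block_size:]
        logits, _ = model(idx_cond)
        # keep only the logits at the final time step -- this is the
        # "only takes the last position" behavior being asked about
        logits = logits[:, -1, :]
        probs = torch.softmax(logits, dim=-1)
        # sample one next token and append it to the running sequence
        idx_next = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, idx_next), dim=1)
    return idx
```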
I was curious and looked at the entire output contents. For the input, I fed in the output of a previous run of the current generate() function, so the input token sequence would be entirely "based on the behavior of the model itself", so to speak. Then I decoded the full list of T tokens from the output. To my surprise, the output is largely gibberish and quite different from the input (though I could still spot a few matches).
I can't figure out why the current method of reading only the last output position produces seemingly fluent sequences, while the output read from the middle of the block doesn't make sense. In the current scheme, the input grows from torch.zeros((1, 1)) up to block_size, so during that period it should be no different from what an output position in the middle of the block sees: that position has masked out all input after it, so it effectively becomes the end of the window too.
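The experiment I ran can be sketched as the following one-pass readout (hypothetical helper name; same `(logits, loss)` model interface assumed as above). Note that the logits at position t are the model's prediction for token t+1 given the real tokens 0..t, so every position here is conditioned on the fed-in sequence, not on the model's own earlier picks:

```python
import torch

def decode_all_positions(model, idx):
    # idx: (1, T) token sequence fed as input
    logits, _ = model(idx)            # (1, T, vocab_size)
    # greedily pick a token at *every* position, not just the last one;
    # entry t is the model's one-step guess for what follows idx[:, :t+1]
    return torch.argmax(logits, dim=-1)  # (1, T)
```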