Has training been implemented (or attempted) on the **ratio 8** tokenizer? #85

@martian422

The rFID of the ratio 8 LlamaGen tokenizer looks good. I'm wondering whether the author or anyone else has tried using it to train a generation model with 1024 tokens.
Either autoregressive or discrete diffusion would be interesting, because this setup seems to have a chance of surpassing continuous diffusion models.
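
For context on the 1024 figure, here is a minimal sketch of the token-count arithmetic. It assumes a square 256x256 input, which is not stated above, and the helper name `num_tokens` is hypothetical rather than anything from the repository.

```python
def num_tokens(image_size: int, downsample_ratio: int) -> int:
    """Count the discrete tokens for a square image at a given spatial downsample ratio."""
    if image_size % downsample_ratio != 0:
        raise ValueError("image_size must be divisible by downsample_ratio")
    side = image_size // downsample_ratio  # tokens along one edge of the latent grid
    return side * side


# Assumption: 256x256 input images (not stated in the question).
print(num_tokens(256, 8))   # ratio 8  -> 32 x 32 grid
print(num_tokens(256, 16))  # ratio 16 -> 16 x 16 grid
```

Under that assumption, ratio 8 yields a sequence four times as long as ratio 16, which is the main cost to weigh against the better rFID.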
