Better tokenizer #161

@ptbrowne

Searching for "fromOldClient" in the docs.cozy.io search does not return any result.
Searching for "CozyClient.fromOldClient" does return the result.

IMHO this is caused by the tokenizer, which does not treat the dot as a separator between two words, so "fromOldClient" is never indexed as a word on its own.

I think we tweaked the tokenizer because we needed doctypes to be returned as is. That is, the dot should not split doctypes like "io.cozy.bills", but it should split "CozyClient.fromOldClient".

Since doctypes can be recognized by having at least 2 dots and no leading capital letter, we could maybe improve the tokenizer to support both cases.

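See the current setting, which treats every character other than a lowercase Latin or Cyrillic letter, a digit, a hyphen, or a dot as a separator:

tokenizer: "[^a-z\u0430-\u044F\u04510-9\\-\\.]"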
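
To make the proposal concrete, here is a minimal sketch of a tokenizer that keeps doctypes whole and splits other dotted names. It is not the search plugin's actual code: the `tokenize` function and the `SEPARATOR` and `DOCTYPE` patterns are hypothetical names, and the doctype rule is only the heuristic suggested above (a lowercase start and at least 2 dots). It also assumes the check runs before lowercasing, since the heuristic depends on the leading capital.

```python
import re

# Characters that always separate tokens: anything that is not a Latin or
# Cyrillic letter, a digit, a hyphen, or a dot. This mirrors the current
# separator class, extended to uppercase because it runs before lowercasing.
SEPARATOR = re.compile(r"[^a-zA-Z\u0410-\u044F\u0401\u04510-9\-.]+")

# Hypothetical doctype heuristic from the issue: starts with a lowercase
# letter and contains at least two dots, e.g. "io.cozy.bills".
DOCTYPE = re.compile(r"[a-z][a-z0-9\-]*(?:\.[a-z0-9\-]+){2,}")


def tokenize(text: str) -> list[str]:
    """Split text into lowercase tokens, keeping doctype-like names whole."""
    tokens = []
    for chunk in SEPARATOR.split(text):
        chunk = chunk.strip(".")  # drop leading/trailing dots, e.g. end of sentence
        if not chunk:
            continue
        if DOCTYPE.fullmatch(chunk):
            # Looks like a doctype: index it as a single token.
            tokens.append(chunk)
        else:
            # Anything else is split on dots, so each part becomes searchable.
            tokens.extend(part.lower() for part in chunk.split(".") if part)
    return tokens


print(tokenize("Use CozyClient.fromOldClient with io.cozy.bills."))
```

With this sketch, "CozyClient.fromOldClient" yields the two tokens "cozyclient" and "fromoldclient", while "io.cozy.bills" stays a single token. One caveat of the heuristic: any lowercase name with two or more dots, such as a hostname, would also be kept whole.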
