search wiktextract etymology_text for reliable patterns #12

@jmviz

Wiktextract's etymology_text can be searched alongside the templates to pick up reliable etymology patterns that would be missed by processing the templates alone. E.g.:

  • chains of "from {{}}," -> impute der(s)
  • initial "From {{}} and {{}}" or "From {{}} + {{}}" -> impute an affix (e.g. druwits)
  • final "Ultimately from {{}}." -> impute a der (maybe a root?)
  • final "Equivalent to {{}} + {{}}." -> impute a surface analysis, which generally should not be preferred but could be used when there are no valid preceding templates.

In each case, the template(s) may be any der-like template or m.
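
A minimal sketch of how the patterns above might be matched, assuming the etymology text has already been reduced to a "skeleton" in which each template expansion is replaced by the placeholder {{}}. The pattern names, the regexes, and the `classify` function are hypothetical illustrations, not part of wiktextract or of this project; in particular, reading the chain pattern as "from {{}}, from {{}}, ..." is an assumption.

```python
import re

# Regex for the literal placeholder "{{}}" that stands in for one expansion.
SLOT = r"\{\{\}\}"

# Hypothetical pattern table, in the same order as the list in the issue.
PATTERNS = [
    # "from {{}}, from {{}}, ..." anywhere in the text (assumed reading of "chains").
    ("der_chain", re.compile(rf"from {SLOT}(?:, from {SLOT})+", re.IGNORECASE)),
    # Initial "From {{}} and {{}}" or "From {{}} + {{}}".
    ("affix", re.compile(rf"^From {SLOT} (?:and|\+) {SLOT}")),
    # Final "Ultimately from {{}}."
    ("ultimate_der", re.compile(rf"Ultimately from {SLOT}\.?\s*$")),
    # Final "Equivalent to {{}} + {{}}."
    ("surface_analysis", re.compile(rf"Equivalent to {SLOT} \+ {SLOT}\.?\s*$")),
]


def classify(skeleton: str) -> list[str]:
    """Return the names of all patterns found in a placeholder skeleton."""
    return [name for name, pattern in PATTERNS if pattern.search(skeleton)]


if __name__ == "__main__":
    print(classify("From {{}} + {{}}."))
    print(classify("From {{}}, from {{}}. Ultimately from {{}}."))
    print(classify("From {{}}. Equivalent to {{}} + {{}}."))
```

A caller would still need to check that each matched slot corresponds to a der-like template or m, and to treat a surface analysis as a fallback rather than a preferred result.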

In etymology_text the templates are all fully expanded, so this would involve scanning progressively through the text to locate the first, second, ..., nth template expansion in turn.
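
The progressive scan could look roughly like the sketch below. It assumes the expanded text of each template is available in document order (here passed as a plain list of strings) and that each expansion appears verbatim in etymology_text; both are assumptions, and the function name `skeletonize` and the sample text are made up for illustration.

```python
def skeletonize(etymology_text: str, expansions: list[str], slot: str = "{{}}"):
    """Replace each expansion, in order, with a placeholder.

    Returns the placeholder skeleton and, per expansion, its (start, end)
    span in the original text, or None if it was not found after the
    previous match.
    """
    parts = []
    spans = []
    cursor = 0  # search for the next expansion only from here onward
    for expansion in expansions:
        start = etymology_text.find(expansion, cursor) if expansion else -1
        if start == -1:
            spans.append(None)  # not found; leave the text untouched
            continue
        parts.append(etymology_text[cursor:start])  # literal text before it
        parts.append(slot)
        cursor = start + len(expansion)
        spans.append((start, cursor))
    parts.append(etymology_text[cursor:])  # trailing literal text
    return "".join(parts), spans


if __name__ == "__main__":
    # Made-up etymology text and expansions, for illustration only.
    text = "From Middle English foo, from Old English fo. Ultimately from Proto-Germanic *fo."
    expansions = ["Middle English foo", "Old English fo", "Proto-Germanic *fo"]
    skeleton, spans = skeletonize(text, expansions)
    print(skeleton)
    print(spans)
```

Advancing the cursor past each match keeps a later template from matching text that belongs to an earlier one, which matters when one expansion is a substring of another.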

Consider also:

Labels: processor (Related to the processor, which processes the raw wiktextract data)