This repository was archived by the owner on Oct 10, 2025. It is now read-only.

Feature: native support for synonyms in FTS #6000

@hpvd

FTS is already great: advanced features such as stopwords and stemming are available for several languages.

What is missing is the ability to deal with synonyms natively.
Of course, vector search works in this direction, but it's not the same.
Sometimes, e.g. when pre-filtering the graph, one prefers to rely on exact matches.

Typical use cases:

  1. abbreviations:
    CFRP <-> Carbon Fiber Reinforced Plastic

  2. common foreign-language usage in special domains:
    Lion <-> Panthera leo

  3. name variants:
    Robert <-> Bob

If one would like the initial design to be prepared for further advancements, there are some things to consider:

  • synonyms may consist of several words (see the examples above)
  • rules may work uni- or bi-directionally (driven by the syntax)
  • does one need a transparent/hinted separation of exact matches (original word) and synonym matches?
  • one should take care of the interaction with wildcard pattern matching/advanced_pattern_match (Improve fts wild card pattern matching #5998)
  • one should take care of the interaction with stemming
  • there may be more than one synonym for a word/phrase
  • each synonym may have a weight (to be able to boost the ones that are closer to the original concept or more important for the current domain)
  • one may have a look at Lucene's functionality and syntax
leopard, big cat|0.8, bagheera|0.9, panthera pardus|0.85
lion => panthera leo|0.9, simba leo|0.8, kimba|0.75
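
To make the points above concrete, here is a minimal Python sketch of how the two rule lines could be parsed and applied at query time. It is an illustration only, not an existing FTS API: the function names (`parse_term`, `parse_rules`, `expand`) are hypothetical, and it assumes that a comma-separated line is a bi-directional group, that `=>` marks a uni-directional rule, that `|w` attaches a weight to the term it follows, and that a term without a weight gets 1.0. Stemming and wildcard handling are deliberately left out.

```python
from collections import defaultdict

def parse_term(raw):
    """Split 'big cat|0.8' into ('big cat', 0.8); the weight defaults to 1.0 (assumption)."""
    text, sep, weight = raw.strip().partition("|")
    return text.strip().lower(), (float(weight) if sep else 1.0)

def parse_rules(lines):
    """Build {source phrase: {synonym phrase: weight}} from rule lines."""
    table = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if "=>" in line:
            # Uni-directional: only the left-hand side expands to the right-hand side.
            left, right = line.split("=>", 1)
            sources = [parse_term(t)[0] for t in left.split(",")]
            targets = [parse_term(t) for t in right.split(",")]
            for source in sources:
                for target, weight in targets:
                    table[source][target] = weight
        else:
            # Bi-directional: every term expands to every other term in the group.
            terms = [parse_term(t) for t in line.split(",")]
            for source, _ in terms:
                for target, weight in terms:
                    if target != source:
                        table[source][target] = weight
    return dict(table)

def expand(query, table):
    """Return (phrase, weight, is_original) tuples, so exact matches stay distinguishable."""
    tokens = query.lower().split()
    max_len = max((len(key.split()) for key in table), default=1)
    result = []
    i = 0
    while i < len(tokens):
        # Greedy longest match, so multi-word sources such as "big cat" are found.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in table or n == 1:
                result.append((phrase, 1.0, True))
                result.extend((t, w, False) for t, w in table.get(phrase, {}).items())
                i += n
                break
    return result

RULES = [
    "leopard, big cat|0.8, bagheera|0.9, panthera pardus|0.85",
    "lion => panthera leo|0.9, simba leo|0.8, kimba|0.75",
]
table = parse_rules(RULES)
expanded = expand("big cat photos", table)
```

The `is_original` flag is one possible answer to the question of separating exact matches from synonym matches: a scorer could use it together with the weight to rank original terms above their synonyms.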
