Feature: index string identities #79

@dennypenta

Currently, strings are indexed only through limited-size dictionaries.

Integer values are indexed in data blocks as min/max values.
There is a broad use case for indexing string identities in a similar manner, so that lookups do not rely on the bloom filter alone.
Treating strings as identities lets us index values of unlimited cardinality, as long as the size of each value is bounded,
e.g. the max size of a UUID is 36 bytes.
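
To make the idea concrete, here is a minimal sketch of what a per-block min/max entry for string identities could look like. Everything in it is an assumption for illustration: the names `BlockIdentityRange` and `build_range`, the 36-byte cap, and the choice of byte-wise comparison are not taken from the existing code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_IDENTITY_BYTES = 36  # assumed cap, e.g. the textual form of a UUID


@dataclass
class BlockIdentityRange:
    """Hypothetical per-block min/max entry for one string column."""
    min_value: bytes
    max_value: bytes

    def may_contain(self, value: bytes) -> bool:
        # Byte-wise lexicographic comparison, analogous to integer min/max.
        return self.min_value <= value <= self.max_value


def build_range(values: Iterable[str]) -> Optional[BlockIdentityRange]:
    """Return a min/max entry, or None when the block is empty
    or any value is larger than the assumed size cap."""
    encoded = [v.encode("utf-8") for v in values]
    if not encoded or any(len(v) > MAX_IDENTITY_BYTES for v in encoded):
        return None
    return BlockIdentityRange(min(encoded), max(encoded))
```

A block whose range does not cover the queried value can be skipped without consulting the bloom filter. Returning `None` for oversized values is one possible way to keep the entry size bounded; such a column would fall back to whatever indexing it has today.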

If we write min/max identity values to the block in addition to the bloom filter, it can reduce bloom misses.
This first requires measuring the bloom miss metric.
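
One possible way to collect that metric is sketched below. It assumes that a "bloom miss" means the bloom filter answered "maybe present" but the block did not actually contain the value; that definition, the `BloomMissStats` name, and its fields are hypothetical and not part of the current code.

```python
from dataclasses import dataclass


@dataclass
class BloomMissStats:
    """Hypothetical counters for block lookups that went through the bloom filter."""
    bloom_passes: int = 0  # bloom filter answered "maybe present"
    bloom_misses: int = 0  # the block was read but the value was not there
    range_saves: int = 0   # misses that a min/max check would have skipped

    def record(self, bloom_maybe: bool, found: bool, in_range: bool) -> None:
        if not bloom_maybe:
            return  # block was already skipped by the bloom filter
        self.bloom_passes += 1
        if not found:
            self.bloom_misses += 1
            if not in_range:
                self.range_saves += 1

    @property
    def miss_rate(self) -> float:
        # Share of bloom "maybe" answers that turned out to be wasted reads.
        return self.bloom_misses / self.bloom_passes if self.bloom_passes else 0.0
```

Comparing `range_saves` with `bloom_misses` on real queries would show how many wasted block reads min/max values could have avoided, which depends on how identities are distributed across blocks.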

Labels: enhancement
