sql/inspect: speed up index consistency check with a hash #150927

@spilchen

We’re looking for index inconsistencies by comparing rows from the primary index and a secondary index. The current approach is to run a full outer join between the two index access plans and return rows that exist on only one side. This works, but is expensive on large tables.
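For illustration only, the effect of the current join-based check can be sketched in a few lines of Python. This is not the actual implementation: the function name `rows_on_one_side` and the `(primary_key, indexed_value)` row shape are assumptions made for the example, and in-memory sets stand in for the two index scans.

```python
def rows_on_one_side(primary_rows, secondary_rows):
    """Emulate a full outer join that keeps only the unmatched rows.

    Each row is a hypothetical (primary_key, indexed_value) tuple.
    Using sets ignores duplicate rows, which is acceptable for a sketch.
    """
    primary = set(primary_rows)
    secondary = set(secondary_rows)
    # Rows reachable through the primary index but absent from the secondary.
    missing_from_secondary = sorted(primary - secondary)
    # Rows present in the secondary index with no matching primary row.
    missing_from_primary = sorted(secondary - primary)
    return missing_from_secondary, missing_from_primary


primary_rows = [(1, "a"), (2, "b"), (3, "c")]
secondary_rows = [(1, "a"), (3, "c"), (4, "d")]
only_primary, only_secondary = rows_on_one_side(primary_rows, secondary_rows)
print(only_primary)    # rows missing from the secondary index
print(only_secondary)  # dangling secondary index entries
```

Every row from both sides has to be materialized and matched, which is why the cost grows with table size.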

An alternative is to compute a hash as a fast first-pass check. By computing a hash over the rows accessed via each index (either across the whole table or in smaller chunks), we can detect whether a difference exists without doing a full join. If the hashes match, the indexes are likely consistent; if not, we can fall back to a detailed comparison on the mismatched chunk.
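A minimal sketch of such a first-pass hash follows, under stated assumptions. The two indexes return rows in different orders, so the sketch combines per-row hashes with addition, which does not depend on order. The names `row_hash` and `index_fingerprint`, the use of SHA-256, and the 64-bit truncation are all choices made for this example, not something the issue specifies.

```python
import hashlib

MASK = (1 << 64) - 1  # keep the running sum within 64 bits


def row_hash(row):
    """Hash one row; repr() is a stand-in for a real row encoding."""
    data = repr(row).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def index_fingerprint(rows):
    """Return (row_count, order-independent hash) for a stream of rows."""
    total = 0
    count = 0
    for row in rows:
        # Addition is commutative, so scan order does not affect the result.
        total = (total + row_hash(row)) & MASK
        count += 1
    return count, total


via_primary = [(1, "a"), (2, "b"), (3, "c")]
via_secondary = [(3, "c"), (1, "a"), (2, "b")]  # same rows, different order
print(index_fingerprint(via_primary) == index_fingerprint(via_secondary))
```

Equal fingerprints only make consistency likely, since distinct row sets can collide; unequal fingerprints always indicate a real difference.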

This issue is opened to add this hash pre-check. Specifically:

  • Use a hash over key ranges to detect mismatches.
  • Only perform detailed joins where mismatches are detected.

This should improve the performance of index verification on large tables. We will also want a knob to disable the hash pre-check altogether.
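The overall flow (hash per key range, detailed comparison only on mismatched ranges, and a knob to skip hashing) could look roughly like the sketch below. Everything here is hypothetical: `check_index`, `chunk_hash`, `rows_in_range`, the `use_hash_precheck` parameter, and the `(primary_key, indexed_value)` row shape are invented for illustration, and the rows in each chunk are sorted before hashing so that both sides are hashed in the same order.

```python
import hashlib


def chunk_hash(rows):
    """Hash all rows in a chunk, in sorted order so both sides agree."""
    h = hashlib.sha256()
    for row in sorted(rows):
        h.update(repr(row).encode("utf-8"))
    return h.hexdigest()


def rows_in_range(rows, lo, hi):
    """Select rows whose primary key falls in the half-open range [lo, hi)."""
    return [r for r in rows if lo <= r[0] < hi]


def check_index(primary_rows, secondary_rows, key_ranges, use_hash_precheck=True):
    """Return a list of ((lo, hi), only_in_primary, only_in_secondary)."""
    mismatches = []
    for lo, hi in key_ranges:
        p = rows_in_range(primary_rows, lo, hi)
        s = rows_in_range(secondary_rows, lo, hi)
        # Fast path: matching hashes mean the chunk is very likely consistent.
        if use_hash_precheck and chunk_hash(p) == chunk_hash(s):
            continue
        # Slow path: detailed comparison, run only for this chunk.
        only_p = sorted(set(p) - set(s))
        only_s = sorted(set(s) - set(p))
        if only_p or only_s:
            mismatches.append(((lo, hi), only_p, only_s))
    return mismatches


primary = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
secondary = [r for r in primary if r[0] != 5]  # secondary index lost key 5
print(check_index(primary, secondary, [(1, 4), (4, 7)]))
```

Passing `use_hash_precheck=False` runs the detailed comparison on every range, which is the behavior the knob is meant to preserve.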

Jira issue: CRDB-53007

Epic CRDB-55075

Metadata

Labels

  • A-sql-scrub
  • C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
  • T-sql-foundations: SQL Foundations Team (formerly SQL Schema + SQL Sessions)
