Open
Description
Suggestion
Currently, the search string normalization in cmdk/src/command-score.ts only performs lowercasing and space character replacement:
function formatInput(string) {
  // convert all valid space characters to space so they match each other
  return string.toLowerCase().replace(COUNT_SPACE_REGEXP, ' ')
}

This approach does not handle Unicode diacritics (e.g., the accents in café and naïve). As a result, a search for "cafe" will not match "café".
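The mismatch can be reproduced in isolation. This is only a sketch: `formatInputCurrent` is a hypothetical copy of the function above, the `COUNT_SPACE_REGEXP` value is an assumed stand-in for the constant defined in command-score.ts, and a plain substring check stands in for the library's actual scoring.

```typescript
// Assumed stand-in for the module-level constant in command-score.ts.
const COUNT_SPACE_REGEXP = /[\s-]/g;

// Hypothetical copy of the current normalization, for illustration only.
function formatInputCurrent(input: string): string {
  return input.toLowerCase().replace(COUNT_SPACE_REGEXP, ' ');
}

// "Caf\u00e9" is "Café" written with the precomposed character U+00E9.
const haystack = formatInputCurrent('Caf\u00e9');
const needle = formatInputCurrent('cafe');

// Lowercasing leaves U+00E9 in place, so the final character never equals "e".
const foundWithCurrent = haystack.includes(needle);
console.log(foundWithCurrent); // false
```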
Proposal:
Extend search string normalization to remove Unicode diacritics by calling String.prototype.normalize('NFD') and then stripping the resulting combining marks with a regex:
function formatInput(string) {
  return string
    .toLowerCase()
    .normalize('NFD') // Decompose unicode characters
    .replace(/[\u0300-\u036f]/g, '') // Remove diacritical marks
    .replace(COUNT_SPACE_REGEXP, ' ')
}

This change lets unaccented queries match accented text, which makes search more robust for international users.
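A small standalone check of the proposed normalization, as a sketch rather than a patch: `formatInputProposed` is a hypothetical name, and the `COUNT_SPACE_REGEXP` value is again an assumed stand-in for the constant in command-score.ts.

```typescript
// Assumed stand-in for the module-level constant in command-score.ts.
const COUNT_SPACE_REGEXP = /[\s-]/g;

// Hypothetical typed version of the proposed normalization.
function formatInputProposed(input: string): string {
  return input
    .toLowerCase()
    .normalize('NFD') // split "é" into "e" + U+0301 (combining acute accent)
    .replace(/[\u0300-\u036f]/g, '') // drop the Combining Diacritical Marks block
    .replace(COUNT_SPACE_REGEXP, ' ');
}

// Precomposed (U+00E9) and already-decomposed (e + U+0301) inputs converge.
const precomposed = formatInputProposed('Caf\u00e9');
const decomposed = formatInputProposed('Cafe\u0301');
console.log(precomposed === 'cafe', decomposed === 'cafe'); // true true

// Letters without a canonical decomposition, such as "ø" (U+00F8), pass through.
const unchanged = formatInputProposed('S\u00f8ren');
console.log(unchanged === 's\u00f8ren'); // true
```

Because NFD only separates marks that have a canonical decomposition, letters such as "ø" or "ß" are left untouched; handling them would need an explicit mapping, which is outside this proposal.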
Location: