Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support finding permutations of -_ in crate names #75

Open
Nemo157 opened this issue Jan 27, 2022 · 4 comments
Open

Support finding permutations of -_ in crate names #75

Nemo157 opened this issue Jan 27, 2022 · 4 comments

Comments

@Nemo157
Copy link
Contributor

Nemo157 commented Jan 27, 2022

It would be useful for tools that want to emulate crates.io's behavior around fuzzy matching of - and _ in crate names if there would be an api like Index::crate_ that would also search for the different permutations if there was no exact match.

For example in cargo-edit: https://github.com/killercup/cargo-edit/blob/c3bb5e5cc089556e4bac2f9b370c4ca0dccb1de6/src/fetch.rs#L123-L160

@kornelski
Copy link
Collaborator

The naive algorithm is exponential. It's not a good solution to query names like that.

It's better to list all crates and make an index with _/- normalized, but the crate_ call is not in position to do that.

So overall I don't think solution to this problem belongs to this crate.

@Nemo157
Copy link
Contributor Author

Nemo157 commented Jan 28, 2022

The indexed lookup is much faster yes, but there are a lot of one-off utilities like cargo-edit that need to lookup a crate from user input and would want to support fuzzy matching; and calculating that index is very slow (a quick benchmark of making a HashMap<Vec<u8>, Crate> index shows that to be ~13s).

Benchmarking using the algorithm cargo-edit uses to generate names, then looking them up serially, I get ~36ms worst case to lookup v1-hello-crates-demo-test-crate-do-not-use (the crate with the most -_ in it, worst case being the correct selection of -_ as the last name); compared to ~166us best case.

I think having this algorithm in one centralized crate is much better than every cargo subcommand copy-pasting it again and again.

@Byron
Copy link
Collaborator

Byron commented Jul 29, 2023

I think it would be feasible to add a new method that has the fuzzy lookup behaviour, and contributions for this would be very welcome.

@Byron
Copy link
Collaborator

Byron commented Jul 30, 2023

What about… adding a permutation iterator? Given a name, it will permute through the possible settings of _ and - and produce a new name on each call to next() until exhaustion. With that, it would be trivial to do something like…

crate_index::Names::new("the-crate-I-look-for").find(|name| index.crate_(name))

This keeps the implementation primitive, which helps when thinking about future abstractions as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants