RecSplit MPHF mapping #2
Comments
You're right. We have working tests, so my guess is that serializing such a small map triggers some bug in the serialization process: if you build the map in code, it works, and if you try with, say, 200 strings, it works too. The problem is with serializing a very small number of keys. I suspect an off-by-one. Thanks for the bug report. The workaround for the time being is just to use more keys :).
Sorry, my guess above was bullshit. It's much simpler. The RecSplit constructor that takes a file pointer (and which the dump tool uses) was implemented erroneously using the C getline(), which leaves the delimiter (e.g., …) in the line it returns. I'll fix this ASAP. In all our experiments we use dump128 and load128, so nobody ever noticed this.
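To illustrate the getline() behavior described in this comment, here is a minimal sketch, not the actual RecSplit constructor. The helper name read_keys and its strip_delim flag are hypothetical, introduced only for this example. POSIX getline() keeps the trailing newline in the buffer it fills, so keys read this way differ from the keys a caller later looks up unless the delimiter is removed.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>
#include <sys/types.h>

// Hypothetical helper: reads every line from fp with POSIX getline().
// When strip_delim is true, the trailing '\n' that getline() keeps in
// the buffer is dropped before the key is stored.
std::vector<std::string> read_keys(FILE *fp, bool strip_delim) {
    std::vector<std::string> keys;
    char *line = nullptr;  // getline() allocates and grows this buffer
    size_t cap = 0;
    ssize_t len;
    while ((len = getline(&line, &cap, fp)) != -1) {
        if (strip_delim && len > 0 && line[len - 1] == '\n') --len;
        keys.emplace_back(line, static_cast<size_t>(len));
    }
    free(line);
    return keys;
}
```

With strip_delim set to false, a file containing the line foo yields the four-byte key "foo\n", which is a different key from "foo" as far as any hash function is concerned.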
Fixed in 0dc7222.
Thank you
I am trying to get the mapping from keys to unique indices out of a RecSplit MPHF. I created a file with 4 strings and passed it to recsplit_dump_8, creating an MPHF. I modified recsplit_load.cpp (shown below) to display the mapping. However, the mapping is not a bijection. I also tried with a million keys and could not get an MPHF.
I've only been working with the tool for a few hours today, but I can't see how I'm using the interface incorrectly. Any help would be appreciated.
Here is the output of my run:
Here is my modification to recsplit_load.cpp to print the mapping.
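The bijection property the report refers to can be checked mechanically. The following is a minimal sketch that assumes the keys are distinct; the function name is_minimal_perfect is hypothetical, and the hash argument stands for any callable from a string to an index, not for the RecSplit API itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `hash` maps the (distinct) keys one-to-one onto
// 0 .. keys.size()-1, i.e. if it behaves as a minimal perfect hash
// function on this key set.
template <typename Hash>
bool is_minimal_perfect(const std::vector<std::string> &keys, Hash hash) {
    std::vector<bool> seen(keys.size(), false);
    for (const auto &key : keys) {
        const std::size_t idx = hash(key);
        // Reject indices outside the range and indices already taken.
        if (idx >= keys.size() || seen[idx]) return false;
        seen[idx] = true;
    }
    return true;
}
```

To test a loaded function, the callable would wrap whatever lookup the loaded structure exposes. The check must use exactly the same byte strings that were used to build the function, including or excluding any trailing delimiter.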