The query function is not entirely random. It begins by initializing the nearest neighbor matrix with the leaves of the search tree. If fewer than k neighbors are found, it then adds random nodes to ensure sufficient neighbors. Afterward, the nearest neighbors are iteratively refined.
See: https://github.com/brj0/nndescent/blob/main/src/nnd.h#L852-L863
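The fill-up step described above, where random nodes are added when fewer than k neighbors come out of the tree leaves, might look roughly like the sketch below. This is not the code from `nnd.h`; the function name `fill_with_random_neighbors` and its parameters are hypothetical, and it only illustrates why a random number generator is involved in `query` at all.

```python
import random


def fill_with_random_neighbors(candidates, k, n_points, rng):
    """Pad a row of candidate neighbor indices to exactly k distinct entries.

    Hypothetical helper, not the library's implementation.
    """
    if k > n_points:
        raise ValueError("k must not exceed the number of points")
    # Keep the candidates found in the tree leaves, deduplicated, in order.
    row = list(dict.fromkeys(candidates))[:k]
    seen = set(row)
    # Draw random indices until the row holds k distinct neighbors.
    while len(row) < k:
        idx = rng.randrange(n_points)
        if idx not in seen:
            seen.add(idx)
            row.append(idx)
    return row
```

With the same seed and the same candidates, the padded row is identical on every call, which is the property the rest of this thread is about.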
Ah gotcha! So the fix would be to use a separate random number generator per thread, seeded at the beginning of each call to query(). Having the ability to make the output fully deterministic is crucial for scientific applications.
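To make the idea concrete, here is a minimal Python sketch of that scheme (the library itself is C++, so this is only an illustration, not a patch). The names `random_init`, `_process_chunk` and `chunk_size` are made up for the example. One assumption goes slightly beyond what I wrote above: the generators are keyed on a fixed-size chunk of rows rather than on the thread id, so the output does not change with the number of threads either.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def _process_chunk(seed, chunk_id, start, stop, n_points, k):
    # Each chunk gets its own generator, seeded from (seed, chunk_id),
    # so no random state is shared between workers.
    rng = random.Random(seed * 1_000_003 + chunk_id)
    return [[rng.randrange(n_points) for _ in range(k)]
            for _ in range(start, stop)]


def random_init(n_queries, n_points, k, seed, n_threads=4, chunk_size=16):
    """Draw k random indices per query row, reproducibly and in parallel."""
    bounds = [
        (cid, start, min(start + chunk_size, n_queries))
        for cid, start in enumerate(range(0, n_queries, chunk_size))
    ]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [
            pool.submit(_process_chunk, seed, cid, lo, hi, n_points, k)
            for cid, lo, hi in bounds
        ]
        rows = []
        # Collect in submission order, not completion order.
        for fut in futures:
            rows.extend(fut.result())
    return rows
```

Because the generators are created fresh inside each call, two calls with the same seed return the same rows regardless of how the work is scheduled.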
Here's an example run. The non-determinism occurs only during parallel execution.
Notice how only a small number of rows in `a` and `b` are affected, and only columns 8 to the end or 9 to the end.
Fabulous library, by the way!
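For anyone who wants to check this on their own data, a small helper along these lines can list the affected rows and the first column where two results disagree. It is a sketch that assumes `a` and `b` are the neighbor index arrays from two identical query calls, given as nested lists (for NumPy arrays, call `.tolist()` first); the name `first_diff_columns` is made up.

```python
def first_diff_columns(a, b):
    """Map each differing row index to the first column where a and b disagree."""
    diffs = {}
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (x, y) in enumerate(zip(row_a, row_b)):
            if x != y:
                diffs[i] = j
                break
    return diffs
```

An empty dictionary means the two runs matched exactly.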