-
Notifications
You must be signed in to change notification settings - Fork 1
feat(sq): algorithm optimization #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Multiple implementations added (faiss, numpy, numpy+memmap) Nearest centroid search optimized Refactored StochasticQuantization class Added parallel execution support Removed required history log (used only when turned on) Updated verbosity to support levels Integrated tqdm for progress tracking Model export/saving and loading adde
0984426
to
5c0db3d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution! Here are some preliminary fixed, including dependency management and polyfills for built-in functions. Also make sure the unit tests stay up to date
@@ -47,3 +48,7 @@ classifiers = [ | |||
Homepage = "https://github.com/kaydotdev/stochastic-quantization" | |||
Issues = "https://github.com/kaydotdev/stochastic-quantization/issues" | |||
Repository = "https://github.com/kaydotdev/stochastic-quantization.git" | |||
|
|||
[project.optional-dependencies] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to group optional dependencies, something like this:
[project.optional-dependencies]
faiss-cpu = ["faiss-cpu>=1.10.0,<2"]
faiss-gpu = ["faiss-gpu>=1.10.0,<2"]
progress = ["tqdm>=4.66.0,<5"]
all = ["sqg[faiss-cpu,faiss-gpu,progress]"]
code/pyproject.toml
Outdated
dependencies = [ | ||
"numpy>=1.26.4,<2", | ||
"scikit-learn>=1.5.1,<2", | ||
"tqdm>=4.66.0,<5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inclusion of tqdm
would violate the references in the original paper, it's better to include it as an optional dependency
code/setup.py
Outdated
@@ -45,5 +44,10 @@ | |||
install_requires=[ | |||
"numpy>=1.26.4,<2", | |||
"scikit-learn>=1.5.1,<2", | |||
"tqdm>=4.66.0,<5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we need to move it to the optional dependencies
import contextlib | ||
|
||
import joblib | ||
from tqdm.autonotebook import tqdm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move tqdm
to optional dependencies, we need to check the package with ImportError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
from itertools import islice | ||
|
||
|
||
def batched_iterable(iterable, batch_size): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to replace this implementation with the polyfill, something like this:
from sys import version_info
import itertools
if version_info >= (3, 12) and hasattr(itertools, "batched"):
# Built-in since 3.12 (returns tuples)
batched = itertools.batched # type: ignore[attr-defined]
else:
def batched(iterable, n, *, strict=False):
"""Back-port of itertools.batched for Py < 3.12 (returns tuples)."""
if n < 1:
raise ValueError("n must be >= 1")
it = iter(iterable)
while (chunk := tuple(itertools.islice(it, n))):
if strict and len(chunk) != n:
raise ValueError("last batch smaller than n")
yield chunk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applied
Changes:
TODO: