MNT Removed _safe_accumulator_op for first-pass algorithm in _assert_all_finite #23446
LGTM:
- on main:
In [1]: import numpy as np
...: from sklearn.utils.validation import _assert_all_finite
...: a = np.random.RandomState(0).randn(int(1e8)).astype(np.float32)
...: %timeit _assert_all_finite(a)
54.9 ms ± 383 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
- on this branch:
In [1]: import numpy as np
...: from sklearn.utils.validation import _assert_all_finite
...: a = np.random.RandomState(0).randn(int(1e8)).astype(np.float32)
...: %timeit _assert_all_finite(a)
28.4 ms ± 39.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Co-authored-by: Olivier Grisel <[email protected]>
Not sure if we should document this in the changelog for 1.2. Maybe we should have something like "Reduce the overhead of finiteness checks for float32 input data by leveraging numpy's SIMD optimized primitives."
I figured this was a small enough change, and far enough removed from what users actually interact with, that it would be fine to omit a changelog entry. If you or other reviewers think mentioning the performance gain would be worthwhile, I will of course add an entry :)
I think it's worth adding a changelog entry for performance improvements.
Added!
LGTM
Reference Issues/PRs
Follow-up to #23347
Related to #23197
Specifically addresses #23197 (comment)
What does this implement/fix? Explain your changes.
Removes _safe_accumulator_op from _assert_all_finite since it is not needed in the average case and can be a significant bottleneck. Even in the rare (and as yet untested) case where the first pass reports a false positive, the second-pass algorithm will check for non-finite values explicitly.
Any other comments?
For profiling info refer to: #23197 (comment)
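For readers skimming the description, here is a minimal sketch of the two-pass idea, not the actual scikit-learn code. The function name assert_all_finite_sketch, its error message, and the exact control flow are assumptions made for illustration. The first pass sums the array in its own dtype (no float64 accumulator), and the second pass runs only when that sum is non-finite.

```python
import numpy as np


def assert_all_finite_sketch(X, allow_nan=False):
    """Hypothetical stand-in for the two-pass check; not scikit-learn's code."""
    X = np.asanyarray(X)
    if X.dtype.kind not in "fc":
        return  # integer and boolean arrays cannot hold NaN or inf

    # First pass: one reduction in the array's own dtype. NaN and inf
    # propagate through the sum, so a finite sum implies all-finite input.
    # Overflow (e.g. large float32 values) can make the sum inf even though
    # every element is finite; that false positive falls through below.
    with np.errstate(over="ignore", invalid="ignore"):
        if np.isfinite(np.sum(X)):
            return

    # Second pass: explicit element-wise check, reached only when needed.
    if allow_nan:
        has_bad = bool(np.isinf(X).any())
    else:
        has_bad = not bool(np.isfinite(X).all())
    if has_bad:
        raise ValueError("Input contains NaN or infinity.")  # assumed message
```

In this sketch, a float32 array whose sum overflows is still accepted, because the element-wise second pass finds nothing non-finite; the overflow only costs the extra pass.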