Initial numba module #3225
I've done quite a bit of |
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

@@            Coverage Diff            @@
##             main    #3225     +/-   ##
=========================================
  Coverage   89.61%   89.62%
=========================================
  Files          28       28
  Lines       31895    31901     +6
  Branches     5872     5873     +1
=========================================
+ Hits        28583    28591     +8
+ Misses       1884     1881     -3
- Partials     1428     1429     +1

Flags with carried forward coverage won't be shown.
|
Having some CI weirdness that I'm not yet able to recreate. |
CI Fixed. Here's some benchmarking with the "coalescent_nodes" method from #2778 on a TS with 12M edges: Using |
Shall we move the first commit into its own PR? It's cluttering up this one and making it hard to see the real changes. |
I had imagined something lower level that was basically a copy of the TreePosition class from here: https://github.com/jeromekelleher/sc2ts/blob/7758245c3dc537aeec3b7cd6282241b65f8843dd/sc2ts/jit.py#L107 So, we don't try to provide Pythonic APIs, but just provide direct access to the edges out and edges in, which can be numba compiled like the example in the sc2ts code. |
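For readers following along, here is a minimal pure-Python sketch of the kind of low-level cursor described above. It is not the sc2ts or tskit code: the class below is a forward-only simplification, the toy edge data is invented for illustration, and the edge orders are sorted by coordinate only (the real indexes also break ties by node time). In practice the class would presumably be numba compiled, which this sketch leaves out.

```python
class TreePosition:
    """Forward-only sketch of a TreePosition-style cursor (simplified, hypothetical)."""

    def __init__(self, edges_left, edges_right, breakpoints):
        self.edges_left = edges_left
        self.edges_right = edges_right
        self.breakpoints = breakpoints
        num_edges = len(edges_left)
        # Simplification: sort by coordinate only (stable sort keeps edge id order on ties).
        self.edge_insertion_order = sorted(range(num_edges), key=lambda e: edges_left[e])
        self.edge_removal_order = sorted(range(num_edges), key=lambda e: edges_right[e])
        self.index = -1
        self.interval = (0.0, 0.0)
        self.in_range = (0, 0)   # half-open range into edge_insertion_order
        self.out_range = (0, 0)  # half-open range into edge_removal_order

    def next(self):
        """Advance to the next tree; return False once past the last tree."""
        num_trees = len(self.breakpoints) - 1
        if self.index + 1 >= num_trees:
            return False
        self.index += 1
        left = self.breakpoints[self.index]
        num_edges = len(self.edges_left)
        # Edges whose right coordinate equals the new left position leave the tree.
        start = j = self.out_range[1]
        while j < num_edges and self.edges_right[self.edge_removal_order[j]] == left:
            j += 1
        self.out_range = (start, j)
        # Edges whose left coordinate equals the new left position enter the tree.
        start = k = self.in_range[1]
        while k < num_edges and self.edges_left[self.edge_insertion_order[k]] == left:
            k += 1
        self.in_range = (start, k)
        self.interval = (left, self.breakpoints[self.index + 1])
        return True


# Toy data (invented): two trees over [0, 1) and [1, 2), four nodes, four edges.
edges_left = [0.0, 0.0, 1.0, 1.0]
edges_right = [2.0, 1.0, 2.0, 2.0]
edges_parent = [2, 2, 3, 3]
edges_child = [0, 1, 1, 2]
breakpoints = [0.0, 1.0, 2.0]

parent = [-1] * 4
trees = []
tree_pos = TreePosition(edges_left, edges_right, breakpoints)
while tree_pos.next():
    for j in range(tree_pos.out_range[0], tree_pos.out_range[1]):
        e = tree_pos.edge_removal_order[j]
        parent[edges_child[e]] = -1
    for j in range(tree_pos.in_range[0], tree_pos.in_range[1]):
        e = tree_pos.edge_insertion_order[j]
        parent[edges_child[e]] = edges_parent[e]
    trees.append((tree_pos.interval, list(parent)))
```

The caller only ever sees index ranges into the two edge orders, so all tree-building logic stays in the user's own loop.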
That's how this code works:

    while tree_pos.next():
        for j in range(tree_pos.out_range[0], tree_pos.out_range[1]):
            e = tree_pos.edge_removal_order[j]
            c = edges_child[e]
            p = edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:
                num_samples[u] -= num_samples[c]
                u = parent[u]

becomes

    for tree_pos in numba_ts.edge_diffs():
        for j in range(*tree_pos.edges_out_index_range):
            e = numba_ts.indexes_edge_removal_order[j]
            c = edges_child[e]
            p = edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:
                num_samples[u] -= num_samples[c]
                u = parent[u]

It is still compiled, and 30% faster (for the coalescent nodes example)! |
Ahh, I didn't spot that, sorry. How is it faster then? I do think we should just stick with the TreePosition interface though, because we want to support seeking backwards as well, and ultimately random seeking. There's no point in adding a layer of indirection on top of that. |
Mutating numpy arrays to maintain the state involves the following:
Whereas yielding lightweight immutable objects is much more amenable to numba optimisation. We might be able to get the same gains by using native objects for the state rather than numpy arrays if you are set against iteration. |
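To make the "yield lightweight immutable objects" idea concrete, here is a rough pure-Python sketch. It is not the code from this PR: `TreeDiff` and this standalone `edge_diffs` function are hypothetical names, the toy edge data is invented, and the sketch is not numba compiled, so it shows only the shape of the interface and says nothing about performance. The cursor state lives in local variables of the generator rather than in mutable arrays.

```python
from typing import Iterator, NamedTuple, Sequence, Tuple


class TreeDiff(NamedTuple):
    """Immutable description of one tree transition (hypothetical type)."""
    index: int
    interval: Tuple[float, float]
    edges_out_index_range: Tuple[int, int]
    edges_in_index_range: Tuple[int, int]


def edge_diffs(
    edges_left: Sequence[float],
    edges_right: Sequence[float],
    removal_order: Sequence[int],
    insertion_order: Sequence[int],
    breakpoints: Sequence[float],
) -> Iterator[TreeDiff]:
    """Yield one TreeDiff per tree, left to right (forward-only sketch)."""
    num_edges = len(edges_left)
    out_stop = 0
    in_stop = 0
    for index in range(len(breakpoints) - 1):
        left = breakpoints[index]
        out_start = out_stop
        while out_stop < num_edges and edges_right[removal_order[out_stop]] == left:
            out_stop += 1
        in_start = in_stop
        while in_stop < num_edges and edges_left[insertion_order[in_stop]] == left:
            in_stop += 1
        yield TreeDiff(
            index, (left, breakpoints[index + 1]), (out_start, out_stop), (in_start, in_stop)
        )


# Toy data (invented): nodes 0 and 1 are samples; two trees over [0, 1) and [1, 2).
edges_left = [0.0, 0.0, 1.0, 1.0]
edges_right = [2.0, 1.0, 2.0, 2.0]
edges_parent = [2, 2, 3, 3]
edges_child = [0, 1, 1, 2]
breakpoints = [0.0, 1.0, 2.0]
# Simplification: orders sorted by coordinate only, ignoring node times.
insertion_order = sorted(range(4), key=lambda e: edges_left[e])
removal_order = sorted(range(4), key=lambda e: edges_right[e])

parent = [-1] * 4
num_samples = [1, 1, 0, 0]
history = []
for diff in edge_diffs(edges_left, edges_right, removal_order, insertion_order, breakpoints):
    for j in range(*diff.edges_out_index_range):
        e = removal_order[j]
        c, p = edges_child[e], edges_parent[e]
        parent[c] = -1
        u = p
        while u != -1:  # subtract c's samples from all of its former ancestors
            num_samples[u] -= num_samples[c]
            u = parent[u]
    for j in range(*diff.edges_in_index_range):
        e = insertion_order[j]
        c, p = edges_child[e], edges_parent[e]
        parent[c] = p
        u = p
        while u != -1:  # add c's samples to all of its new ancestors
            num_samples[u] += num_samples[c]
            u = parent[u]
    history.append(list(num_samples))
```

Because a generator like this only moves forward, it cannot express backward or random seeking on its own, which is the trade-off raised earlier in the thread.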
Let's talk it through in person - I don't have time to form an educated opinion I'm afraid. |
Part of #3135