Initial numba module #3225

Draft: wants to merge 4 commits into base: main

benjeffery
Member

Part of #3135

@benjeffery
Member Author

I've done quite a bit of numba investigation and found a way to use dataclasses in numba code. This seems to come at very little performance cost compared to tuples and is a lot nicer. Using a generator also seems to work fine!

codecov bot commented Jun 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.62%. Comparing base (da094c4) to head (54e4964).
Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3225   +/-   ##
=======================================
  Coverage   89.61%   89.62%           
=======================================
  Files          28       28           
  Lines       31895    31901    +6     
  Branches     5872     5873    +1     
=======================================
+ Hits        28583    28591    +8     
+ Misses       1884     1881    -3     
- Partials     1428     1429    +1     
Flag              Coverage Δ
c-tests           86.59% <ø> (ø)
lwt-tests         80.38% <ø> (ø)
python-c-tests    88.22% <ø> (+0.04%) ⬆️
python-tests      98.85% <ø> (-0.02%) ⬇️

see 2 files with indirect coverage changes

@benjeffery
Member Author

Having some CI weirdness that I'm not yet able to recreate.

@benjeffery
Member Author

CI Fixed.

Here's some benchmarking with the "coalescent_nodes" method from #2778 on a TS with 12M edges:

Using ts.edge_diffs: 23.3s
Calculating edge diffs and coalescent nodes in a single numba.njit function: 0.085s
Using the classes here, calculating coalescent nodes in separate client numba.njit function: 0.093s

@jeromekelleher
Member

Shall we move the first commit into its own PR? It's cluttering up this one and making it hard to see the real changes.

@jeromekelleher
Member

I had imagined something lower level that was basically a copy of the TreePosition class from here: https://github.com/jeromekelleher/sc2ts/blob/7758245c3dc537aeec3b7cd6282241b65f8843dd/sc2ts/jit.py#L107

So, we don't try to provide Pythonic APIs, but just provide direct access to the edges out and edges in, which can be numba compiled like the example in the sc2ts code.

@benjeffery
Member Author

benjeffery commented Jun 19, 2025

> just provide direct access to the edges out and edges in

That's how this code works. Your sc2ts code here:

    while tree_pos.next():
        for j in range(tree_pos.out_range[0], tree_pos.out_range[1]):
            e = tree_pos.edge_removal_order[j]
            c = edges_child[e]
            p = edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:
                num_samples[u] -= num_samples[c]
                u = parent[u]

becomes

    for tree_pos in numba_ts.edge_diffs():
        for j in range(*tree_pos.edges_out_index_range):
            e = numba_ts.indexes_edge_removal_order[j]
            c = edges_child[e]
            p = edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:
                num_samples[u] -= num_samples[c]
                u = parent[u]

It is still compiled, and 30% faster (for the coalescent nodes example)!
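To make the iteration pattern above concrete, here is a minimal pure-Python sketch of a generator that yields one immutable record per tree, plus a client loop that maintains `parent` and `num_samples`. It is an illustration only, not the code in this PR: `TreeDiff`, `edge_diffs` as a free function, `sample_counts_per_tree`, and the toy edge data are all hypothetical, and the edge orderings are sorted by coordinate alone, whereas a real tree sequence index may apply further tie-breaking. The insertion loop is an assumed mirror image of the removal loop quoted above. No numba is used here, so it says nothing about performance.

```python
from typing import Iterator, NamedTuple, Tuple


class TreeDiff(NamedTuple):
    """Immutable per-tree record; the ranges index into the order arrays."""
    interval: Tuple[float, float]
    edges_out_index_range: Tuple[int, int]
    edges_in_index_range: Tuple[int, int]


def edge_diffs(left, right, insertion_order, removal_order,
               sequence_length) -> Iterator[TreeDiff]:
    """Yield, for each tree, the index ranges of edges leaving and entering."""
    num_edges = len(left)
    j = k = 0
    pos = 0.0
    while j < num_edges or pos < sequence_length:
        out_start = k
        while k < num_edges and right[removal_order[k]] == pos:
            k += 1
        in_start = j
        while j < num_edges and left[insertion_order[j]] == pos:
            j += 1
        # The tree ends at the next coordinate where any edge starts or stops.
        tree_right = sequence_length
        if j < num_edges:
            tree_right = min(tree_right, left[insertion_order[j]])
        if k < num_edges:
            tree_right = min(tree_right, right[removal_order[k]])
        yield TreeDiff((pos, tree_right), (out_start, k), (in_start, j))
        pos = tree_right


def sample_counts_per_tree(left, right, edges_parent, edges_child,
                           num_nodes, samples, sequence_length):
    """Return [(interval, num_samples copy)] by applying each diff in turn."""
    num_edges = len(left)
    # Simplified orderings (assumption): sort by coordinate only.
    insertion_order = sorted(range(num_edges), key=lambda e: left[e])
    removal_order = sorted(range(num_edges), key=lambda e: right[e])
    parent = [-1] * num_nodes
    num_samples = [0] * num_nodes
    for s in samples:
        num_samples[s] = 1
    results = []
    for diff in edge_diffs(left, right, insertion_order, removal_order,
                           sequence_length):
        for j in range(*diff.edges_out_index_range):
            e = removal_order[j]
            c, p = edges_child[e], edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:  # subtract c's samples from every former ancestor
                num_samples[u] -= num_samples[c]
                u = parent[u]
        for j in range(*diff.edges_in_index_range):
            e = insertion_order[j]
            c, p = edges_child[e], edges_parent[e]
            parent[c] = p
            u = p
            while u != -1:  # add c's samples to every new ancestor
                num_samples[u] += num_samples[c]
                u = parent[u]
        results.append((diff.interval, list(num_samples)))
    return results


# Hypothetical toy data: samples 0, 1, 2; internal nodes 3 and 4 (the root).
LEFT = [0.0, 0.0, 5.0, 0.0, 5.0, 0.0]
RIGHT = [10.0, 5.0, 10.0, 5.0, 10.0, 10.0]
PARENT = [3, 3, 3, 4, 4, 4]
CHILD = [1, 0, 2, 2, 0, 3]
RESULTS = sample_counts_per_tree(LEFT, RIGHT, PARENT, CHILD, 5, [0, 1, 2], 10.0)
```

On this toy input there are two trees, on [0, 5) and [5, 10). Samples 0 and 2 swap parents at position 5, so node 3 keeps two samples beneath it and the root keeps three in both trees.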

@jeromekelleher
Member

Ahh, I didn't spot that, sorry. How is it faster then?

I do think we should just stick with the TreePosition interface though, because we want to support seeking backwards as well, and ultimately randomly. There's no point in adding a layer of indirection on top of that.

@benjeffery
Member Author

> How is it faster then?

Mutating numpy arrays to maintain the state involves the following:

  1. Creating a temporary list (build_list).
  2. Performing bounds checks for the slice.
  3. Copying the data from the list into the array's memory.

By contrast, yielding lightweight immutable objects is much more amenable to numba optimisation. We might be able to get the same gains by using native objects for the state rather than numpy arrays, if you are set against iteration.
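The two state-handling styles being compared can be sketched side by side in plain Python. This is only an interface illustration under stated assumptions: `MutableTreePosition`, `TreeRanges`, `iter_ranges`, and `advance` are hypothetical names, Python lists stand in for the numpy state arrays, and the slice assignment mimics the "build a temporary list, then copy it in" pattern described above. Without numba it demonstrates nothing about speed, only that both styles expose the same index ranges.

```python
from typing import Iterator, List, NamedTuple, Tuple


def advance(left, right, ins, rem, j, k, pos, seq_len):
    """Move the insertion (j) and removal (k) cursors past breakpoint pos."""
    m = len(left)
    while k < m and right[rem[k]] == pos:
        k += 1
    while j < m and left[ins[j]] == pos:
        j += 1
    next_pos = seq_len
    if j < m:
        next_pos = min(next_pos, left[ins[j]])
    if k < m:
        next_pos = min(next_pos, right[rem[k]])
    return j, k, next_pos


class MutableTreePosition:
    """Hypothetical cursor whose array-like state is overwritten by next()."""

    def __init__(self, left, right, seq_len):
        self.left, self.right, self.seq_len = left, right, seq_len
        m = len(left)
        # Simplified orderings (assumption): sort by coordinate only.
        self.ins = sorted(range(m), key=lambda e: left[e])
        self.rem = sorted(range(m), key=lambda e: right[e])
        self.in_range = [0, 0]
        self.out_range = [0, 0]
        self.pos = 0.0

    def next(self) -> bool:
        j0, k0 = self.in_range[1], self.out_range[1]
        if j0 >= len(self.left) and self.pos >= self.seq_len:
            return False
        j, k, self.pos = advance(self.left, self.right, self.ins, self.rem,
                                 j0, k0, self.pos, self.seq_len)
        # Slice assignment: a temporary list is built, then copied into place.
        self.in_range[:] = [j0, j]
        self.out_range[:] = [k0, k]
        return True


class TreeRanges(NamedTuple):
    """Immutable alternative: a fresh lightweight record per tree."""
    in_range: Tuple[int, int]
    out_range: Tuple[int, int]


def iter_ranges(left, right, seq_len) -> Iterator[TreeRanges]:
    m = len(left)
    ins = sorted(range(m), key=lambda e: left[e])
    rem = sorted(range(m), key=lambda e: right[e])
    j = k = 0
    pos = 0.0
    while j < m or pos < seq_len:
        j0, k0 = j, k
        j, k, pos = advance(left, right, ins, rem, j0, k0, pos, seq_len)
        yield TreeRanges((j0, j), (k0, k))


def collect_mutable(left, right, seq_len) -> List[TreeRanges]:
    """Drive the stateful cursor and snapshot its ranges after each step."""
    tree_pos = MutableTreePosition(left, right, seq_len)
    out = []
    while tree_pos.next():
        out.append(TreeRanges(tuple(tree_pos.in_range),
                              tuple(tree_pos.out_range)))
    return out
```

A stateful cursor of this kind leaves room for the backward and random seeking mentioned earlier in the thread, which a forward-only generator does not offer; the sketch implements neither.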

@jeromekelleher
Member

Let's talk it through in person - I don't have time to form an educated opinion I'm afraid.
