Draft PR: Add modularity and modularity_adata functions to scanpy.metrics #3613
Hi! Apart from the issue with `igraph` being an optional dependency, this looks good!
I think we have `get_igraph_from_adjacency`, which might be useful, but maybe not.
Could you please add tests? We have many examples of how to do that; best would probably be:
- for the direct variant, manually create very small graphs to run this on, so you can be sure the results are correct
- for the anndata version, use `neighbors` to create the connectivity matrix.
Please add `@needs.igraph` so the test only runs when igraph is installed.
If you’re unsure about anything, please search the code for examples or ask me!
If you end up implementing a non-igraph flavor for this, please test using parametrization, e.g.: `@pytest.mark.parametrize("directed", [True, False], ids=["directed", "undirected"])`
… on how to integrate the two
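For the "very small graphs" suggestion, a hand-checkable reference can help pin down expected values. The following is only a sketch: `reference_modularity` is a hypothetical helper, not part of scanpy or this PR, and it implements the standard Newman modularity definition on a dense adjacency matrix in pure Python. The expected values (0.5, 0 and 5/14) were worked out by hand for these toy graphs.

```python
import math


def reference_modularity(adjacency, labels):
    """Newman modularity from a dense adjacency matrix (list of lists).

    Hypothetical hand-check helper, not scanpy API. For an undirected
    graph, pass a symmetric matrix so each edge is counted in both
    directions; for a directed graph, pass the matrix as-is.
    """
    n = len(adjacency)
    total = sum(sum(row) for row in adjacency)  # 2m if symmetric, m if directed
    out_strength = [sum(row) for row in adjacency]
    in_strength = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                expected = out_strength[i] * in_strength[j] / total**2
                q += adjacency[i][j] / total - expected
    return q


# Two disjoint undirected edges: 0-1 and 2-3.
TWO_EDGES = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# Two triangles (0-1-2 and 3-4-5) joined by the bridge 2-3.
TWO_TRIANGLES = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def test_two_edges_perfect_split():
    assert math.isclose(reference_modularity(TWO_EDGES, [0, 0, 1, 1]), 0.5)


def test_single_community_is_zero():
    q = reference_modularity(TWO_EDGES, [0, 0, 0, 0])
    assert math.isclose(q, 0.0, abs_tol=1e-12)


def test_two_triangles_with_bridge():
    q = reference_modularity(TWO_TRIANGLES, [0, 0, 0, 1, 1, 1])
    assert math.isclose(q, 5 / 14)
```

In the actual test suite, the same graphs and expected values would be passed to the function under test, with the cases supplied through `pytest.mark.parametrize` and the igraph-backed ones marked with `@needs.igraph`.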
Hi, this is shaping up! The tests are looking particularly great!
Please address these previous comments:
- Please follow the doc style as described here: Draft PR: Add modularity and modularity_adata functions to scanpy.metrics #3613 (comment)
  Also, I mentioned in there: “But here a `is_directed: bool` parameter would be better anyway, there will never be more than two options.” What do you think?
- Don’t densify (in `modularity_adata` or anywhere): Draft PR: Add modularity and modularity_adata functions to scanpy.metrics #3613 (comment)
- Why the `np.asarray`? Draft PR: Add modularity and modularity_adata functions to scanpy.metrics #3613 (comment)
Oh, I just saw that we have |
Looks good! A few points:
- please add a release note (see the other PR, please adapt the PR number in the command I mention there when running it for this PR)
- please move the correct change to `unique_labels` (`to_numpy`) from the other PR into this one:
  scanpy/src/scanpy/metrics/_metrics.py
  Line 63 in 040b8b7
  unique_labels = pd.unique(np.concatenate((orig.to_numpy(), new.to_numpy())))
- please change `mode` to `is_directed: bool` (no default value) for `modularity`
Once your other PR (the `neighbors` one) is merged, let’s finish this one up:
- we should make `neighbors` store an `is_directed` param in `.uns[key_added or "neighbors"]["params"]`
- then we change the `modularity_adata` parameter `obsp: str` to `key: str = "neighbors"`
- then we use `adata.uns[key]["connectivities_key"]` and `adata.uns[key]["params"]["is_directed"]` in `modularity_adata` to call `modularity`.
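The lookup described in the last three points could look roughly like the sketch below. It makes several assumptions: `resolve_neighbors_graph` is a hypothetical helper name, the `is_directed` entry in `params` is the addition proposed above and is not stored by `neighbors` today, and a `SimpleNamespace` stands in for a real `AnnData` object so the snippet runs without anndata installed.

```python
from types import SimpleNamespace


def resolve_neighbors_graph(adata, key="neighbors"):
    """Return (connectivities, is_directed) for a neighbors entry.

    Hypothetical helper illustrating the proposed lookup; `is_directed`
    is assumed to have been stored in the params by `neighbors`.
    """
    neighbors = adata.uns[key]
    connectivities = adata.obsp[neighbors["connectivities_key"]]
    is_directed = neighbors["params"]["is_directed"]
    return connectivities, is_directed


# Minimal stand-in for an AnnData object after running `neighbors`.
# The nested list is a placeholder for the sparse connectivity matrix.
fake_adata = SimpleNamespace(
    uns={
        "neighbors": {
            "connectivities_key": "connectivities",
            "params": {"is_directed": False},  # proposed, not yet stored
        }
    },
    obsp={"connectivities": [[0, 1], [1, 0]]},
)

graph, is_directed = resolve_neighbors_graph(fake_adata)
```

`modularity_adata` would then pass `graph`, the labels, and `is_directed` on to `modularity`, leaving the matrix sparse throughout.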
This adds two functions to compute modularity scores from a given graph and a clustering, such as one produced by Leiden or Louvain. The goal is to make it easier to compare different community detection methods using an external metric. This follows up on issue #2908. To my knowledge, there is currently no built-in way to compare clustering results or to calculate a modularity score.
Functions: