(chore): derive CI matrix from hatch env #3607

Open: wants to merge 11 commits into base: main

Conversation

flying-sheep
Member

@flying-sheep flying-sheep commented Apr 22, 2025

See https://github.com/scverse/cookiecutter-scverse/blob/d4bdfc5ec4c5029aebf7c0cba65609e20358144d/%7B%7Bcookiecutter.project_name%7D%7D/.github/workflows/test.yaml

This also updates the hatch env matrix to have the following jobs:

  • stable (formerly full): run pretty much all tests except external tools
  • pre (unchanged): run tests with pre-release versions
  • min-vers (formerly min): run tests with the minimum possible versions
  • min-deps (new): like in anndata, tests if everything imports OK with no optional dependencies installed
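
As a rough illustration of what "deriving the CI matrix from the hatch env" could look like, here is a minimal Python sketch. It assumes the environments are available as a JSON mapping from environment name to configuration (for example, as printed by `hatch env show --json`). The sample environment names, the `python` key, and the `ci_matrix` helper are hypothetical and are not taken from this PR or the linked workflow.

```python
import json

# Hypothetical sample of a JSON mapping of environment names to their config.
SAMPLE_ENVS = json.loads("""
{
  "default": {"type": "virtual"},
  "hatch-test.py3.13-stable": {"type": "virtual", "python": "3.13"},
  "hatch-test.py3.13-pre": {"type": "virtual", "python": "3.13"},
  "hatch-test.py3.11-min-vers": {"type": "virtual", "python": "3.11"}
}
""")


def ci_matrix(envs: dict, prefix: str = "hatch-test") -> list[dict]:
    """Keep only the test environments and return one matrix entry per env."""
    matrix = []
    for name, config in sorted(envs.items()):
        if not name.startswith(prefix):
            continue  # skip non-test environments such as "default"
        matrix.append({"name": name, "python": config.get("python")})
    return matrix


# Print a compact JSON string that a CI workflow could consume as its matrix.
print(json.dumps(ci_matrix(SAMPLE_ENVS)))
```

The point of such a design is that the list of test jobs lives in one place (the hatch configuration), so local runs and CI runs cannot drift apart.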

The only regression is that we no longer test scanorama, because it depends on annoy, which hasn’t been updated in years and fails to build on my system.

We should probably think of a better strategy for it, but I think being able to run locally the same maximal set of tests that we run in CI is a good idea.

We also need to check if codecov still works; I’m typing this while the tests are still running.

codecov bot commented Apr 24, 2025

❌ 1 Tests Failed:

Tests completed   Failed   Passed   Skipped
2141              1        2140     97
View the top 1 failed test(s) by shortest run time
tests/test_highly_variable_genes.py::test_compare_to_upstream[dask_array_sparse-seurat-fgd]
Stack Traces | 0.111s run time
    @pytest.mark.parametrize("func", ["hvg", "fgd"])
    @pytest.mark.parametrize(
        ("flavor", "params", "ref_path"),
        [
            pytest.param(
                "seurat", dict(min_mean=0.0125, max_mean=3, min_disp=0.5), FILE, id="seurat"
            ),
            pytest.param(
                "cell_ranger", dict(n_top_genes=100), FILE_CELL_RANGER, id="cell_ranger"
            ),
        ],
    )
    @pytest.mark.parametrize("array_type", ARRAY_TYPES)
    def test_compare_to_upstream(
        *,
        request: pytest.FixtureRequest,
        func: Literal["hvg", "fgd"],
        flavor: Literal["seurat", "cell_ranger"],
        params: dict[str, float | int],
        ref_path: Path,
        array_type: Callable,
    ):
        if func == "fgd" and flavor == "cell_ranger":
            reason = "The deprecated filter_genes_dispersion behaves differently with cell_ranger"
            request.applymarker(pytest.mark.xfail(reason=reason))
        hvg_info = pd.read_csv(ref_path)

        pbmc = pbmc68k_reduced()
        pbmc.X = pbmc.raw.X
        pbmc.X = array_type(pbmc.X)
        pbmc.var_names_make_unique()
        sc.pp.filter_cells(pbmc, min_counts=1)
        sc.pp.normalize_total(pbmc, target_sum=1e4)

        if func == "hvg":
            sc.pp.log1p(pbmc)
            sc.pp.highly_variable_genes(pbmc, flavor=flavor, **params, inplace=True)
        elif func == "fgd":
            sc.pp.filter_genes_dispersion(
                pbmc, flavor=flavor, **params, log=True, subset=False
            )
        else:
            raise AssertionError()

        np.testing.assert_array_equal(
            hvg_info["highly_variable"], pbmc.var["highly_variable"]
        )

        # (still) Not equal to tolerance rtol=2e-05, atol=2e-05
        # np.testing.assert_allclose(4, 3.9999, rtol=2e-05, atol=2e-05)
>       np.testing.assert_allclose(
            hvg_info["means"],
            pbmc.var["means"],
            rtol=2e-05,
            atol=2e-05,
        )
E       AssertionError: 
E       Not equal to tolerance rtol=2e-05, atol=2e-05
E       
E       Mismatched elements: 10 / 765 (1.31%)
E       Max absolute difference among violations: 0.01
E       Max relative difference among violations: 0.01
E        ACTUAL: array([1.88077 , 0.862207, 2.789555, 3.51563 , 0.779662, 2.495507,
E              2.051983, 2.200089, 2.121406, 2.272991, 2.340023, 2.030907,
E              2.267801, 1.565244, 2.465999, 0.676449, 2.246484, 3.938324,...
E        DESIRED: array([1.88077 , 0.862207, 2.789555, 3.515629, 0.779662, 2.495507,
E              2.051983, 2.200089, 2.121406, 2.272991, 2.340023, 2.030907,
E              2.267801, 1.565244, 2.465999, 0.676449, 2.246485, 3.938324,...

tests/test_highly_variable_genes.py:406: AssertionError
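
To make the failure above easier to read: `np.testing.assert_allclose` accepts a pair of values when `|actual - desired| <= atol + rtol * |desired|`. The following is a minimal pure-Python sketch of that criterion for scalars. The helper name `within_tolerance` is hypothetical, and the `2.01` vs. `2.0` pair is an illustrative value chosen to match the reported maximum absolute difference of 0.01, not a value taken from the trace.

```python
def within_tolerance(
    actual: float, desired: float, rtol: float = 2e-05, atol: float = 2e-05
) -> bool:
    """Scalar version of the check used by numpy's assert_allclose."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# A last-digit difference like the one visible in the printed arrays passes:
print(within_tolerance(3.51563, 3.515629))

# An absolute difference of 0.01 is far outside the allowed band and fails:
print(within_tolerance(2.01, 2.0))
```

With `rtol=atol=2e-05`, the allowed deviation for values around 2 to 4 is on the order of 1e-4, so the ten mismatched elements must differ by much more than rounding noise in the last printed digit.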

@flying-sheep flying-sheep self-assigned this May 15, 2025
@flying-sheep flying-sheep added this to the 1.11.2 milestone May 16, 2025
@flying-sheep flying-sheep marked this pull request as ready for review May 16, 2025 15:15
@flying-sheep flying-sheep requested a review from ilan-gold May 16, 2025 15:15
Contributor

@ilan-gold ilan-gold left a comment

Few comments/questions


[[envs.hatch-test.matrix]]
-deps = [ "stable", "full", "pre", "min" ]
+deps = [ "stable", "pre", "min-vers", "min-extras" ]
Contributor

I think this should be changed to minimal-dependencies because if I am not mistaken, that's what it's testing for. Right now it reads like "minimum version of the extras" but that's actually what min-vers is for.

harmony = [ "harmonypy" ] # Harmony dataset integration
scanorama = [ "scanorama" ] # Scanorama dataset integration
scrublet = [ "scikit-image>=0.20" ] # Doublet detection with automatic thresholds
louvain = [ "igraph", "louvain>=0.8.2" ] # Louvain community detection
Contributor

Are we allowed to bump minimum versions like this on the minor release?

Member Author

@flying-sheep flying-sheep May 17, 2025

We’re not, this is the minimum possible version that actually works. Anyone who tried running scanpy with e.g. louvain 0.8.0 would have run into an error. Also bumping patch versions in a patch version is fine, no? Just fixes.

This PR changes the min-deps script so it no longer tries to do fuzzy matching of the “patch” version (see below for why I’m using scare quotes). Until now, we added a .* to the end of the version, i.e. we had an actually functional version of what Ilia Kats tried in the anndata PR. His version didn’t work because it used ~= together with adding a .0, but it didn’t guard against doing that multiple times, therefore e.g. making thing>=1.5 into thing~=1.5.0.0.0.0 or so, which just means thing==1.5. (~= in Python is not a semver-aware operator like in JS; it just transforms ~=stuff.x into >=stuff.x, ==stuff.* no matter how many dots stuff has.)
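
To illustrate the `~=` behaviour described above, here is a small pure-Python sketch of how a compatible-release clause expands according to the Python version specifier rules (`~=V.N` is equivalent to `>=V.N, ==V.*`). The helper names are hypothetical, the sketch only handles plain dotted numeric versions, and `append_zero` is merely a stand-in for the unguarded ".0" appending mentioned above, not the actual script.

```python
def expand_compatible_release(version: str) -> str:
    """Expand the version of a `~=` clause into the equivalent pair of clauses."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("`~=` needs at least two release segments")
    prefix = ".".join(parts[:-1])  # everything except the last segment
    return f">={version}, =={prefix}.*"


def append_zero(version: str) -> str:
    """Hypothetical unguarded step: always append another `.0`."""
    return f"{version}.0"


version = "1.5"
print(expand_compatible_release(version))  # the clause for a plain `~=1.5`
for _ in range(3):
    # Each repeated application adds a segment and narrows the allowed range.
    version = append_zero(version)
    print(version, "->", expand_compatible_release(version))
```

The number of dots alone decides how wide the allowed range is, which is why appending `.0` more than once quietly turns a lower bound into something very close to an exact pin.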

I was always against this fuzzy matching: it meant

  1. that we can’t specify the actual minimum version we wanted (as said, louvain>=0.8 is a lie; only >=0.8.2 is functional)
  2. that we’re adding special magical semantics, based on the number of dots, to our version boundaries, which one would need to know to meaningfully change the boundaries

The only way fuzzy matching could work is if we maintained a list of dependencies that we know actually use semver and used that list. But if we did that in min-vers.py, we wouldn’t test whether our minimum version bounds are actually meaningful, so I don’t like that either.

The perfect solution would be:

  1. what I do in this PR
  2. we add a dependabot or so config that auto-bumps just the patch versions of known semver dependencies regularly

That’d mean our minimum version bounds are actually meaningful, but we’d still make sure to use the latest patches.
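
The "no fuzzy matching" idea can be sketched as turning each lower bound into an exact pin, so the minimum-versions job installs precisely the version the bound names. This is only a hedged illustration: `pin_to_minimum` and its regular expression are hypothetical and deliberately simple, and this is not the script from this PR.

```python
import re

# Name (with optional extras), then everything up to an environment marker.
_REQUIREMENT = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9._-]+(?:\[[^\]]*\])?)\s*(?P<spec>[^;]*)"
)


def pin_to_minimum(requirement: str) -> str:
    """Turn `name>=X` into `name==X`; leave requirements without `>=` unpinned."""
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    name = match["name"]
    for clause in match["spec"].split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            return f"{name}=={clause[2:].strip()}"
    return name  # no lower bound given, nothing to pin


for req in ["louvain>=0.8.2", "scikit-image>=0.20", "igraph"]:
    print(req, "->", pin_to_minimum(req))
```

With exact pins, a bound such as `louvain>=0.8` would be tested at exactly 0.8, so a non-functional minimum shows up as a test failure instead of being hidden by a later patch release.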

Contributor

> Anyone who tried running scanpy with e.g. louvain 0.8.0 would have run into an error. Also bumping patch versions in a patch version is fine, no? Just fixes.

In this case, it's a minor version, but if this is effectively a no-change because louvain was non-functional, then we're fine. So since it was a minor version change (previously 0.6.0), why was that not working? Was it not properly tested because of fuzzy matching? I don't think the two are related, so it would be good to understand this.

> therefore e.g. making thing>=1.5 into thing~=1.5.0.0.0.0 or so, which just means thing==1.5

I noticed this as well once you brought up the potential pitfalls of what he did, thanks for that.

> we add a dependabot

Neutral on this, but would support if we add some sort of auto-merge strategy and/or automatically merge if tests pass. With JS projects I'm always hesitant to merge given the complexity of literal user-facing graphical applications (i.e., button size changes or something).

"scanpy[magic]",
"scanpy[skmisc]",
"scanpy[harmony]",
"scanpy[scanorama]",
Contributor

I read the description of this PR:

> The only regression is that we no longer test scanorama, because it depends on annoy, which hasn’t been updated in years and fails to build on my system.

Presumably this is coming from spotify/annoy#680. Can we put a condition on this for 3.13? Separately, I would be curious how this has been working the past few months at all. I semi-recently downloaded the package with no issue. At first glance, for example, https://github.com/scverse/scanpy/actions/runs/15063089888/job/42341857082 has annoy.

Contributor

I'm not saying I actually know what is going on on your system but it seems like perhaps we could use https://packaging.python.org/en/latest/specifications/dependency-specifiers/#dependency-specifiers as a middle ground between completely removing and leaving something in that breaks your system.
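
As a sketch of that middle ground, a dependency specifier can carry an environment marker so the requirement only applies on some Python versions. The requirement string below is a hypothetical example of such a marker for the "condition for 3.13" idea raised above, and `marker_applies` is a hypothetical helper that only mimics the comparison for `(major, minor)` tuples. A real installer evaluates the marker itself.

```python
import sys

# Hypothetical requirement: only install scanorama on Python older than 3.13.
SCANORAMA_REQUIREMENT = 'scanorama; python_version < "3.13"'


def marker_applies(
    python_version: tuple[int, int], upper_bound: tuple[int, int] = (3, 13)
) -> bool:
    """Mimic `python_version < "3.13"` for a (major, minor) tuple."""
    return python_version < upper_bound


# Show whether the marker would select the requirement on this interpreter.
print(SCANORAMA_REQUIREMENT)
print(marker_applies(sys.version_info[:2]))
```

This keeps the extra installable where it builds, while excluding it only for the interpreter versions where the native build is expected to fail.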

Member Author

@flying-sheep flying-sheep May 19, 2025

Yes, we should fix this before this PR gets merged.

scanorama→annoy is the only dependency that requires building a native package on my system, I think, and annoy doesn’t seem to be very actively maintained, that’s why I thought we should evaluate this a little.
