Error in utils.get_training_data when clusters_to_remove is not given #35

@krademaker

Hi Alexander,

Running cell2fate on a dataset when no clusters are provided for removal gives the following error:

>>> adata = sc.read_h5ad('.../PEI6404A1.h5ad')
>>> adata=c2f.utils.get_training_data(adata,cells_per_cluster = 10**5,cluster_column = cluster_column,min_shared_counts = 10,n_var_genes= 3000, remove_clusters=None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/env/lib/python3.9/site-packages/cell2fate/utils.py", line 297, in get_training_data
    adata = adata[[c not in remove_clusters for c in adata.obs[cluster_column]], :]
  File "/env/lib/python3.9/site-packages/cell2fate/utils.py", line 297, in <listcomp>
    adata = adata[[c not in remove_clusters for c in adata.obs[cluster_column]], :]
TypeError: argument of type 'NoneType' is not iterable

This is presumably because the line adata = adata[[c not in remove_clusters for c in adata.obs[cluster_column]], :] in utils.get_training_data tests each value of adata.obs[cluster_column] for membership in remove_clusters, which is None here.

Wouldn't it be more sensible either to handle the None case when no clusters have to be removed, or to set the default remove_clusters in this function to an empty list?

>>> clusters_to_remove = []
>>> adata=c2f.utils.get_training_data(adata,cells_per_cluster = 10**5,cluster_column = cluster_column,min_shared_counts = 10,n_var_genes= 3000, remove_clusters = clusters_to_remove)
Keeping at most 100000 cells per cluster
Filtered out 4114 genes that are detected 10 counts (shared).
Extracted 3000 highly variable genes.

This works fine, for instance.
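
As a rough illustration of the first option (handling None inside the function), here is a minimal sketch. It is not the actual cell2fate code: the helper name cluster_keep_mask and the example cluster labels are hypothetical, and a plain list stands in for adata.obs[cluster_column].

```python
from typing import Iterable, List, Optional


def cluster_keep_mask(
    labels: Iterable[str],
    remove_clusters: Optional[Iterable[str]] = None,
) -> List[bool]:
    """Return True for every label that should be kept.

    Hypothetical helper, not part of cell2fate. It mirrors the list
    comprehension from the traceback, but treats None as "remove nothing".
    """
    # Normalise None to an empty set so the membership test never sees None.
    to_remove = set() if remove_clusters is None else set(remove_clusters)
    return [c not in to_remove for c in labels]


# Made-up cluster labels standing in for adata.obs[cluster_column].
labels = ["HSC", "Ery", "HSC", "Mono"]

mask_none = cluster_keep_mask(labels, remove_clusters=None)
mask_some = cluster_keep_mask(labels, remove_clusters=["Ery"])
```

With a mask like this, the subsetting step would stay as adata[mask, :]. Using None as the default and normalising it inside the function also avoids a mutable default argument such as remove_clusters=[].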
