simulation_single option with pert_data.prepare_split gives ValueError #72

@murthy1770

pert_data.prepare_split(split = 'simulation_single', seed=1) # get data split with seed
pert_data.get_dataloader(batch_size = 32, test_batch_size = 128) # prepare data loader

This raises a ValueError (traceback below). I am using a custom dataset that contains only single-gene perturbations (CROP-Seq). What is the difference between the simulation and simulation_single splits? For a dataset with only single-gene perturbations, which one should be used?
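To back up the "single gene perturbations only" statement, the condition labels can be tallied before splitting. The following is a minimal sketch, not part of GEARS. It assumes the labels (for example, the values of adata.obs['condition']) follow the convention "ctrl" for controls, "GENE+ctrl" for single perturbations, and "GENE1+GENE2" for combinations. The helper name classify_conditions and the gene names are hypothetical.

```python
def classify_conditions(conditions):
    """Count unique condition labels by type.

    Assumed label convention (not verified against this dataset):
    'ctrl' = control, 'GENE+ctrl' = single, 'GENE1+GENE2' = combination.
    """
    counts = {"control": 0, "single": 0, "combo": 0}
    for cond in set(conditions):
        # Drop the 'ctrl' placeholder and count the remaining gene names.
        genes = [g for g in cond.split("+") if g != "ctrl"]
        if len(genes) == 0:
            counts["control"] += 1
        elif len(genes) == 1:
            counts["single"] += 1
        else:
            counts["combo"] += 1
    return counts


# Hypothetical labels standing in for list(adata.obs['condition']).
example = ["ctrl", "KLF1+ctrl", "ctrl+MAP2K6", "KLF1+ctrl", "CEBPE+ctrl"]
summary = classify_conditions(example)
print(summary)
```

If the "combo" count is zero, the dataset contains only controls and single perturbations under the assumed convention.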


ValueError Traceback (most recent call last)
Cell In[6], line 1
----> 1 pert_data.prepare_split(split = 'simulation_single', seed=1) # get data split with seed
2 pert_data.get_dataloader(batch_size = 32, test_batch_size = 128) # prepare data loader

File ~/.conda/envs/biomodels/lib/python3.12/site-packages/gears/pertdata.py:355, in PertData.prepare_split(self, split, seed, train_gene_set_size, combo_seen2_train_frac, combo_single_split_test_set_fraction, test_perts, only_test_set_perts, test_pert_genes, split_dict_path)
351 if split in ['simulation', 'simulation_single']:
352 # simulation split
353 DS = DataSplitter(self.adata, split_type=split)
--> 355 adata, subgroup = DS.split_data(train_gene_set_size = train_gene_set_size,
356 combo_seen2_train_frac = combo_seen2_train_frac,
357 seed=seed,
358 test_perts = test_perts,
359 only_test_set_perts = only_test_set_perts
360 )
361 subgroup_path = split_path[:-4] + '_subgroup.pkl'
362 pickle.dump(subgroup, open(subgroup_path, "wb"))

ValueError: too many values to unpack (expected 2)
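One reading of the traceback is that, for this split type, the call on line 355 yields more than two values while the assignment expects exactly two (adata, subgroup). The sketch below reproduces that class of error in isolation. It is not GEARS code: split_returning_three is a hypothetical stand-in, and whether DS.split_data actually returns three values here is an assumption that should be checked against the installed version.

```python
def split_returning_three():
    # Hypothetical stand-in for a call that returns three values.
    return "adata", "subgroup", "extra"


def try_unpack_two():
    """Unpack three returned values into two names and capture the error text."""
    try:
        adata, subgroup = split_returning_three()
    except ValueError as err:
        return str(err)
    return None


message = try_unpack_two()
print(message)  # too many values to unpack (expected 2)
```

Printing the length and type of the value returned by the real call (without unpacking it) would confirm or rule out this explanation.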
