Training with large system size is hard #13

Open
DesmondYuan opened this issue Feb 10, 2021 · 6 comments
Labels: question (Further information is requested), ready to close

Comments

@DesmondYuan
Collaborator

[Essential] System size vs. data size relationship analysis

  • Why training is hard at large system sizes

    • the inference problem itself may simply be too hard
    • too many or too few time points per trajectory
    • the limit may only be at the config level, so better fine-tuning can help
  • Possible solutions (a sketch of these as config overrides follows this list)

    • increase sparsity (fewer parameters)
    • add more information (longer ntotal)
    • try a different network structure
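For concreteness, here is a minimal sketch of how these remedies might be expressed as config overrides. The keys mirror the YAML configs posted below; the file names and the override values are illustrative assumptions, not recommendations from the repo.

```python
import yaml

# Hypothetical base config path; keys mirror the configs posted in this thread.
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

# 1) Increase sparsity (fewer effective parameters) via a stronger weight penalty.
cfg["weight_decay"] = 1e-4            # illustrative value only

# 2) Add more information per trajectory by sampling more time points.
cfg["ntotal"] = 2 * cfg["ntotal"]     # e.g. 20 -> 40 points on [0, tfinal]

# 3) Try a different network structure / topology file.
cfg["network"] = "beeline_networks/Synthetic_LI.csv"

with open("config_override.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```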
DesmondYuan added the question (Further information is requested) label on Feb 10, 2021
@DesmondYuan
Collaborator Author

Adding tests for Beeline networks
47e8652 and a2b0c0b

Pratapa, A., Jalihal, A. P., Law, J. N., Bharadwaj, A. & Murali, T. M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods 17, 147–154 (2020). https://www.nature.com/articles/s41592-019-0690-6

@jiweiqi
Owner

jiweiqi commented Feb 18, 2021

A good demo config for the LI network:

network: "beeline_networks/Synthetic_LI.csv"
ns: 7

tfinal: 20.0
ntotal: 20
batch_size: 16
epoch_size: -1

lr: 1.e-3
lr_new: -1  # use -1 otherwise
weight_decay: 1.e-5

n_mu: 3

n_exp_train: 20
n_exp_val: 5
n_exp_test: 5
noise: 0.01

n_iter_max: 100000
n_plot: 20 # frequency of callback

n_iter_buffer: 5000
n_iter_burnin: 100
n_iter_tol: 10000
convergence_tol: 1e-8

drop_range:
   lb: -0.1
   ub: 0.1
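To make the data budget in this demo concrete, here is a small sketch that computes the time grid and observation counts from the config values above. It assumes each trajectory is sampled at ntotal uniform points on [0, tfinal] and that the config has been saved locally; both are assumptions, and the repo's actual sampling scheme may differ.

```python
import numpy as np
import yaml

# Assumes the demo config above has been saved locally as "config_li.yaml".
with open("config_li.yaml") as f:
    cfg = yaml.safe_load(f)

# Assumed: each trajectory is sampled at ntotal uniform points on [0, tfinal].
t_grid = np.linspace(0.0, cfg["tfinal"], cfg["ntotal"])  # 20 points on [0, 20]

n_exp = cfg["n_exp_train"] + cfg["n_exp_val"] + cfg["n_exp_test"]   # 20 + 5 + 5 = 30
n_obs_train = cfg["n_exp_train"] * cfg["ntotal"] * cfg["ns"]        # 20 * 20 * 7 = 2800

print(f"time grid: {t_grid[0]:.1f} ... {t_grid[-1]:.1f} ({len(t_grid)} points)")
print(f"{n_exp} experiments total, ~{n_obs_train} scalar training observations")
```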

@jiweiqi
Owner

jiweiqi commented Feb 18, 2021

The program takes about 7 minutes, at about 2.1 it/s.
[two figures]

One testing condition:
[figure]

@jiweiqi
Owner

jiweiqi commented Feb 18, 2021

A simple criterion for judging whether the data are sufficient is whether there is a large gap between training loss and validation loss.
If there is, we should increase the number of training conditions.
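A rough sketch of this check; the relative-gap threshold is an arbitrary illustration rather than a value used in the repo:

```python
def data_seems_sufficient(train_loss: float, val_loss: float,
                          rel_gap_tol: float = 0.5) -> bool:
    """Heuristic: data is deemed sufficient when the validation loss is not
    much larger than the training loss (relative gap below rel_gap_tol)."""
    rel_gap = (val_loss - train_loss) / max(abs(train_loss), 1e-12)
    return rel_gap < rel_gap_tol

# A large train/val gap suggests increasing n_exp_train (more conditions).
if not data_seems_sufficient(train_loss=1e-3, val_loss=1e-1):
    print("Large train/val gap: consider increasing the number of conditions.")
```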

@judyueshen
Collaborator

Another good training example for the LI network, with 10 training conditions and ntotal = 10:

is_restart: false
network: "beeline_networks/Synthetic_LI.csv"
ns: 7

tfinal: 10.0
ntotal: 10
batch_size: 8
epoch_size: -1

lr: 1.e-3
weight_decay: 1.e-6

n_mu: 3

n_exp_train: 10
n_exp_val: 5
n_exp_test: 5
noise: 0.01

n_iter_max: 10000
n_plot: 20 # frequency of callback

n_iter_buffer: 50
n_iter_burnin: 100
n_iter_tol: 500
convergence_tol: 1e-8

drop_range:
   lb: -0.1
   ub: 0.1

[figures: loss_grad, p_inference_iter2645, i_exp_1]
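For reference, a small sketch that diffs the two configs posted in this thread to highlight what changed between them (values transcribed from the comments above):

```python
# Values transcribed from the two configs posted in this thread.
cfg_first = {"tfinal": 20.0, "ntotal": 20, "batch_size": 16, "weight_decay": 1e-5,
             "n_exp_train": 20, "n_iter_max": 100000, "n_iter_buffer": 5000,
             "n_iter_tol": 10000}
cfg_second = {"tfinal": 10.0, "ntotal": 10, "batch_size": 8, "weight_decay": 1e-6,
              "n_exp_train": 10, "n_iter_max": 10000, "n_iter_buffer": 50,
              "n_iter_tol": 500}

for key in cfg_first:
    if cfg_first[key] != cfg_second[key]:
        print(f"{key}: {cfg_first[key]} -> {cfg_second[key]}")
```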

@DesmondYuan
Collaborator Author

Curated model
