Conversation

@soulios (Contributor) commented Nov 6, 2024

  • Saving now writes only the weights (not the whole model) through callbacks, at an every-N-epochs frequency.
  • Fixed an issue with the history plots.
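
The weights-only, every-N-epochs checkpointing can be sketched as a small callback. This is a duck-typed sketch, not the dfpl code; in a real Keras setup this would subclass `tf.keras.callbacks.Callback`, and the class name is hypothetical:

```python
class PeriodicWeightsCheckpoint:
    """Save only the model's weights (not the full model) every `period` epochs."""

    def __init__(self, filepath, period=10):
        self.filepath = filepath
        self.period = period
        self.model = None  # in Keras, the framework sets this before training

    def on_epoch_end(self, epoch, logs=None):
        # Epochs are 0-indexed, so we save after epoch period-1, 2*period-1, ...
        if (epoch + 1) % self.period == 0:
            self.model.save_weights(self.filepath)
```

Note the consequence discussed below: with `period=10`, a run of fewer than 10 epochs never triggers a save, so no weights file is written at all.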

@bernt-matthias (Contributor)

Can you fix the error: dfpl: error: the following arguments are required: --test_path, --preds_path?

@soulios (Contributor, Author) commented Nov 13, 2024

> Can you fix the error: dfpl: error: the following arguments are required: --test_path, --preds_path?

Fixed it. Ready for review. @mai00fti @bernt-matthias

@tom-mohr (Contributor)

If epochs < 10, the output looks like this:

.
├── autoencoder
│   ├── autoencoder_weights.h5.history.csv
│   ├── autoencoder_weights.h5.history.svg
│   └── encoder_weights.h5
├── conda_activate.log
├── config.json
└── model
    ├── AR
    │   ├── history.csv
    │   └── history.svg
    └── train.log

If epochs >= 10, the output looks like this:

.
├── autoencoder
│   ├── autoencoder_weights.h5.history.csv
│   ├── autoencoder_weights.h5.history.svg
│   └── encoder_weights.h5
├── conda_activate.log
├── config.json
└── model
    ├── AR
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    ├── Aromatase
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    ├── ED
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    ├── ER
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    ├── GR
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    ├── PPARg
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    ├── single_label_random_model.evaluation.csv
    ├── TR
    │   ├── auc_data.png
    │   ├── history.csv
    │   ├── history.svg
    │   ├── model_weights.hdf5
    │   ├── predicted.testdata.aucdata.csv
    │   ├── predicted.testdata.csv
    │   └── predicted.testdata.prec_rec_f1.csv
    └── train.log

This is because in the first case, an error is thrown the first time an HDF5 weights file is opened for loading (the file was never written):

FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = '/tmp/tmpvbmwtqa8/job_working_directory/000/4/working/model/AR/model_weights.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

This originates from single_label_model.py:555:

    callback_model.load_weights(filepath=checkpoint_model_weights_path)
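
One way to make that load site robust is to check whether the checkpoint was ever written before restoring it. This is a hedged sketch; `load_checkpoint_if_present` is a hypothetical helper name, not dfpl code:

```python
import os


def load_checkpoint_if_present(model, checkpoint_path):
    """Restore checkpointed weights only if the checkpoint file exists.

    With a save period of 10 epochs, a run with fewer than 10 epochs never
    produces the file, so calling load_weights unconditionally raises
    FileNotFoundError; with this guard, the in-memory weights are kept instead.
    """
    if os.path.exists(checkpoint_path):
        model.load_weights(filepath=checkpoint_path)
        return True
    return False
```

The caller can then decide whether a missing checkpoint is acceptable (short runs) or an error (long runs where a checkpoint was expected).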

@soulios (Contributor, Author) commented Nov 14, 2024

> If epochs < 10, the output looks like this: […] This is because in the first case, an error is thrown when the first h5 file is created […] This originates from single_label_model.py:555.

Yes: the model is saved every 10 epochs, because saving every epoch would add a lot of time overhead. This is intended. But maybe we should restrict opts.epochs to values of at least 10; the same goes for the autoencoder, which is saved every 5 epochs, so opts.aeEpochs should accept a minimum of 5.
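
Enforcing such minimums at the parser level could look like the following sketch. The flag names `--epochs` and `--aeEpochs` come from the discussion; the defaults are illustrative, not dfpl's actual values:

```python
import argparse


def at_least(minimum):
    """argparse type factory that rejects integer values below `minimum`."""
    def check(text):
        value = int(text)
        if value < minimum:
            raise argparse.ArgumentTypeError(
                f"must be at least {minimum}, got {value}")
        return value
    return check


parser = argparse.ArgumentParser(prog="dfpl")
# Model weights are checkpointed every 10 epochs, the autoencoder every 5,
# so anything below these minimums would never produce a weights file.
parser.add_argument("--epochs", type=at_least(10), default=10)
parser.add_argument("--aeEpochs", type=at_least(5), default=5)
```

A value below the minimum makes argparse print an error and exit, which surfaces the constraint to the user instead of failing later with FileNotFoundError.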

@bernt-matthias (Contributor) commented Nov 14, 2024

> Yes, because the period in which the model is saved is every 10 epochs […] opts.aeEpochs should accept minimum 5 as a value.

Sounds reasonable.

@tom-mohr has already set the minimum for epochs in the Galaxy tool, so we do the same for aeEpochs. Should we also enforce this in the Python code, e.g. via argparse? But then the JSON input is again not covered.
Edit: Should the values be a multiple of 10 (resp. 5)?

Just thinking out loud (and not suggesting that we should do this in this PR): we could also pass the list of CLI arguments to parser.parse_args (e.g. parser.parse_args(['--foo', 'FOO'])). Could we generate such a list from the JSON content? Then all argument parsing would go through argparse.
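
That idea can be sketched briefly: flatten the JSON config into an argv-style list and hand it to argparse, so one code path validates both input styles. The helper name `json_to_argv` and the example flags are illustrative, not dfpl code:

```python
import argparse
import json


def json_to_argv(json_text):
    """Flatten a JSON config object into an argv-style list for argparse.

    {"epochs": 20, "verbose": true}  ->  ["--epochs", "20", "--verbose"]
    """
    argv = []
    for key, value in json.loads(json_text).items():
        if isinstance(value, bool):
            if value:  # store_true flags appear only when set
                argv.append(f"--{key}")
        else:
            argv.extend([f"--{key}", str(value)])
    return argv


parser = argparse.ArgumentParser(prog="dfpl")
parser.add_argument("--epochs", type=int)
parser.add_argument("--verbose", action="store_true")

opts = parser.parse_args(json_to_argv('{"epochs": 20, "verbose": true}'))
```

With this approach, any validation attached to the argparse arguments (types, minimums, required flags) applies to JSON-driven runs as well.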

@mai00fti merged commit e73d36c into yigbt:master on Nov 28, 2024 (2 checks passed).