Inconsistent matUtils extract pathing: nwks segfault, JSONs hang infinitely, pbs write to invalid paths (not a dupe of #211) #437

@aofarrel

matUtils extract handles outfile paths inconsistently. Per #211, -d / is required to get absolute paths working, but that doesn't seem to actually help in the case of -o (pb) or -j (JSON). There appears to be no way to get matUtils extract -o or matUtils extract -j to work with absolute outfile paths.
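
One thing I have not tested above is pointing -d at the real target directory and passing only a bare filename to -o/-j/-t. Here is a minimal sketch of how a pipeline could split an absolute outfile path that way. `extract_args` is a hypothetical helper name, and it only prints the arguments, so whether matUtils actually accepts this combination is an assumption, not something verified in this issue.

```shell
# Hypothetical helper: turn "<flag> <absolute outfile>" into
# "-d <directory> <flag> <filename>" so the directory goes to -d
# and the output flag only receives a bare filename.
# It only prints the arguments; it does not run matUtils.
extract_args() {
  flag="$1"
  path="$2"
  dir=$(dirname -- "$path")    # e.g. /absolute/path
  file=$(basename -- "$path")  # e.g. output.pb
  printf '%s %s %s %s\n' -d "$dir" "$flag" "$file"
}

extract_args -o /absolute/path/output.pb
extract_args -j /absolute/path/output.json
```

The printed string is only for inspection. Paths containing spaces would need the pieces passed as separate quoted arguments rather than re-parsed from this output.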

Observed behavior

| command | behavior | expected? |
| --- | --- | --- |
| matUtils extract -i /absolute/path/input.pb -d / -o /absolute/path/output.pb | prints saved to ///absolute/path/output.pb, does not throw error, file not written because path is invalid | no |
| matUtils extract -i /absolute/path/input.pb -d / -j /absolute/path/output.json | infinitely hangs | no |
| matUtils extract -i /absolute/path/input.pb -d / -t /absolute/path/output.nwk | silently writes to /absolute/path/output.nwk | yes (it is inconsistent the command doesn't print the output path though) |
| matUtils extract -i /absolute/path/input.pb -o /absolute/path/output.pb | prints saved to /workdir//absolute/path/output.pb, does not throw error, file not written because path invalid | documented in #211 but should throw an error |
| matUtils extract -i /absolute/path/input.pb -j /absolute/path/output.json | infinitely hangs | no; user is still doing it wrong, but an infinite hang is inconsistent, racks up cloud costs, and not documented in #211 |
| matUtils extract -i /absolute/path/input.pb -t /absolute/path/output.nwk | segfaults | documented in #211 but should have a better error |
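
The two paths printed by -o are consistent with the output directory, a slash, and the outfile argument being glued together without checking whether the outfile is already absolute. That is my guess from the printed messages only; I have not confirmed it in the matUtils source. `naive_join` below is a hypothetical stand-in for that behavior, not matUtils code.

```shell
# Assumed (unconfirmed) join rule: <output directory> + "/" + <outfile argument>.
naive_join() {
  printf '%s/%s\n' "$1" "$2"
}

# With -d / the directory is "/":
naive_join / /absolute/path/output.pb          # ///absolute/path/output.pb

# Without -d the directory is the working directory:
naive_join /workdir /absolute/path/output.pb   # /workdir//absolute/path/output.pb
```

Both results match the "saved to" messages in the table, which is why I suspect the outfile is never treated as a standalone path.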

Reproducing this error

The .pb I'm using is the .pb on https://usher-wiki.readthedocs.io/en/latest/matUtils.html#introduce

I'm running on macOS within a Docker image I created (ashedpotatoes/ashedpotatoes/usher-plus:0.6.6_rev13) which contains matUtils (v0.6.6). The Docker image and the host machine are both x86, so this isn't ARM shenanigans. This Docker image is used in a production pipeline and runs other matUtils commands such as introduce with no issue, so this is unlikely to be an installation issue.

Specific examples

In the examples below, the workdir is /HOME/ash and the Docker volume is bound to /dvol/ within the container. I also tested this with absolute paths inside and outside the bind volume, such as /HOME/ash/foo/, so this behavior is unlikely to be Docker file permission shenanigans.

JSON example (control-C'd after ~20 minutes)

> matUtils extract -i /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.pb -j /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.json
Loading input MAT file /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.pb.
Completed in 3627 msec 

Checking for and applying sample selection arguments
Completed in 0 msec 

No sample selection arguments passed; using full input tree for further output.
Completed in 158 msec 

Generating JSON of final tree
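
Until the hang is fixed, a pipeline can at least cap the damage by running the command under a time limit. This is a minimal sketch that assumes GNU coreutils `timeout` is available in the container. `run_with_limit` is a hypothetical wrapper name, and the 30m limit in the usage comment is an arbitrary example.

```shell
# Hypothetical wrapper: run a command under a time limit so a hang fails
# fast instead of running (and billing) indefinitely.
# GNU timeout exits with status 124 when the limit is reached.
run_with_limit() {
  limit="$1"
  shift
  timeout "$limit" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "command timed out after $limit: $*" >&2
  fi
  return "$status"
}

# Usage (not executed here):
# run_with_limit 30m matUtils extract -i /absolute/path/input.pb -j /absolute/path/output.json
```

This does not fix anything in matUtils; it only turns an infinite hang into a non-zero exit status that a workflow engine can catch.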

nwk segfault example:

> matUtils extract -i /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.pb -t /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.nwk
Loading input MAT file /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.pb.
Completed in 3509 msec 

Checking for and applying sample selection arguments
Completed in 0 msec 

No sample selection arguments passed; using full input tree for further output.
Completed in 143 msec 

Generating Newick file of final tree
Segmentation fault

pb to pb (no error but file doesn't exist):

> matUtils extract -i /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.pb -o /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin_copy.pb
Loading input MAT file /dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin.pb.
Completed in 3573 msec 

Checking for and applying sample selection arguments
Completed in 0 msec 

No sample selection arguments passed; using full input tree for further output.
Completed in 142 msec 

Saving output MAT file /HOME/ash//dvol/heck/public-2021-06-09.all.masked.nextclade.pangolin_copy.pb.
Completed in 5572 msec 
