Deal with accessions with non-existing files #139
Comments
If you just want to ignore the errors, you can create a local Nextflow configuration:

    process {
        withName: SRA_IDS_TO_RUNINFO {
            errorStrategy = 'ignore'
        }
    }
Did this solution work for you, @bmlab-sg? We could try to make the pipeline ignore these sorts of IDs, but we would need some way to detect them via the metadata or otherwise.
@drpatelh - yes, that solution mostly solves this issue.
Cool. Thanks for the update. We can check whether these metadata fields are exposed, so that we can add conditional filtering to the pipeline and it doesn't hard fail in these scenarios.
I am unable to reproduce this issue anymore. This could be due to the recent changes to the ENA API, as fixed in #148. I am now getting:
Hello @drpatelh,

    ERROR ~ Error executing process > 'NFCORE_FETCHNGS:SRA:SRA_IDS_TO_RUNINFO (SRR29688955)'
    Caused by:
    Command executed:
      echo SRR29688955 > id.txt
      cat <<-END_VERSIONS > versions.yml
    Command exit status:
    Command output:
    Command error:
I have encountered the same issue with the dataset PRJNA898600. I also tried running just one sample from the project, using different identifiers (SRR22198886, SRS15675991, SRX18177158), and tried both ftp and sratools for --download_method.
The only variation is that with the ftp method the SRA_IDS_TO_RUNINFO process technically completes and the pipeline fails at SRA_RUNINFO_TO_FTP instead. The underlying problem still seems to be that the dataset is not found: when exploring the "/work/" directory, the .runinfo.tsv is empty regardless of how I run the pipeline.
If it's helpful, here is the slight variation that I get with the FTP download method:
This is with nf-core/fetchngs v1.12.0 and Nextflow version 24.04.3.
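For anyone wanting to check which tasks left an empty run-info file behind, as described in the comment above, here is a rough Python sketch. It is not part of the pipeline. It assumes the files end in `.runinfo.tsv` and that "empty" means zero bytes or a header line only; the function name `empty_runinfo_files` is made up for illustration.

```python
from pathlib import Path


def empty_runinfo_files(work_dir):
    """Return *.runinfo.tsv files under work_dir that contain no data rows."""
    empty = []
    for path in sorted(Path(work_dir).rglob("*.runinfo.tsv")):
        text = path.read_text(encoding="utf-8", errors="replace")
        lines = [line for line in text.splitlines() if line.strip()]
        # Assumption: the first non-blank line, if present, is a header,
        # so a file with at most one non-blank line has no data rows.
        if len(lines) <= 1:
            empty.append(path)
    return empty


if __name__ == "__main__":
    # "work" is the default Nextflow work directory name; adjust as needed.
    for path in empty_runinfo_files("work"):
        print(path)
```

Each path that is printed points at a task directory, which is where the task's own logs can be inspected.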
Description of feature
Hi,
In SRA some of the run accessions have no associated files.
For example, BioProject PRJEB18755 has several runs that are total ghosts: ERR2013571, ERR2013572, ERR2013573, ..., while others are fine.
When these ghost accessions are provided in the input, the pipeline will first retry:
and then terminate with errors:
Of course, one thing that can be done is to filter out these entries before feeding them to the pipeline, but it would be great if these errors could be ignored.
Or maybe there is an option like that already that I am missing?
Thanks for any info on that, it will be extremely helpful to be able to easily deal with it!
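The pre-filtering workaround mentioned above could look roughly like the Python sketch below. It is not something the pipeline provides. It assumes the ENA Portal API `filereport` endpoint with the `fastq_ftp` and `submitted_ftp` fields, and it treats a run as a ghost when neither field lists a file. The helper names (`fetch_filereport`, `has_files`, `split_accessions`) are hypothetical, and HTTP errors are simply left to propagate.

```python
import csv
import io
import urllib.parse
import urllib.request

# Assumption: ENA Portal API file report endpoint; not taken from this thread.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def fetch_filereport(accession):
    """Download the TSV file report for one accession (needs network access)."""
    query = urllib.parse.urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,submitted_ftp",
    })
    with urllib.request.urlopen(f"{ENA_FILEREPORT}?{query}", timeout=30) as resp:
        return resp.read().decode("utf-8")


def has_files(report_tsv):
    """Return True if any row of the report lists at least one file URL."""
    rows = csv.DictReader(io.StringIO(report_tsv), delimiter="\t")
    return any(
        (row.get("fastq_ftp") or "").strip()
        or (row.get("submitted_ftp") or "").strip()
        for row in rows
    )


def split_accessions(accessions, fetch=fetch_filereport):
    """Split accessions into (with files, ghosts), preserving input order."""
    keep, ghosts = [], []
    for accession in accessions:
        target = keep if has_files(fetch(accession)) else ghosts
        target.append(accession)
    return keep, ghosts


if __name__ == "__main__":
    keep, ghosts = split_accessions(["ERR2013571", "ERR2013572", "ERR2013573"])
    print("keep:", keep)
    print("ghosts:", ghosts)
```

The `keep` list can then be written one ID per line to the file passed to the pipeline as input. The `fetch` argument is injectable so the filtering logic can be tried offline with canned reports.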