Running into out of memory issues when working with a large number of files. #4711
Bug report
When a large number of files (>5000) need to be moved or copied using publishDir, Nextflow fails with the following error:
From what I can see, the files are created in the workDir, and the failure only happens while the publishDir directive is being applied. I have already tried changing different parameters using NXF_OPTS and _JAVA_OPTIONS, among others, but none of them seem to fix this issue.
Expected behavior and actual behavior
Expected behavior: output files are copied/moved to the publishDir.
Actual behavior: the run fails, with the error message detailed above printed in the log file.
Steps to reproduce the problem
To simulate the scenario, I created the following dummy script and dummy pipeline, which just creates 20,000 files and copies them to another directory using publishDir. I get the same error as above when running it. Dummy script and Nextflow pipeline below:
Program output
Environment
Replies: 3 comments 11 replies
How many CPUs and how much memory do you have? And what JVM settings have you tried?
I ran into this same thread exhaustion issue on Java 21 with a large number of output files.
Folks, we are going to disable virtual threads by default in the next edge release, since there are still some scaling problems. It looks like large runs with many published outputs need some queue limits, which virtual threads do not provide on their own. See #4995 for further discussion.
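To illustrate what a "queue limit" on top of per-task threads can look like, here is a minimal sketch that caps the number of in-flight tasks with a semaphore. This is not Nextflow's actual implementation; the class `BoundedPublisher`, the method `runBounded`, and the limit of 16 are all hypothetical names and values chosen for the example. It uses a cached thread pool so that it runs on older JDKs; on Java 21 the same method should work with `Executors.newVirtualThreadPerTaskExecutor()` passed in instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

public class BoundedPublisher {

    /**
     * Hypothetical helper: runs `work` taskCount times on the given executor,
     * allowing at most maxInFlight tasks to run at the same time.
     * Returns the peak number of concurrently running tasks that was observed.
     */
    static int runBounded(ExecutorService executor, int taskCount,
                          int maxInFlight, Runnable work) throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(taskCount);

        for (int i = 0; i < taskCount; i++) {
            // The submitter blocks here once maxInFlight tasks are active,
            // so tasks (and their threads) are never created all at once.
            permits.acquire();
            executor.execute(() -> {
                try {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    work.run();
                } finally {
                    // Decrement before releasing so the counter never exceeds the limit.
                    running.decrementAndGet();
                    permits.release();
                    done.countDown();
                }
            });
        }
        done.await(); // wait until every task has finished
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            // Stand-in for "publish one file": pause for roughly one millisecond.
            Runnable fakeCopy = () -> LockSupport.parkNanos(1_000_000L);
            int peak = runBounded(executor, 2_000, 16, fakeCopy);
            System.out.println("peak in-flight tasks: " + peak);
        } finally {
            executor.shutdown();
        }
    }
}
```

Acquiring the permit in the submitting thread, rather than inside the task, is what provides backpressure: the loop that would otherwise spawn 20k tasks at once stalls until a slot frees up.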
I see you are using 23.10 with Java 21, which enables virtual threads by default. So when the process publishes 20k files, it will create 20k virtual threads at once, since there is no limit. That said, virtual threads are lightweight, so you should be able to have millions of them easily.
In any case, it's worth trying without virtual threads:
export NXF_ENABLE_VIRTUAL_THREADS=false