DefaultCacheStore: misleading error + no escape hatch when launch dir is on NFS/Lustre/GPFS #6996
Description
Bug report
When Nextflow is launched from a directory on a network filesystem that does not support POSIX file locking (NFS, Lustre, GPFS, BeeGFS, …), opening the LevelDB cache database fails with:
```
ERROR ~ Can't open cache DB: /path/to/launchDir/.nextflow/cache/<uuid>/db
Nextflow needs to be executed in a shared file system that supports file locks.
Alternatively, you can run it in a local directory and specify the shared work
directory by using the "-w" command line option.
```
This error manifests in `DefaultCacheStore.openDb()` (`DefaultCacheStore.groovy`) when `Iq80DBFactory.open()` throws an exception that is not the known `"Unable to acquire lock"` variant.
Root Cause
`DefaultCacheStore` always places the LevelDB cache directory under `.nextflow/cache/` relative to the pipeline launch directory (`Const.appCacheDir`). On HPC clusters, the launch directory is typically on a shared filesystem (NFS, Lustre, etc.) that does not support the file locking mechanisms LevelDB requires.
Three concrete problems exist today:
1. The underlying exception is swallowed silently
The caught `Exception e` is attached as the cause of the thrown `IOException` but is never logged. Developers and users inspecting `.nextflow.log` cannot see what LevelDB actually reported, which makes diagnosis unnecessarily difficult.
```groovy
// current code – e is never logged
throw new IOException(msg, e)
```
2. No escape hatch when the launch directory cannot be moved
The only documented workaround is to run Nextflow from a local directory and use `-w` to redirect the work directory. However, `-w` only controls where task work directories are created — it does not move the `.nextflow/cache/` directory. Users who must launch from a shared path (e.g. a project directory enforced by HPC policies) have no supported way to redirect just the cache.
3. The error message is misleading
The message tells users to use `-w`, which does not actually solve the underlying problem (the cache DB is still on the shared filesystem). Users following this advice will hit the same error.
Proposed Fixes
Fix 1 – Log the root cause (diagnostic)
Log the caught exception at `DEBUG` level in the `else` branch so it always appears in `.nextflow.log`:
```groovy
log.debug "Failed to open LevelDB cache at path: $file -- cause: ${e.message}", e
```
Fix 2 – Add `NXF_CACHE_DIR` env-var support
Introduce a `resolveCacheBaseDir()` helper that checks the `NXF_CACHE_DIR` environment variable. When set, the variable overrides the default `.nextflow/` path, allowing users to redirect only the cache DB to a lock-capable local filesystem without moving the launch directory:
```bash
export NXF_CACHE_DIR=/tmp/nxf-cache-$USER
nextflow run pipeline.nf -w /projects/exome/work
```
Fix 3 – Rewrite the error message
Replace the current misleading message with one that:
- accurately names the problem (cache DB on a lock-incapable filesystem)
- lists both remedies: run from a local dir with `-w`, or set `NXF_CACHE_DIR`
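The lookup order proposed in Fix 2 can be sketched as follows. This is written in Java for illustration only (the actual change would live in the Groovy source); the helper name `resolveCacheBaseDir` comes from the proposal above, while the method signature and the `CacheDirResolver` class are hypothetical:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CacheDirResolver {

    /**
     * Resolve the base directory that holds the cache DB.
     * When NXF_CACHE_DIR is set, it overrides the default
     * ".nextflow" directory under the launch directory.
     */
    static Path resolveCacheBaseDir(String nxfCacheDirEnv, Path launchDir) {
        if (nxfCacheDirEnv != null && !nxfCacheDirEnv.isEmpty())
            return Paths.get(nxfCacheDirEnv);
        return launchDir.resolve(".nextflow");
    }

    public static void main(String[] args) {
        Path launch = Paths.get("/projects/exome");
        // default behaviour: cache stays under the launch directory
        System.out.println(resolveCacheBaseDir(null, launch));
        // with the env var set: cache is redirected to a local path
        System.out.println(resolveCacheBaseDir("/tmp/nxf-cache", launch));
    }
}
```

With this shape, only the cache DB moves to a lock-capable filesystem; the launch directory and the `-w` work directory are unaffected.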
Affected file
`modules/nextflow/src/main/groovy/nextflow/cache/DefaultCacheStore.groovy`
Environment
Reproducible whenever the pipeline is launched from a directory on NFS, Lustre, or GPFS. Common in HPC environments, where project directories typically live on a shared parallel filesystem.
Proposed implementation
A reference implementation of all three fixes is available at:
https://github.com/matthdsm/nextflow/tree/fix/cache-db-nfs-error