
Commit 5bfe1c1

This PR sunsets the Google LS executor
Signed-off-by: Paolo Di Tommaso <[email protected]>
1 parent 7405f51 commit 5bfe1c1

24 files changed: 6 additions, 4,217 deletions

docs/executor.md

Lines changed: 0 additions & 27 deletions
@@ -144,33 +144,6 @@ Resource requests and other job characteristics can be controlled via the follow
 
 See the {ref}`Google Cloud Batch <google-batch>` page for further configuration details.
 
-(google-lifesciences-executor)=
-
-## Google Life Sciences
-
-:::{versionadded} 20.01.0
-:::
-
-[Google Cloud Life Sciences](https://cloud.google.com/life-sciences) is a managed computing service that allows the execution of containerized workloads in the Google Cloud Platform infrastructure.
-
-Nextflow provides built-in support for the Life Sciences API, which allows the seamless deployment of a Nextflow pipeline in the cloud, offloading the process executions as pipelines.
-
-The pipeline processes must specify the Docker image to use by defining the `container` directive, either in the pipeline script or the `nextflow.config` file. Additionally, the pipeline work directory must be located in a Google Storage bucket.
-
-To enable this executor, set `process.executor = 'google-lifesciences'` in the `nextflow.config` file.
-
-Resource requests and other job characteristics can be controlled via the following process directives:
-
-- {ref}`process-accelerator`
-- {ref}`process-cpus`
-- {ref}`process-disk`
-- {ref}`process-machineType`
-- {ref}`process-memory`
-- {ref}`process-resourcelabels`
-- {ref}`process-time`
-
-See the {ref}`Google Life Sciences <google-lifesciences>` page for further configuration details.
-
 (htcondor-executor)=
 
 ## HTCondor
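For pipelines still configured with the executor removed above, the following is a minimal migration sketch based on the Google Batch settings that remain in these docs; the project ID, container, and bucket are placeholders:

```groovy
// nextflow.config — sketch only; placeholder values must be replaced.
process {
    executor  = 'google-batch'          // was: 'google-lifesciences'
    container = 'your/container:latest' // a container image is still required
}

google {
    project  = 'your-project-id'
    location = 'us-central1'            // Batch uses a location rather than a zone
}

// The work directory must still be a Google Storage bucket, e.g.:
//   nextflow run <script or project name> -work-dir gs://my-bucket/some/path
workDir = 'gs://my-bucket/work'
```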

docs/google.md

Lines changed: 3 additions & 153 deletions
@@ -4,7 +4,7 @@
 
 ## Credentials
 
-Credentials for submitting requests to the Google Cloud Batch and Cloud LifeSciences API are picked up from your environment using [Application Default Credentials](https://github.com/googleapis/google-auth-library-java#google-auth-library-oauth2-http). Application Default Credentials are designed to use the credentials most natural to the environment in which a tool runs.
+Credentials for submitting requests to the Google Cloud Batch API are picked up from your environment using [Application Default Credentials](https://github.com/googleapis/google-auth-library-java#google-auth-library-oauth2-http). Application Default Credentials are designed to use the credentials most natural to the environment in which a tool runs.
 
 The most common case will be to pick up your end-user Google credentials from your workstation. You can create these by running the command:
 
@@ -250,160 +250,23 @@ Currently, the following Nextflow directives are supported by the Google Batch e
 - {ref}`process-memory`
 - {ref}`process-time`
 
-(google-lifesciences)=
-
-## Cloud Life Sciences
-
-:::{versionadded} 20.01.0-edge
-:::
-
-:::{note}
-In versions of Nextflow prior to `21.04.0`, the following variables must be defined in your system environment:
-
-```bash
-export NXF_VER=20.01.0
-export NXF_MODE=google
-```
-:::
-
-[Cloud Life Sciences](https://cloud.google.com/life-sciences/) is a managed computing service that allows the execution of containerized workloads in the Google Cloud Platform infrastructure.
-
-Nextflow provides built-in support for Cloud Life Sciences, allowing the seamless deployment of Nextflow pipelines in the cloud, in which tasks are offloaded to the Cloud Life Sciences service.
-
-Read the {ref}`Google Life Sciences executor <google-lifesciences-executor>` page to learn about the `google-lifesciences` executor in Nextflow.
-
-:::{warning}
-This API works well for coarse-grained workloads (i.e. long-running jobs). It's not suggested the use this feature for pipelines spawning many short lived tasks.
-:::
-
-(google-lifesciences-config)=
-
-### Configuration
-
-Make sure to have defined in your environment the `GOOGLE_APPLICATION_CREDENTIALS` variable. See the section [Credentials](#credentials) for details.
-
-:::{tip}
-Make sure to enable the Cloud Life Sciences API beforehand. To learn how to enable it follow [this link](https://cloud.google.com/life-sciences/docs/quickstart).
-:::
-
-Create a `nextflow.config` file in the project root directory. The config must specify the following parameters:
-
-- Google Life Sciences as Nextflow executor
-- The Docker container image(s) for pipeline tasks
-- The Google Cloud project ID
-- The Google Cloud region or zone where the Compute Engine VMs will be executed.
-  You need to specify one or the other, *not* both. Multiple regions or zones can be specified as a comma-separated list, e.g. `google.zone = 'us-central1-f,us-central-1-b'`.
-
-Example:
-
-```groovy
-process {
-    executor = 'google-lifesciences'
-    container = 'your/container:latest'
-}
-
-google {
-    project = 'your-project-id'
-    zone = 'europe-west1-b'
-}
-```
-
-Notes:
-- A container image must be specified to execute processes. You can use a different Docker image for each process using one or more {ref}`config-process-selectors`.
-- Make sure to specify the project ID, not the project name.
-- Make sure to specify a location where Google Life Sciences is available. Refer to the [Google Cloud documentation](https://cloud.google.com/life-sciences/docs/concepts/locations) for details.
-
-Read the {ref}`Google configuration<config-google>` section to learn more about advanced configuration options.
-
-### Process definition
-
-Processes can be defined as usual and by default the `cpus` and `memory` directives are used to instantiate a custom machine type with the specified compute resources. If `memory` is not specified, 1GB of memory is allocated per cpu. A persistent disk will be created with size corresponding to the `disk` directive. If `disk` is not specified, the instance default is chosen to ensure reasonable I/O performance.
-
-The process `machineType` directive may optionally be used to specify a predefined Google Compute Platform [machine type](https://cloud.google.com/compute/docs/machine-types) If specified, this value overrides the `cpus` and `memory` directives. If the `cpus` and `memory` directives are used, the values must comply with the allowed custom machine type [specifications](https://cloud.google.com/compute/docs/instances/creating-instance-with-custom-machine-type#specifications) . Extended memory is not directly supported, however high memory or cpu predefined instances may be utilized using the `machineType` directive
-
-Examples:
-
-```nextflow
-process custom_resources_task {
-    cpus 8
-    memory '40 GB'
-    disk '200 GB'
-
-    script:
-    """
-    your_command --here
-    """
-}
-
-process predefined_resources_task {
-    machineType 'n1-highmem-8'
-
-    script:
-    """
-    your_command --here
-    """
-}
-```
-
-### Pipeline execution
-
-The pipeline can be launched either in a local computer or a cloud instance. Pipeline input data can be stored either locally or in a Google Storage bucket.
-
-The pipeline execution must specify a Google Storage bucket where the workflow's intermediate results are stored using the `-work-dir` command line options. For example:
-
-```bash
-nextflow run <script or project name> -work-dir gs://my-bucket/some/path
-```
-
-:::{tip}
-Any input data *not* stored in a Google Storage bucket will be automatically transferred to the pipeline work bucket. Use this feature with caution, being careful to avoid unnecessary data transfers.
-:::
-
-### Preemptible instances
-
-Preemptible instances are supported adding the following setting in the Nextflow config file:
-
-```groovy
-google {
-    lifeSciences.preemptible = true
-}
-```
-
-Since this type of virtual machines can be retired by the provider before the job completion, it is advisable to add the following retry strategy to your config file to instruct Nextflow to automatically re-execute a job if the virtual machine was terminated preemptively:
-
-```groovy
-process {
-    errorStrategy = { task.exitStatus==14 ? 'retry' : 'terminate' }
-    maxRetries = 5
-}
-```
-
-:::{note}
-Preemptible instances have a [runtime limit](https://cloud.google.com/compute/docs/instances/preemptible) of 24 hours.
-:::
-
-:::{tip}
-For an exhaustive list of error codes, refer to the official Google Life Sciences [documentation](https://cloud.google.com/life-sciences/docs/troubleshooting#error_codes).
-:::
-
 ### Hybrid execution
 
-Nextflow allows the use of multiple executors in the same workflow. This feature enables the deployment of hybrid workloads, in which some jobs are executed in the local computer or local computing cluster, and some jobs are offloaded to Google Cloud (either Google Batch or Google Life Sciences).
+Nextflow allows the use of multiple executors in the same workflow. This feature enables the deployment of hybrid workloads, in which some jobs are executed in the local computer or local computing cluster, and some jobs are offloaded to Google Cloud.
 
 To enable this feature, use one or more {ref}`config-process-selectors` in your Nextflow configuration file to apply the Google Cloud executor to the subset of processes that you want to offload. For example:
 
 ```groovy
 process {
     withLabel: bigTask {
-        executor = 'google-batch' // or 'google-lifesciences'
+        executor = 'google-batch'
         container = 'my/image:tag'
     }
 }
 
 google {
     project = 'your-project-id'
     location = 'us-central1' // for Google Batch
-    // zone = 'us-central1-a' // for Google Life Sciences
 }
 ```
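The removed section above covered preemptible VMs via `google.lifeSciences.preemptible` and a retry strategy on exit status 14. A rough Google Batch counterpart is sketched below; the `google.batch.spot` option and the exact retryable exit statuses are not part of this diff and should be verified against the current docs:

```groovy
// Sketch only: spot VMs with Google Batch in place of Life Sciences preemptible VMs.
// 'google.batch.spot' is an assumed option name — verify before use.
google {
    batch.spot = true
}

process {
    // Spot/preemptible VMs can be reclaimed before a task completes,
    // so retry failed tasks a few times rather than terminating the run.
    errorStrategy = 'retry'
    maxRetries    = 5
}
```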

@@ -427,16 +290,3 @@ Nextflow will automatically manage the transfer of input and output files betwee
 
 - Currently, it's not possible to specify a disk type different from the default one assigned by the service depending on the chosen instance type.
 
-### Troubleshooting
-
-- Make sure to enable the Compute Engine API, Life Sciences API and Cloud Storage API in the [APIs & Services Dashboard](https://console.cloud.google.com/apis/dashboard) page.
-
-- Make sure to have enough compute resources to run your pipeline in your project [Quotas](https://console.cloud.google.com/iam-admin/quotas) (i.e. Compute Engine CPUs, Compute Engine Persistent Disk, Compute Engine In-use IP addresses, etc).
-
-- Make sure your security credentials allow you to access any Google Storage bucket where input data and temporary files are stored.
-
-- When a job fails, you can check the `google/` directory in the task work directory (in the bucket storage), which contains useful information about the job execution. To enable the creation of this directory, set `google.lifeSciences.debug = true` in the Nextflow config.
-
-- You can enable the optional SSH daemon in the job VM by setting `google.lifeSciences.sshDaemon = true` in the Nextflow config.
-
-- Make sure you are choosing a `location` where the [Cloud Life Sciences API is available](https://cloud.google.com/life-sciences/docs/concepts/locations), and a `region` or `zone` where the [Compute Engine API is available](https://cloud.google.com/compute/docs/regions-zones/).

docs/reference/config.md

Lines changed: 1 addition & 47 deletions
@@ -816,7 +816,7 @@ The following settings are available:
 
 ## `google`
 
-The `google` scope allows you to configure the interactions with Google Cloud, including Google Cloud Batch, Google Life Sciences, and Google Cloud Storage.
+The `google` scope allows you to configure the interactions with Google Cloud, including Google Cloud Batch and Google Cloud Storage.
 
 Read the {ref}`google-page` page for more information.
 
@@ -956,52 +956,6 @@ The following settings are available for Cloud Life Sciences:
 `google.zone`
 : The Google Cloud zone where jobs are executed. Multiple zones can be provided as a comma-separated list. Cannot be used with the `google.region` option. See the [Google Cloud documentation](https://cloud.google.com/compute/docs/regions-zones/) for a list of available regions and zones.
 
-`google.lifeSciences.bootDiskSize`
-: Set the size of the virtual machine boot disk e.g `50.GB` (default: none).
-
-`google.lifeSciences.copyImage`
-: The container image run to copy input and output files. It must include the `gsutil` tool (default: `google/cloud-sdk:alpine`).
-
-`google.lifeSciences.cpuPlatform`
-: Set the minimum CPU Platform e.g. `'Intel Skylake'`. See [Specifying a minimum CPU Platform for VM instances](https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#specifications) (default: none).
-
-`google.lifeSciences.debug`
-: When `true` copies the `/google` debug directory in that task bucket directory (default: `false`).
-
-`google.lifeSciences.keepAliveOnFailure`
-: :::{versionadded} 21.06.0-edge
-  :::
-: When `true` and a task complete with an unexpected exit status the associated compute node is kept up for 1 hour. This options implies `sshDaemon=true` (default: `false`).
-
-`google.lifeSciences.network`
-: :::{versionadded} 21.03.0-edge
-  :::
-: Set network name to attach the VM's network interface to. The value will be prefixed with `global/networks/` unless it contains a `/`, in which case it is assumed to be a fully specified network resource URL. If unspecified, the global default network is used.
-
-`google.lifeSciences.preemptible`
-: When `true` enables the usage of *preemptible* virtual machines or `false` otherwise (default: `true`).
-
-`google.lifeSciences.serviceAccountEmail`
-: :::{versionadded} 20.05.0-edge
-  :::
-: Define the Google service account email to use for the pipeline execution. If not specified, the default Compute Engine service account for the project will be used.
-
-`google.lifeSciences.subnetwork`
-: :::{versionadded} 21.03.0-edge
-  :::
-: Define the name of the subnetwork to attach the instance to must be specified here, when the specified network is configured for custom subnet creation. The value is prefixed with `regions/subnetworks/` unless it contains a `/`, in which case it is assumed to be a fully specified subnetwork resource URL.
-
-`google.lifeSciences.sshDaemon`
-: When `true` runs SSH daemon in the VM carrying out the job to which it's possible to connect for debugging purposes (default: `false`).
-
-`google.lifeSciences.sshImage`
-: The container image used to run the SSH daemon (default: `gcr.io/cloud-genomics-pipelines/tools`).
-
-`google.lifeSciences.usePrivateAddress`
-: :::{versionadded} 20.03.0-edge
-  :::
-: When `true` the VM will NOT be provided with a public IP address, and only contain an internal IP. If this option is enabled, the associated job can only load docker images from Google Container Registry, and the job executable cannot use external services other than Google APIs (default: `false`).
-
 `google.storage.delayBetweenAttempts`
 : :::{versionadded} 21.06.0-edge
   :::
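Several of the removed `google.lifeSciences.*` options have rough `google.batch.*` counterparts. The sketch below is illustrative only: every option name is an assumption taken from the current `google` config scope, not from this diff, and should be checked against the config reference before use.

```groovy
// Approximate Google Batch equivalents for some removed Life Sciences settings (sketch).
// All names and values below are assumptions/placeholders — verify against the current docs.
google {
    batch.spot                = true            // ~ lifeSciences.preemptible
    batch.bootDiskSize        = '50.GB'         // ~ lifeSciences.bootDiskSize
    batch.network             = 'my-network'    // ~ lifeSciences.network
    batch.subnetwork          = 'my-subnetwork' // ~ lifeSciences.subnetwork
    batch.serviceAccountEmail = 'sa@your-project-id.iam.gserviceaccount.com' // ~ lifeSciences.serviceAccountEmail
    batch.usePrivateAddress   = true            // ~ lifeSciences.usePrivateAddress
}
```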

docs/reference/process.md

Lines changed: 2 additions & 4 deletions
@@ -520,7 +520,7 @@ process runThisWithDocker {
 ```
 
 :::{warning}
-This feature is not supported by the {ref}`k8s-executor` and {ref}`google-lifesciences-executor` executors.
+This feature is not supported by the {ref}`k8s-executor` executor.
 :::
 
 (process-cpus)=
@@ -682,7 +682,6 @@ The following executors are available:
 | `awsbatch` | [AWS Batch](https://aws.amazon.com/batch/) service |
 | `azurebatch` | [Azure Batch](https://azure.microsoft.com/en-us/services/batch/) service |
 | `condor` | [HTCondor](https://research.cs.wisc.edu/htcondor/) job scheduler |
-| `google-lifesciences` | [Google Genomics Pipelines](https://cloud.google.com/life-sciences) service |
 | `k8s` | [Kubernetes](https://kubernetes.io/) cluster |
 | `local` | The computer where `Nextflow` is launched |
 | `lsf` | [Platform LSF](http://en.wikipedia.org/wiki/Platform_LSF) job scheduler |
@@ -821,7 +820,7 @@ See also: [resourceLabels](#resourcelabels)
 :::{versionadded} 19.07.0
 :::
 
-The `machineType` can be used to specify a predefined Google Compute Platform [machine type](https://cloud.google.com/compute/docs/machine-types) when running using the {ref}`Google Batch <google-batch-executor>` or {ref}`Google Life Sciences <google-lifesciences-executor>` executor, or when using the autopools feature of the {ref}`Azure Batch executor<azurebatch-executor>`.
+The `machineType` can be used to specify a predefined Google Compute Platform [machine type](https://cloud.google.com/compute/docs/machine-types) when running using the {ref}`Google Batch <google-batch-executor>`, or when using the autopools feature of the {ref}`Azure Batch executor<azurebatch-executor>`.
 
 This directive is optional and if specified overrides the cpus and memory directives:
 
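The removed Life Sciences `machineType` example carries over largely unchanged to Google Batch; a minimal sketch (the executor, machine type, and command are placeholders):

```nextflow
// Predefined machine type with the Google Batch executor (sketch; placeholder values).
process predefined_resources_task {
    executor    'google-batch'
    machineType 'n1-highmem-8'

    script:
    """
    your_command --here
    """
}
```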

@@ -1379,7 +1378,6 @@ Resource labels are currently supported by the following executors:
 - {ref}`awsbatch-executor`
 - {ref}`azurebatch-executor`
 - {ref}`google-batch-executor`
-- {ref}`google-lifesciences-executor`
 - {ref}`k8s-executor`
 
 :::{versionadded} 23.09.0-edge

plugins/nf-google/build.gradle

Lines changed: 0 additions & 1 deletion
@@ -37,7 +37,6 @@ dependencies {
     compileOnly 'org.slf4j:slf4j-api:2.0.16'
     compileOnly 'org.pf4j:pf4j:3.12.0'
 
-    api 'com.google.apis:google-api-services-lifesciences:v2beta-rev20210527-1.31.5'
     api 'com.google.auth:google-auth-library-oauth2-http:0.18.0'
     api 'com.google.cloud:google-cloud-batch:0.53.0'
     api 'com.google.cloud:google-cloud-logging:3.20.6'
