Responds to a Google Cloud Storage (GCS) file upload by running a Data Loss Prevention (DLP) job on the uploaded file, and reports to Pub/Sub if the DLP job returns any findings. See detailed flow here.
- The workflow is triggered once a new object is uploaded to a GCS bucket.
- The workflow creates and runs a DLP job to inspect the uploaded object.
- The workflow periodically checks the status of the DLP job until the job is complete.
- Once the DLP job completes, the workflow inspects the job results.
- If the DLP job returned any findings, the workflow posts a message containing the job ID to Pub/Sub.
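The flow above can be sketched in Python as a create, poll, inspect, publish loop. This is only an illustration of the control flow, not the workflow definition itself: the callables `create_job`, `get_job`, and `publish` are hypothetical stand-ins for the deployed functions, and the job dictionary is assumed to follow the DLP REST `DlpJob` shape (`state`, `inspectDetails.result.infoTypeStats`).

```python
import time
from typing import Callable, Optional

# DLP job states after which polling can stop.
TERMINAL_STATES = {"DONE", "FAILED", "CANCELED"}


def count_findings(job: dict) -> int:
    """Sum the per-infoType counts in a DlpJob-shaped dict (assumed shape)."""
    stats = job.get("inspectDetails", {}).get("result", {}).get("infoTypeStats", [])
    # The REST API encodes int64 counts as strings, so convert explicitly.
    return sum(int(stat.get("count", 0)) for stat in stats)


def run_dlp_flow(
    create_job: Callable[[], str],      # hypothetical: starts the job, returns its name
    get_job: Callable[[str], dict],     # hypothetical: fetches the current job state
    publish: Callable[[str], None],     # hypothetical: posts the job ID to Pub/Sub
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Run the job, wait for completion, and publish the job ID if there are findings."""
    job_name = create_job()
    while True:
        job = get_job(job_name)
        if job.get("state") in TERMINAL_STATES:
            break
        sleep(poll_seconds)

    if job.get("state") != "DONE":
        raise RuntimeError(f"DLP job {job_name} ended in state {job.get('state')}")

    if count_findings(job) > 0:
        publish(job_name)
        return job_name
    return None
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without waiting between polls.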
- GCP project.
- Service account with the Cloud Functions Invoker, Pub/Sub Admin, and Workflows Admin roles.
- Pub/Sub topic (workflows-demo) and subscription.
- Deploy all functions in the 'functions' folder. Use the service account created in the Prerequisites section.
- Configure the 'trigger-dlp-workflow' function to be triggered by a file upload to GCS.
- In 'trigger-dlp-workflow.py', set the 'parent' value to point to your deployed dlp-gcs-workflow workflow.
- Modify 'dlp-gcs-file.yaml': replace the values in the 'initVariables' step to match your environment, and replace the function URLs with the URLs of your deployed functions.
- Deploy the dlp-gcs-workflow workflow using the dlp-gcs-workflow.yaml file. Use the service account created in the Prerequisites section.
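As a rough guide to what the 'parent' value looks like, a GCS-triggered function that starts the workflow might resemble the sketch below. It is not the repository's actual 'trigger-dlp-workflow.py': the project ID, location, and the argument keys `bucket` and `file` are placeholder assumptions, and the code assumes the `google-cloud-workflows` client library is installed.

```python
import json


def workflow_parent(project: str, location: str, workflow: str) -> str:
    """Build the fully qualified workflow resource name used as 'parent'."""
    return f"projects/{project}/locations/{location}/workflows/{workflow}"


def execution_argument(event: dict) -> str:
    """Serialize the uploaded object's bucket and name as the workflow input.

    The key names here are assumptions; match them to what your workflow expects.
    """
    return json.dumps({"bucket": event["bucket"], "file": event["name"]})


def trigger_dlp_workflow(event: dict, context) -> None:
    """Background function entry point for a GCS object upload event."""
    # Imported lazily so the helpers above can be used without the library.
    from google.cloud.workflows import executions_v1

    # Placeholder values: replace with your own project, region, and workflow name.
    parent = workflow_parent("my-project", "us-central1", "dlp-gcs-workflow")

    client = executions_v1.ExecutionsClient()
    execution = executions_v1.Execution(argument=execution_argument(event))
    response = client.create_execution(parent=parent, execution=execution)
    print(f"Started workflow execution: {response.name}")
```

Keeping the resource-name construction in a small helper makes it easy to check the 'parent' string before deploying the function.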