Description
Raised by @violetbrina
A couple of the submodules in the analysis runner assume that a config exists at import time, which they shouldn't do.
Discovered while writing unit tests for prod pipes.
Initial offenders found (but there could be more) are:
- analysis_runner/dataproc.py
    _config = get_config()
    ACCESS_LEVEL = _config['workflow']['access_level']
    DATASET = _config['workflow']['dataset']
    DATASET_GCP_PROJECT = _config['workflow']['dataset_gcp_project']
    GCLOUD_CONFIG_SET_PROJECT = f'gcloud config set project {DATASET_GCP_PROJECT}'
- analysis_runner/examples/cromwell_from_hail_batch.py
    _config = get_config()
    BILLING_PROJECT = _config['hail']['billing_project']
    DATASET = _config['workflow']['dataset']
    ACCESS_LEVEL = _config['workflow']['access_level']
These should be updated so that they do not pull values on import. The examples one is probably less important, but it's worth taking another look to see if any other modules do the same.
I propose a cached getenv() or equivalent function to pull those environment variables, rather than trying to load them on import.
^ That approach seems fine. Otherwise, I'd suggest just making it call a function to get those values when required. We'd also need to look for any places where, for example, BILLING_PROJECT is imported from analysis_runner.dataproc.