
Commit 35f88fd

Merge pull request #1986 from cmu-delphi/deployment-details
Docs on new signal deployment details
2 parents 3a6c411 + 01c9ebe commit 35f88fd

1 file changed: _template_python/INDICATOR_DEV_GUIDE.md (+144, -14 lines)
@@ -278,7 +278,7 @@ This example is taken from [`hhs_hosp`](https://github.com/cmu-delphi/covidcast-
The column is described [here](https://cmu-delphi.github.io/delphi-epidata/api/missing_codes.html).

#### Local testing

As a general rule, it helps to decompose your functions into operations for which you can write unit tests.
To run the tests, use `make test` in the top-level indicator directory.
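
For example, a small computation helper can be tested directly (a hypothetical sketch, not code from an existing indicator; in a real indicator the helper would live in the package and be imported by the test module):

```
# tests/test_rates.py (hypothetical)
import pytest

def compute_rate(cases: float, population: float) -> float:
    """Convert a raw case count into a rate per 100k population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000

def test_compute_rate():
    assert compute_rate(50, 1_000_000) == pytest.approx(5.0)

def test_compute_rate_rejects_bad_population():
    with pytest.raises(ValueError):
        compute_rate(50, 0)
```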
@@ -411,29 +411,159 @@ Next, the `acquisition.covidcast` component of the `delphi-epidata` codebase doe
12. `value_updated_timestamp`: now
2. Update the `epimetric_latest` table with any new keys or new versions of existing keys.

Consider what settings to use in the `params.json.template` file, in accordance with how you want to run the indicator and acquisition.
Pay attention to the receiving directory, as well as how you can store credentials in vault.
Refer to [this guide](https://docs.google.com/document/d/1Bbuvtoxowt7x2_8USx_JY-yTo-Av3oAFlhyG-vXGG-c/edit#heading=h.8kkoy8sx3t7f) for more vault info.
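
As an illustration, a minimal `params.json.template` might look like the sketch below. The `indicator` fields here are hypothetical (every indicator defines its own), so copy the actual structure from an existing indicator; credentials are left blank in the template and supplied via vault in production.

```
{
  "common": {
    "export_dir": "./receiving",
    "log_filename": "indicator.log"
  },
  "indicator": {
    "base_url": "https://example.com/api/endpoint",
    "export_start_date": "2024-01-01",
    "api_token": ""
  }
}
```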

### CI/CD

* Add the module name to the `build` job in `.github/workflows/python-ci.yml`.
  This allows GitHub Actions to run on this indicator code, which includes unit tests and linting.
* Add the top-level directory name to `indicator_list` in `Jenkinsfile`.
  This allows your code to be automatically deployed to staging after your branch is merged to `main`, and deployed to prod after `covidcast-indicators` is released.
* Create `ansible/templates/{top_level_directory_name}-params-prod.json.j2` based on your `params.json.template`, with some adjustments (sketched below):
  * "export_dir": "/common/covidcast/receiving/{data-source-name}"
  * "log_filename": "/var/log/indicators/{top_level_directory_name}.log"
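
Continuing the hypothetical params sketch above, the prod template would differ mainly in those two paths, with credentials filled in from vault-managed variables (the variable name below is made up):

```
{
  "common": {
    "export_dir": "/common/covidcast/receiving/example-source",
    "log_filename": "/var/log/indicators/example_indicator.log"
  },
  "indicator": {
    "base_url": "https://example.com/api/endpoint",
    "export_start_date": "2024-01-01",
    "api_token": "{{ example_indicator_token }}"
  }
}
```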

Pay attention to the receiving/export directory, as well as how you can store credentials in vault.
Refer to [this guide](https://docs.google.com/document/d/1Bbuvtoxowt7x2_8USx_JY-yTo-Av3oAFlhyG-vXGG-c/edit#heading=h.8kkoy8sx3t7f) for more vault info.
### Staging

After developing the pipeline code, but before deploying to production, the pipeline should be tested on staging.
Indicator runs should be set up to run automatically daily for at least a week.

The indicator run code is automatically deployed on staging after your branch is merged into `main`.
After merging, make sure you have proper access to Cronicle and the staging server `app-mono-dev-01.delphi.cmu.edu`, _and_ can see your code on staging at `/home/indicators/runtime/`.
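
To confirm the latter (a sketch; substitute your own username), list the deployed indicator directories on staging:

```
$ ssh <your-username>@app-mono-dev-01.delphi.cmu.edu 'ls /home/indicators/runtime/'
```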

Then, on Cronicle, create two jobs: one to run the indicator and one to load the output csv files into the database.

We start by setting up the acquisition job.

#### Acquisition job

The indicator job records the location of the relevant csv output files in its chain data; this acquisition job reads that location and loads the files into our database.

Example script:

```
#!/usr/bin/python3

import subprocess
import json

# Read the chain data JSON passed along by the upstream indicator job.
str_data = input()
print(str_data)

data = json.loads(str_data, strict=False)
chain_data = data["chain_data"]
user = chain_data["user"]
host = chain_data["host"]
acq_ind_name = chain_data["acq_ind_name"]

# Run acquisition on the staging host over ssh.
cmd = f'''ssh -T -l {user} {host} "cd ~/driver && python3 -m delphi.epidata.acquisition.covidcast.csv_to_database --data_dir=/common/covidcast --indicator_name={acq_ind_name} --log_file=/var/log/epidata/csv_upload_{acq_ind_name}.log"'''

# Note: communicate() returns (stdout, stderr), in that order.
std_out, std_err = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()

print(std_out.decode('UTF-8'))
print(std_err.decode('UTF-8'))
```

#### Indicator run job

This job signs into our staging server via ssh and runs the indicator, producing csv files as output.

Example script:

```
#!/bin/sh

# vars
user='automation'
host='app-mono-dev-01.delphi.cmu.edu'
ind_name='nchs_mortality'
acq_ind_name='nchs-mortality'

# chain_data to be sent to the acquisition job (jo builds the nested JSON object)
chain_data=$(jo chain_data=$(jo acq_ind_name=${acq_ind_name} ind_name=${ind_name} user=${user} host=${host}));
echo "${chain_data}";

ssh -T -l ${user} ${host} "sudo -u indicators -s bash -c 'cd /home/indicators/runtime/${ind_name} && env/bin/python -m delphi_${ind_name}'";
```

Note the staging hostname in `host`, and how the acquisition job is chained to run right after the indicator job.

Note that the `ind_name` variable here refers to the top-level directory name where the code is located, while `acq_ind_name` refers to the directory name where the output csv files are located, which corresponds to the `source` column in our database, as mentioned in step 3.

To automatically run the acquisition job right after the indicator job finishes successfully:

1. In the `Plugin` section, select `Interpret JSON in Output`.
2. In the `Chain Reaction` section, select your acquisition job under `Run Event on Success`.

You can read more about how the `chain_data` json object emitted by the script above is consumed by the subsequent acquisition job [here](https://github.com/jhuckaby/Cronicle/blob/master/docs/Plugins.md#chain-reaction-control).
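
For reference, the nested `jo` calls emit chain data shaped like this (values from the `nchs_mortality` example above):

```
{"chain_data":{"acq_ind_name":"nchs-mortality","ind_name":"nchs_mortality","user":"automation","host":"app-mono-dev-01.delphi.cmu.edu"}}
```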

#### Staging database checks

Apart from checking the logs of the staging indicator run and acquisition jobs to identify potential issues with the pipeline, one can also check the contents of the staging database for abnormalities.

At this point, the acquisition job should have loaded data into the staging mysql db, specifically the `covid` database.

From staging:
```
[user@app-mono-dev-01 ~]$ mysql -u user -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 00000
Server version: 8.0.36-28 Percona Server (GPL), Release 28, Revision 47601f19

Copyright (c) 2009-2024 Percona LLC and/or its affiliates
Copyright (c) 2000, 2024, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use covid;
Database changed
```
Check the `signal_dim` table to see if the new source and signal names are all present and reasonable. For example:
```
mysql> select * from signal_dim where source='nssp';
+---------------+--------+----------------------------------+
| signal_key_id | source | signal                           |
+---------------+--------+----------------------------------+
|           817 | nssp   | pct_ed_visits_combined           |
|           818 | nssp   | pct_ed_visits_covid              |
|           819 | nssp   | pct_ed_visits_influenza          |
|           820 | nssp   | pct_ed_visits_rsv                |
|           821 | nssp   | smoothed_pct_ed_visits_combined  |
|           822 | nssp   | smoothed_pct_ed_visits_covid     |
|           823 | nssp   | smoothed_pct_ed_visits_influenza |
|           824 | nssp   | smoothed_pct_ed_visits_rsv       |
+---------------+--------+----------------------------------+
```

Then, check whether the number of records ingested into the db matches the number of rows in the csv output from a local run.
For example, the query below filters on the `issue` date being the day the acquisition job was run, and on `signal_key_id` values corresponding to signals from our new source.
Check whether this count matches the local run result.

```
mysql> SELECT count(*) FROM epimetric_full WHERE issue=202425 AND signal_key_id > 816 AND signal_key_id < 825;
+----------+
| count(*) |
+----------+
|  2620872 |
+----------+
1 row in set (0.80 sec)
```
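
On the local side, one quick way to get a comparable count is to total the data rows across the exported csvs (a sketch; it assumes each output csv has a single header row and that `export_dir` is `./receiving`):

```
# count data rows, excluding one header line per file
find ./receiving -name '*.csv' -exec tail -n +2 {} \; | wc -l
```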

You can also check how the data looks at each geo level or across different signal names, depending on the quirks of the source.
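
For example, a per-geo-type breakdown can surface a missing geo level (a sketch; this assumes the `covid` schema has a `geo_dim` dimension table keyed by `geo_key_id`, analogous to `signal_dim`):

```
mysql> SELECT gd.geo_type, count(*) FROM epimetric_full ef
    -> JOIN geo_dim gd USING (geo_key_id)
    -> WHERE ef.issue=202425 AND ef.signal_key_id > 816 AND ef.signal_key_id < 825
    -> GROUP BY gd.geo_type;
```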

See [@korlaxxalrok](https://www.github.com/korlaxxalrok) or [@minhkhul](https://www.github.com/minhkhul) for more information.

If everything goes well, make a prod version of the indicator run job and use it to run the indicator on a daily basis.

### Signal Documentation
