Commit 8d1f6cd

Merge pull request #292 from dfarrow0/dfarrow/covid-hosp
new data source: covid hospitalization
2 parents 5d3ef10 + 549f018 commit 8d1f6cd

21 files changed

+1865
-3
lines changed

deploy.json

Lines changed: 9 additions & 0 deletions
@@ -207,6 +207,15 @@
       "dst": "/common/covidcast/README.md"
     },

+    "// acquisition - covid_hosp",
+    {
+      "type": "move",
+      "src": "src/acquisition/covid_hosp/",
+      "dst": "[[package]]/acquisition/covid_hosp/",
+      "match": "^.*\\.(py)$",
+      "add-header-comment": true
+    },
+
     "// run unit and coverage tests",
     {"type": "py3test"}

docs/api/README.md

Lines changed: 1 addition & 0 deletions
@@ -96,6 +96,7 @@ The parameters available for each source are documented in each linked source-sp
 | --- | --- | --- | --- |
 | [`covidcast`](covidcast.md) | COVIDCast | Delphi's COVID-19 surveillance streams. | no |
 | [`covidcast_meta`](covidcast_meta.md) | COVIDCast Metadata | Metadata for Delphi's COVID-19 surveillance streams. | no |
+| [`covid_hosp`](covid_hosp.md) | COVID-19 Hospitalization | COVID-19 Reported Patient Impact and Hospital Capacity. | no |

 ### Influenza Data

docs/api/covid_hosp.md

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
---
title: COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries
parent: Epidata API (Other Epidemics)
---

# COVID-19 Hospitalization

This data source is a mirror of the "COVID-19 Reported Patient Impact and
Hospital Capacity by State Timeseries" dataset provided by the US Department of
Health & Human Services via healthdata.gov.

See the
[official description at healthdata.gov](https://healthdata.gov/dataset/covid-19-reported-patient-impact-and-hospital-capacity-state-timeseries)
for more information, including a
[data dictionary](https://healthdata.gov/covid-19-reported-patient-impact-and-hospital-capacity-state-data-dictionary).

General topics not specific to any particular data source are discussed in the
[API overview](README.md). Such topics include:
[contributing](README.md#contributing) and [citing](README.md#citing).

## Metadata

This data source provides various measures of COVID-19 burden on patients and healthcare in the US.
- Data source: [US Department of Health & Human Services](https://healthdata.gov/dataset/covid-19-reported-patient-impact-and-hospital-capacity-state-timeseries) (HHS)
- Temporal Resolution: Daily, starting 2020-01-01
- Spatial Resolution: US States plus DC, PR, and VI
- Open access via [Open Data Commons Open Database License (ODbL)](https://opendatacommons.org/licenses/odbl/1.0/)
- Versioned by Delphi according to "issue" date. New issues are expected to be released roughly weekly.

# The API

The base URL is: https://delphi.cmu.edu/epidata/api.php

See [this documentation](README.md) for details on specifying locations and dates.

## Parameters

### Required

| Parameter | Description | Type |
| --- | --- | --- |
| `states` | two-letter state abbreviations | `list` of states |
| `dates` | dates to which the data pertain | `list` of dates or date ranges |

### Optional

| Parameter | Description | Type |
| --- | --- | --- |
| `issues` | issue dates, i.e. the dates on which the dataset was published | `list` of "issue" dates or date ranges |

If `issues` is not specified, then the most recent issue is used by default.
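
As a rough illustration of how these parameters combine into a request, the sketch below builds a query URL with the standard library only. It assumes that list values are comma-separated and that a date range is written as `YYYYMMDD-YYYYMMDD`; the [API overview](README.md) remains the authoritative description of those rules. The helpers `format_range` and `build_url` are hypothetical names used for illustration and are not part of any Delphi client.

```python
from urllib.parse import urlencode

BASE_URL = "https://delphi.cmu.edu/epidata/api.php"  # base URL given on this page


def format_range(first, last):
    """Format an inclusive date range (assumed syntax: YYYYMMDD-YYYYMMDD)."""
    return f"{first}-{last}"


def build_url(states, dates, issues=None):
    """Build a covid_hosp query URL; `issues` is left out when not given."""
    params = {
        "source": "covid_hosp",
        "states": ",".join(states),
        "dates": ",".join(str(d) for d in dates),
    }
    if issues is not None:
        params["issues"] = ",".join(str(i) for i in issues)
    # keep commas and hyphens readable instead of percent-encoding them
    return f"{BASE_URL}?{urlencode(params, safe=',-')}"


print(build_url(["MA"], [20200510]))
print(build_url(["MA", "PA"], [format_range(20200501, 20200531)], issues=[20201116]))
```

Leaving `issues` out of the query entirely, rather than sending an empty value, mirrors the default described above: the most recent issue is used.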

## Response

| Field | Description | Type |
| --- | --- | --- |
| `result` | result code: 1 = success, 2 = too many results, -2 = no results | integer |
| `epidata` | list of results | array of objects |
| `epidata[].state` | state pertaining to this row | string |
| `epidata[].date` | date pertaining to this row | integer |
| `epidata[].issue` | the date on which the dataset containing this row was published | integer |
| `epidata[].*` | see the [data dictionary](https://healthdata.gov/covid-19-reported-patient-impact-and-hospital-capacity-state-data-dictionary) | |
| `message` | `success` or error message | string |

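
A caller will usually want to branch on `result` before touching `epidata`. The following is a minimal sketch of one way to do that, using only the result codes listed in the table. The function name `rows_or_raise`, the sample dictionaries, and the choice to treat "no results" as an empty list are assumptions made for illustration.

```python
def rows_or_raise(response):
    """Return the rows of an Epidata-style response, or raise on failure."""
    result = response.get("result")
    if result == 1:
        return response["epidata"]
    if result == -2:
        # "no results" is treated here as an empty list rather than an error
        return []
    # any other code (e.g. 2 = too many results) is surfaced to the caller
    raise RuntimeError(f"query failed (result={result}): {response.get('message')}")


# hand-written sample responses, shaped like the fields in the table
ok = {"result": 1, "epidata": [{"state": "MA", "date": 20200510}], "message": "success"}
empty = {"result": -2, "message": "no results"}
print(len(rows_or_raise(ok)), len(rows_or_raise(empty)))
```

Raising on code 2 is a deliberately strict choice; a caller that can work with a truncated result set may prefer to log a warning instead.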
# Example URLs

### MA on 2020-05-10 (per most recent issue)
https://delphi.cmu.edu/epidata/api.php?source=covid_hosp&states=MA&dates=20200510

```json
{
  "result": 1,
  "epidata": [
    {
      "state": "MA",
      "issue": 20201116,
      "date": 20200510,
      "hospital_onset_covid": 53,
      "hospital_onset_covid_coverage": 84,
      "inpatient_beds": 15691,
      "inpatient_beds_coverage": 73,
      "inpatient_beds_used": 12427,
      "inpatient_beds_used_coverage": 83,
      "inpatient_beds_used_covid": 3625,
      "inpatient_beds_used_covid_coverage": 84,
      "previous_day_admission_adult_covid_confirmed": null,
      "previous_day_admission_adult_covid_confirmed_coverage": 0,
      "previous_day_admission_adult_covid_suspected": null,
      "previous_day_admission_adult_covid_suspected_coverage": 0,
      "previous_day_admission_pediatric_covid_confirmed": null,
      "previous_day_admission_pediatric_covid_confirmed_coverage": 0,
      "previous_day_admission_pediatric_covid_suspected": null,
      "previous_day_admission_pediatric_covid_suspected_coverage": 0,
      "staffed_adult_icu_bed_occupancy": null,
      "staffed_adult_icu_bed_occupancy_coverage": 0,
      "staffed_icu_adult_patients_confirmed_suspected_covid": null,
      "staffed_icu_adult_patients_confirmed_suspected_covid_coverage": 0,
      "staffed_icu_adult_patients_confirmed_covid": null,
      "staffed_icu_adult_patients_confirmed_covid_coverage": 0,
      "total_adult_patients_hosp_confirmed_suspected_covid": null,
      "total_adult_patients_hosp_confirmed_suspected_covid_coverage": 0,
      "total_adult_patients_hosp_confirmed_covid": null,
      "total_adult_patients_hosp_confirmed_covid_coverage": 0,
      "total_pediatric_patients_hosp_confirmed_suspected_covid": null,
      "total_pediatric_patients_hosp_confirmed_suspected_covid_coverage": 0,
      "total_pediatric_patients_hosp_confirmed_covid": null,
      "total_pediatric_patients_hosp_confirmed_covid_coverage": 0,
      "total_staffed_adult_icu_beds": null,
      "total_staffed_adult_icu_beds_coverage": 0,
      "inpatient_beds_utilization_coverage": 72,
      "inpatient_beds_utilization_numerator": 10876,
      "inpatient_beds_utilization_denominator": 15585,
      "percent_of_inpatients_with_covid_coverage": 83,
      "percent_of_inpatients_with_covid_numerator": 3607,
      "percent_of_inpatients_with_covid_denominator": 12427,
      "inpatient_bed_covid_utilization_coverage": 73,
      "inpatient_bed_covid_utilization_numerator": 3304,
      "inpatient_bed_covid_utilization_denominator": 15691,
      "adult_icu_bed_covid_utilization_coverage": null,
      "adult_icu_bed_covid_utilization_numerator": null,
      "adult_icu_bed_covid_utilization_denominator": null,
      "adult_icu_bed_utilization_coverage": null,
      "adult_icu_bed_utilization_numerator": null,
      "adult_icu_bed_utilization_denominator": null,
      "inpatient_beds_utilization": 0.6978504972730191,
      "percent_of_inpatients_with_covid": 0.2902550897239881,
      "inpatient_bed_covid_utilization": 0.21056656682174496,
      "adult_icu_bed_covid_utilization": null,
      "adult_icu_bed_utilization": null
    }
  ],
  "message": "success"
}
```
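
In this example row, each utilization field appears to equal its `_numerator` field divided by its `_denominator` field, and it is `null` when those inputs are `null`. That is an observation about this single row, not a documented guarantee; the data dictionary is the authoritative reference. The sketch below checks the relationship on values copied from the response above. The helper name `ratio_matches` is hypothetical.

```python
import math

# values copied from the MA / 2020-05-10 example response
row = {
    "inpatient_beds_utilization": 0.6978504972730191,
    "inpatient_beds_utilization_numerator": 10876,
    "inpatient_beds_utilization_denominator": 15585,
    "percent_of_inpatients_with_covid": 0.2902550897239881,
    "percent_of_inpatients_with_covid_numerator": 3607,
    "percent_of_inpatients_with_covid_denominator": 12427,
    "inpatient_bed_covid_utilization": 0.21056656682174496,
    "inpatient_bed_covid_utilization_numerator": 3304,
    "inpatient_bed_covid_utilization_denominator": 15691,
    "adult_icu_bed_utilization": None,
    "adult_icu_bed_utilization_numerator": None,
    "adult_icu_bed_utilization_denominator": None,
}


def ratio_matches(row, name):
    """Return True/False if the ratio can be checked, or None if inputs are missing."""
    value = row[name]
    numerator = row[f"{name}_numerator"]
    denominator = row[f"{name}_denominator"]
    if value is None or numerator is None or not denominator:
        return None  # null or zero inputs: nothing to compare
    return math.isclose(value, numerator / denominator, rel_tol=1e-9)


for name in (
    "inpatient_beds_utilization",
    "percent_of_inpatients_with_covid",
    "inpatient_bed_covid_utilization",
    "adult_icu_bed_utilization",
):
    print(name, ratio_matches(row, name))
```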


# Code Samples

Libraries are available for [CoffeeScript](../../src/client/delphi_epidata.coffee), [JavaScript](../../src/client/delphi_epidata.js), [Python](../../src/client/delphi_epidata.py), and [R](../../src/client/delphi_epidata.R).
The following sample shows how to import the library and fetch MA on 2020-05-10
(per most recent issue).

### Python

Optionally install the package using pip(env):
````bash
pip install delphi-epidata
````

Otherwise, place `delphi_epidata.py` from this repo next to your Python script.

````python
# Import
from delphi_epidata import Epidata
# Fetch data
res = Epidata.covid_hosp('MA', 20200510)
print(res['result'], res['message'], len(res['epidata']))
````
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
"""Integration tests for acquisition of COVID hospitalization."""

# standard library
from pathlib import Path
import unittest
from unittest.mock import MagicMock

# first party
from delphi.epidata.acquisition.covid_hosp.database import Database
from delphi.epidata.acquisition.covid_hosp.test_utils import TestUtils
from delphi.epidata.client.delphi_epidata import Epidata
import delphi.operations.secrets as secrets

# py3tester coverage target (equivalent to `import *`)
__test_target__ = 'delphi.epidata.acquisition.covid_hosp.update'


class AcquisitionTests(unittest.TestCase):

  def setUp(self):
    """Perform per-test setup."""

    # configure test data
    path_to_repo_root = Path(__file__).parent.parent.parent.parent
    self.test_utils = TestUtils(path_to_repo_root)

    # use the local instance of the Epidata API
    Epidata.BASE_URL = 'http://delphi_web_epidata/epidata/api.php'

    # use the local instance of the epidata database
    secrets.db.host = 'delphi_database_epidata'
    secrets.db.epi = ('user', 'pass')

    # clear relevant tables
    with Database.connect() as db:
      with db.new_cursor() as cur:
        cur.execute('truncate table covid_hosp')
        cur.execute('truncate table covid_hosp_meta')

  def test_acquire_dataset(self):
    """Acquire a new dataset."""

    # only mock out network calls to external hosts
    mock_network = MagicMock()
    mock_network.fetch_metadata.return_value = \
      self.test_utils.load_sample_metadata()
    mock_network.fetch_dataset.return_value = \
      self.test_utils.load_sample_dataset()

    # make sure the data does not yet exist
    with self.subTest(name='no data yet'):
      response = Epidata.covid_hosp('MA', Epidata.range(20200101, 20210101))
      self.assertEqual(response['result'], -2)

    # acquire sample data into local database
    with self.subTest(name='first acquisition'):
      acquired = Update.run(network_impl=mock_network)
      self.assertTrue(acquired)

    # make sure the data now exists
    with self.subTest(name='initial data checks'):
      response = Epidata.covid_hosp('MA', Epidata.range(20200101, 20210101))
      self.assertEqual(response['result'], 1)
      self.assertEqual(len(response['epidata']), 1)
      row = response['epidata'][0]
      self.assertEqual(row['state'], 'MA')
      self.assertEqual(row['date'], 20200510)
      self.assertEqual(row['issue'], 20201116)
      self.assertEqual(row['hospital_onset_covid'], 53)
      actual = row['inpatient_bed_covid_utilization']
      expected = 0.21056656682174496
      self.assertAlmostEqual(actual, expected)
      self.assertIsNone(row['adult_icu_bed_utilization'])

      # expect 55 fields per row (56 database columns, except `id`)
      self.assertEqual(len(row), 55)

    # re-acquisition of the same dataset should be a no-op
    with self.subTest(name='second acquisition'):
      acquired = Update.run(network_impl=mock_network)
      self.assertFalse(acquired)

    # make sure the data still exists
    with self.subTest(name='final data checks'):
      response = Epidata.covid_hosp('MA', Epidata.range(20200101, 20210101))
      self.assertEqual(response['result'], 1)
      self.assertEqual(len(response['epidata']), 1)
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
"""Integration tests for the `covid_hosp` endpoint."""

# standard library
import unittest

# first party
from delphi.epidata.acquisition.covid_hosp.database import Database
from delphi.epidata.client.delphi_epidata import Epidata
import delphi.operations.secrets as secrets


class ServerTests(unittest.TestCase):
  """Tests the `covid_hosp` endpoint."""

  def setUp(self):
    """Perform per-test setup."""

    # use the local instance of the Epidata API
    Epidata.BASE_URL = 'http://delphi_web_epidata/epidata/api.php'

    # use the local instance of the epidata database
    secrets.db.host = 'delphi_database_epidata'
    secrets.db.epi = ('user', 'pass')

    # clear relevant tables
    with Database.connect() as db:
      with db.new_cursor() as cur:
        cur.execute('truncate table covid_hosp')
        cur.execute('truncate table covid_hosp_meta')

  def test_query_by_issue(self):
    """Query with and without specifying an issue."""

    # insert dummy data
    def insert_issue(cur, issue, value):
      so_many_nulls = ', '.join(['null'] * 51)
      cur.execute(f'''insert into covid_hosp values (
        0, {issue}, 'PA', 20201118, {value}, {so_many_nulls}
      )''')
    with Database.connect() as db:
      with db.new_cursor() as cur:
        # inserting out of order to test server-side order by
        insert_issue(cur, 20201201, 123)
        insert_issue(cur, 20201203, 789)
        insert_issue(cur, 20201202, 456)

    # request without issue (defaulting to latest issue)
    with self.subTest(name='no issue (latest)'):
      response = Epidata.covid_hosp('PA', 20201118)

      self.assertEqual(response['result'], 1)
      self.assertEqual(len(response['epidata']), 1)
      self.assertEqual(response['epidata'][0]['issue'], 20201203)
      self.assertEqual(response['epidata'][0]['hospital_onset_covid'], 789)

    # request for specific issue
    with self.subTest(name='specific single issue'):
      response = Epidata.covid_hosp('PA', 20201118, issues=20201201)

      self.assertEqual(response['result'], 1)
      self.assertEqual(len(response['epidata']), 1)
      self.assertEqual(response['epidata'][0]['issue'], 20201201)
      self.assertEqual(response['epidata'][0]['hospital_onset_covid'], 123)

    # request for multiple issues
    with self.subTest(name='specific multiple issues'):
      issues = Epidata.range(20201201, 20201231)
      response = Epidata.covid_hosp('PA', 20201118, issues=issues)

      self.assertEqual(response['result'], 1)
      self.assertEqual(len(response['epidata']), 3)
      rows = response['epidata']
      self.assertEqual(rows[0]['issue'], 20201201)
      self.assertEqual(rows[0]['hospital_onset_covid'], 123)
      self.assertEqual(rows[1]['issue'], 20201202)
      self.assertEqual(rows[1]['hospital_onset_covid'], 456)
      self.assertEqual(rows[2]['issue'], 20201203)
      self.assertEqual(rows[2]['hospital_onset_covid'], 789)
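
The first sub-test relies on the "latest issue" default: with no `issues` argument, only the row from the most recent issue comes back for each state and date. The server's actual query is not shown in this section, so the following is only a plain-Python sketch of that selection rule, run on rows shaped like the dummy data inserted by the test. The function name `latest_issue_rows` is hypothetical.

```python
def latest_issue_rows(rows):
    """Keep, for each (state, date), only the row with the largest issue."""
    latest = {}
    for row in rows:
        key = (row["state"], row["date"])
        if key not in latest or row["issue"] > latest[key]["issue"]:
            latest[key] = row
    # sort for a stable, predictable output order
    return sorted(latest.values(), key=lambda r: (r["state"], r["date"]))


# same values as the dummy rows in the test, deliberately out of order
rows = [
    {"state": "PA", "date": 20201118, "issue": 20201201, "hospital_onset_covid": 123},
    {"state": "PA", "date": 20201118, "issue": 20201203, "hospital_onset_covid": 789},
    {"state": "PA", "date": 20201118, "issue": 20201202, "hospital_onset_covid": 456},
]
print(latest_issue_rows(rows))
```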

src/acquisition/covid_hosp/README.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
# COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries

- Data source:
  https://healthdata.gov/dataset/covid-19-reported-patient-impact-and-hospital-capacity-state-timeseries
- Data dictionary:
  https://healthdata.gov/covid-19-reported-patient-impact-and-hospital-capacity-state-data-dictionary
- Geographic resolution: US States plus DC, VI, and PR
- Temporal resolution: daily
- First date: 2020-01-01
- First issue: 2020-11-16

# Acquisition overview

1. Fetch the dataset's metadata in JSON format.
1. If the metadata's `revision_timestamp` already appears in the database, then
   stop here; otherwise continue.
1. Download the dataset in CSV format as determined by the metadata's `url`
   field.
1. In a single transaction, insert the metadata and the dataset into the database.
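
The four steps can be sketched end to end with an in-memory SQLite database and a mocked network object. This is an illustration under stated assumptions, not the code in this commit: the table layout, the `run_update` function, the sample metadata values, and the idea that `fetch_dataset` takes the URL as an argument are all made up for the example. Only the method names `fetch_metadata` and `fetch_dataset` and the metadata fields `revision_timestamp` and `url` come from the text and tests in this commit.

```python
import sqlite3
from unittest.mock import MagicMock


def run_update(db, network):
    """Acquire the dataset unless its revision was already stored."""
    metadata = network.fetch_metadata()  # step 1
    revision = metadata["revision_timestamp"]
    seen = db.execute(
        "SELECT 1 FROM meta WHERE revision_timestamp = ?", (revision,)
    ).fetchone()
    if seen:
        return False  # step 2: this revision is already in the database
    rows = network.fetch_dataset(metadata["url"])  # step 3
    with db:  # step 4: one transaction, rolled back if any insert fails
        db.execute("INSERT INTO meta VALUES (?)", (revision,))
        db.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
    return True


# hypothetical two-table layout, far smaller than the real schema
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE meta (revision_timestamp TEXT)")
db.execute("CREATE TABLE data (state TEXT, date INTEGER, hospital_onset_covid INTEGER)")

# mocked network layer with made-up metadata and a single data row
network = MagicMock()
network.fetch_metadata.return_value = {
    "revision_timestamp": "2020-11-16T00:00:00",
    "url": "https://example.invalid/dataset.csv",
}
network.fetch_dataset.return_value = [("MA", 20200510, 53)]

print(run_update(db, network), run_update(db, network))
```

Checking the revision before downloading keeps a repeated run cheap, and writing metadata and data in one transaction means a failed insert cannot leave a revision marked as acquired without its rows.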
