Commit c94120f

Merge pull request #503 from dshemetov/fix_safegraph_docs
Docs update: move safegraph to inactive category
2 parents 638fca7 + 2c13c41 commit c94120f

4 files changed

+92
-61
lines changed

docs/api/covidcast-signals/indicator-combination.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ calculated or composed by Delphi. It is not a primary data source.
2020

2121
## Statistical Combination Signals (Inactive)
2222

23-
The NMF combination signals were deactivated on March 17, 2021. Documenation for
23+
The NMF combination signals were deactivated on March 17, 2021. Documentation for
2424
these signals is still available on the page for [inactive indicator-combination
2525
signals](indicator-combination-inactive.md).
2626

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
1+
---
2+
title: SafeGraph
3+
parent: Inactive Signals
4+
grand_parent: COVIDcast Epidata API
5+
---
6+
7+
# SafeGraph
8+
{: .no_toc}
9+
* **Source name:** `safegraph`
10+
* **Available for:** county, MSA, HRR, state (see [geography coding docs](../covidcast_geography.md))
11+
* **Time type:** day (see [date format docs](../covidcast_times.md))
12+
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)
13+
14+
This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
15+
using anonymized location data from mobile phones. SafeGraph provides several
16+
different datasets to eligible researchers. We surface signals from two such
17+
datasets. This dataset has not been updated since April 19th, 2021.
18+
19+
## Table of Contents
20+
{: .no_toc .text-delta}
21+
22+
1. TOC
23+
{:toc}
24+
25+
## SafeGraph Social Distancing Metrics
26+
27+
* **Earliest issue available:** June 20, 2020
28+
* **Number of data revisions since June 23, 2020:** 1
29+
* **Date of last change:** November 3, 2020
30+
31+
Data source based on [social distancing
32+
metrics](https://docs.safegraph.com/docs/social-distancing-metrics). SafeGraph
33+
provides this data for individual census block groups, using differential
34+
privacy to protect individual people's data privacy.
35+
36+
Delphi creates features of the SafeGraph data at the census block group level,
37+
then aggregates these features to the county and state levels. The aggregated
38+
data is freely available through the COVIDcast API.
39+
40+
For precise definitions of the quantities below, consult the [SafeGraph social
41+
distancing metric
42+
documentation](https://docs.safegraph.com/docs/social-distancing-metrics).
43+
44+
| Signal | Description |
45+
| --- | --- |
46+
| `completely_home_prop` | The fraction of mobile devices that did not leave the immediate area of their home (SafeGraph's `completely_home_device_count / device_count`) <br/> **Earliest date available:** 01/01/2019 |
47+
| `full_time_work_prop` | The fraction of mobile devices that spent more than 6 hours at a location other than their home during the daytime (SafeGraph's `full_time_work_behavior_devices / device_count`) <br/> **Earliest date available:** 01/01/2019 |
48+
| `part_time_work_prop` | The fraction of devices that spent between 3 and 6 hours at a location other than their home during the daytime (SafeGraph's `part_time_work_behavior_devices / device_count`) <br/> **Earliest date available:** 01/01/2019 |
49+
| `median_home_dwell_time` | The median time spent at home for all devices at this location for this time period, in minutes <br/> **Earliest date available:** 01/01/2019 |
50+
| `completely_home_prop_7dav` | Offers a 7-day trailing window average of the `completely_home_prop`. <br/> **Earliest date available:** 01/01/2019 |
51+
| `full_time_work_prop_7dav` | Offers a 7-day trailing window average of the `full_time_work_prop`. <br/> **Earliest date available:** 01/01/2019 |
52+
| `part_time_work_prop_7dav` | Offers a 7-day trailing window average of the `part_time_work_prop`. <br/> **Earliest date available:** 01/01/2019 |
53+
| `median_home_dwell_time_7dav` | Offers a 7-day trailing window average of the `median_home_dwell_time`. <br/> **Earliest date available:** 01/01/2019 |
54+
55+
After computing each metric on the census block group (CBG) level, we aggregate
56+
to the county level by taking the mean over CBGs in a county to obtain the value
57+
and taking `sd / sqrt(n)` for the standard error, where `sd` is the standard
58+
deviation over the metric values and `n` is the number of CBGs in the county. In
59+
doing so, we make the simplifying assumption that each CBG contributes an iid
60+
observation to the county-level distribution. `n` also serves as the sample
61+
size. The same method is used for aggregation to states.
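
The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not Delphi's actual pipeline: the function name `aggregate_to_county`, the input layout (pairs of county FIPS code and metric value, one per CBG), and the output keys are all hypothetical. It also assumes `sd` is the sample standard deviation, which the text does not specify.

```python
import math
import statistics
from collections import defaultdict


def aggregate_to_county(cbg_values):
    """Aggregate per-CBG metric values to county-level estimates.

    cbg_values: iterable of (county_fips, metric_value) pairs, one per CBG.
    Returns a dict mapping county_fips to value, stderr, and sample_size.
    """
    by_county = defaultdict(list)
    for county, value in cbg_values:
        by_county[county].append(value)

    result = {}
    for county, values in by_county.items():
        n = len(values)  # number of CBGs, also reported as the sample size
        mean = statistics.fmean(values)
        # Assumption: sample standard deviation, which is undefined for n < 2.
        sd = statistics.stdev(values) if n > 1 else float("nan")
        result[county] = {
            "value": mean,
            "stderr": sd / math.sqrt(n),
            "sample_size": n,
        }
    return result
```

Under the same i.i.d. assumption, state-level aggregation would pass a state identifier in place of the county FIPS code.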
62+
63+
SafeGraph's signals measure mobility each day, which causes strong day-of-week
64+
effects: weekends have substantially different values than weekdays. Users
65+
interested in long-term trends, rather than mobility on one specific day, may
66+
prefer the `7dav` signals since averaging over the preceding 7 days removes
67+
these day-of-week effects.
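
To see why a 7-day trailing average removes the weekly cycle, consider the sketch below. It is illustrative only: the helper `trailing_average` is hypothetical, and it assumes the window covers the current day plus the six days before it, with shorter windows at the start of the series. The exact windowing used for the `7dav` signals may differ.

```python
def trailing_average(values, window=7):
    """Return the trailing mean of `values` over up to `window` entries.

    Assumption: each window ends at (and includes) the current day;
    the first few days are averaged over however many days exist.
    """
    averaged = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# A synthetic series with a pure weekly pattern: five "weekdays" at 1
# followed by two "weekend" days at 8, repeated for three weeks.
daily = [1, 1, 1, 1, 1, 8, 8] * 3
smoothed = trailing_average(daily)
```

Once a full week is in the window, every window contains exactly one of each day of the week, so the smoothed series stays flat even though the daily series swings between 1 and 8.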
68+
69+
### Lag
70+
71+
SafeGraph provides this data with a three-day lag, meaning estimates for a
72+
specific day are only available three days later. It may take up to an
73+
additional day for SafeGraph's data to be ingested into the COVIDcast API.
74+

docs/api/covidcast-signals/safegraph.md

Lines changed: 7 additions & 54 deletions
@@ -13,65 +13,19 @@ grand_parent: COVIDcast Epidata API
1313

1414
This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
1515
using anonymized location data from mobile phones. SafeGraph provides several
16-
different datasets to eligible researchers. We surface signals from two such
17-
datasets.
16+
different datasets to eligible researchers. We currently surface signals from one such
17+
dataset.
1818

19-
## Table of Contents
19+
## Table of contents
2020
{: .no_toc .text-delta}
2121

2222
1. TOC
2323
{:toc}
2424

25-
## SafeGraph Social Distancing Metrics
26-
27-
* **Earliest issue available:** June 20, 2020
28-
* **Number of data revisions since June 23, 2020:** 1
29-
* **Date of last change:** November 3, 2020
30-
31-
Data source based on [social distancing
32-
metrics](https://docs.safegraph.com/docs/social-distancing-metrics). SafeGraph
33-
provides this data for individual census block groups, using differential
34-
privacy to protect individual people's data privacy.
35-
36-
Delphi creates features of the SafeGraph data at the census block group level,
37-
then aggregates these features to the county and state levels. The aggregated
38-
data is freely available through the COVIDcast API.
39-
40-
For precise definitions of the quantities below, consult the [SafeGraph social
41-
distancing metric
42-
documentation](https://docs.safegraph.com/docs/social-distancing-metrics).
43-
44-
| Signal | Description |
45-
| --- | --- |
46-
| `completely_home_prop` | The fraction of mobile devices that did not leave the immediate area of their home (SafeGraph's `completely_home_device_count / device_count`) <br/> **Earliest date available:** 01/01/2019 |
47-
| `full_time_work_prop` | The fraction of mobile devices that spent more than 6 hours at a location other than their home during the daytime (SafeGraph's `full_time_work_behavior_devices / device_count`) <br/> **Earliest date available:** 01/01/2019 |
48-
| `part_time_work_prop` | The fraction of devices that spent between 3 and 6 hours at a location other than their home during the daytime (SafeGraph's `part_time_work_behavior_devices / device_count`) <br/> **Earliest date available:** 01/01/2019 |
49-
| `median_home_dwell_time` | The median time spent at home for all devices at this location for this time period, in minutes <br/> **Earliest date available:** 01/01/2019 |
50-
| `completely_home_prop_7dav` | Offers a 7-day trailing window average of the `completely_home_prop`. <br/> **Earliest date available:** 01/01/2019 |
51-
| `full_time_work_prop_7dav` | Offers a 7-day trailing window average of the`full_time_work_prop`. <br/> **Earliest date available:** 01/01/2019 |
52-
| `part_time_work_prop_7dav` | Offers a 7-day trailing window average of the`part_time_work_prop`. <br/> **Earliest date available:** 01/01/2019 |
53-
| `median_home_dwell_time_7dav` | Offers a 7-day trailing window average of the `median_home_dwell_time`. <br/> **Earliest date available:** 01/01/2019 |
54-
55-
After computing each metric on the census block group (CBG) level, we aggregate
56-
to the county-level by taking the mean over CBGs in a county to obtain the value
57-
and taking `sd / sqrt(n)` for the standard error, where `sd` is the standard
58-
deviation over the metric values and `n` is the number of CBGs in the county. In
59-
doing so, we make the simplifying assumption that each CBG contributes an iid
60-
observation to the county-level distribution. `n` also serves as the sample
61-
size. The same method is used for aggregation to states.
62-
63-
SafeGraph's signals measure mobility each day, which causes strong day-of-week
64-
effects: weekends have substantially different values than weekdays. Users
65-
interested in long-term trends, rather than mobility on one specific day, may
66-
prefer the `7dav` signals since averaging over the preceding 7 days removes
67-
these day-of-week effects.
68-
69-
### Lag
70-
71-
SafeGraph provides this data with a three-day lag, meaning estimates for a
72-
specific day are only available three days later. It may take up to an
73-
additional day for SafeGraph's data to be ingested into the COVIDcast API.
25+
## SafeGraph Social Distancing Metrics (Inactive)
7426

27+
These signals were updated until April 19th, 2021, when SafeGraph ceased updating the dataset.
28+
Documentation for these signals is still available on the [inactive SafeGraph page](safegraph-inactive.md).
7529

7630
## SafeGraph Weekly Patterns
7731

@@ -91,7 +45,7 @@ restaurants, etc.) from SafeGraph's Weekly Patterns data at the 5-digit ZipCode
9145
level, then aggregates and reports these features to the county, MSA, HRR, and
9246
state levels. The aggregated data is freely available through the COVIDcast API.
9347

94-
For precise definitions of the quantities below, consult the [SafeGraph Weekly
48+
For precise definitions of the quantities below, consult the [SafeGraph Weekly
9549
Patterns documentation](https://docs.safegraph.com/docs/weekly-patterns).
9650

9751
| Signal | Description |
@@ -131,4 +85,3 @@ SafeGraph's Social Distancing Metrics and Weekly Patterns are based on mobile de
13185

13286
* **Geographic bias.** If some regions have a greater density of SafeGraph panel members as a percentage of the population than other regions, comparisons of metrics between regions may be biased. Regions with more SafeGraph panel members will appear to have more visits counted, even if the rate of visits in the general population is the same.
13387
* **Demographic bias.** SafeGraph panels may not be representative of the local population as a whole. For example, [some research suggests](https://arxiv.org/abs/2011.07194) that "older and non-white voters are less likely to be captured by mobility data", so this data will not accurately reflect behavior in those populations. Since population demographics vary across the United States, this can also contribute to geographic biases.
134-

docs/api/covidcast_changelog.md

Lines changed: 10 additions & 6 deletions
@@ -43,13 +43,13 @@ Duplicate survey weights had corrupted historical figures for the following sign
4343
* `hrr`: 20200406-20200415, 20200430-20200506
4444
* `msa`: 20200408-20200414, 20200430-20200506
4545
* `state`: 20200408-20200416, 20200430-20200506
46-
47-
#### 20 November 2020
4846

49-
Due to a bug in our data processing system, estimates of the percentage of people reporting that they have been tested for COVID-19 calculated before October 8th were incorrect.
50-
We incorrectly treated an answer of “no” as a missing response, which affected the `smoothed_tested_14d` and `smoothed_wtested_14d` signals from the `fb-survey` source.
47+
#### 20 November 2020
5148

52-
As of Nov. 20th, the error has been corrected and all affected data reissued.
49+
Due to a bug in our data processing system, estimates of the percentage of people reporting that they have been tested for COVID-19 calculated before October 8th were incorrect.
50+
We incorrectly treated an answer of “no” as a missing response, which affected the `smoothed_tested_14d` and `smoothed_wtested_14d` signals from the `fb-survey` source.
51+
52+
As of Nov. 20th, the error has been corrected and all affected data reissued.
5353

5454
### `hospital-admissions`
5555
#### 20 October 2020
@@ -63,7 +63,7 @@ We now include figures on Puerto Rico for all `jhu-csse` signals at the state le
6363

6464
#### 1 September 2020
6565

66-
NY Boroughs county FIPS (36005, 36047, 36061, 36081, 36085) are now split in proportion to the population of each county, instead of being reported in aggregate in FIPS 36061.
66+
NY Boroughs county FIPS (36005, 36047, 36061, 36081, 36085) are now split in proportion to the population of each county, instead of being reported in aggregate in FIPS 36061.
6767

6868
#### 7 October 2020
6969

@@ -95,6 +95,10 @@ We went from a custom geo mapping file (for aggregating from zip->(county, msa,
9595

9696
### `safegraph`
9797

98+
#### 19 April 2021
99+
100+
The SafeGraph social distancing metrics are no longer being updated. Weekly patterns are still available.
101+
98102
#### 3 November 2020
99103
We went from a custom geo mapping file (for aggregating from county->state) to a central geo file based on rigorously sourced US census data.
100104
