@@ -19,10 +19,11 @@ grand_parent: COVIDcast Epidata API
This data source is based on symptom surveys run by the Delphi group at Carnegie
Mellon. Facebook directs a random sample of its users to these surveys, which
- are voluntary. Individual survey responses are held by CMU and are sharable with
- other health researchers under a data use agreement. No individual survey
- responses are shared back to Facebook. See our [surveys
- page](https://covidcast.cmu.edu/surveys.html) for more detail about how the
+ are voluntary. Users age 18 or older are eligible to complete the surveys, and
+ their survey responses are held by CMU and are sharable with other health
+ researchers under a data use agreement. No individual survey responses are
+ shared back to Facebook. See our [surveys
+ page](https://delphi.cmu.edu/covidcast/surveys/) for more detail about how the
surveys work and how they are used outside the COVIDcast API.

We produce several sets of signals based on the survey data, listed and
@@ -39,6 +40,13 @@ described in the sections below:
4. [Mental health indicators](#mental-health-indicators), based on self-reports
of anxiety, depression, isolation, and worry about COVID

+ Many of these signals can also be browsed on our [survey
+ dashboard](https://delphi.cmu.edu/covidcast/survey-results/) at any selected
+ location.
+
+ Additionally, contingency tables containing demographic breakdowns of survey
+ data are [also available for download](../../symptom-survey/contingency-tables.md).
+
## Table of Contents
{: .no_toc .text-delta}

@@ -74,17 +82,18 @@ Researchers can [request
access](https://dataforgood.fb.com/docs/covid-19-symptom-survey-request-for-data-access/)
to (fully de-identified) individual survey responses for research purposes.

- As of mid-August 2020, the average number of Facebook survey responses we
- receive each day is about 74,000, and the total number of survey responses we
- have received is over 9 million.
+ As of early March 2021, the average number of Facebook survey responses we
+ receive each day is about 40,000, and the total number of survey responses we
+ have received is over 17 million.

## ILI and CLI Indicators

Of primary interest for the API are the symptoms defining a COVID-like illness
(fever, along with cough, or shortness of breath, or difficulty breathing) or
influenza-like illness (fever, along with cough or sore throat). Using this
- survey data, we estimate the percentage of people who have a COVID-like illness,
- or influenza-like illness, in a given location, on a given day.
+ survey data, we estimate the percentage of people (age 18 or older) who have a
+ COVID-like illness, or influenza-like illness, in a given location, on a given
+ day.
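
The two symptom definitions above can be read as simple boolean predicates. The following is a minimal sketch, assuming the definition means fever plus at least one of the listed accompanying symptoms. The `Symptoms` record, its field names, and the functions `has_cli` and `has_ili` are hypothetical and are not part of the survey schema or the API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symptoms:
    """Hypothetical per-respondent symptom flags (names are illustrative)."""
    fever: bool = False
    cough: bool = False
    shortness_of_breath: bool = False
    difficulty_breathing: bool = False
    sore_throat: bool = False


def has_cli(s: Symptoms) -> bool:
    # COVID-like illness: fever, along with cough, or shortness of breath,
    # or difficulty breathing.
    return s.fever and (s.cough or s.shortness_of_breath or s.difficulty_breathing)


def has_ili(s: Symptoms) -> bool:
    # Influenza-like illness: fever, along with cough or sore throat.
    return s.fever and (s.cough or s.sore_throat)
```

Under this reading, a respondent with fever and a sore throat but no cough or breathing symptoms counts toward ILI but not CLI.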

| Signals | Description |
| --- | --- |
@@ -396,6 +405,50 @@ below](#survey-weighting) to be more representative of state demographics, are
also available. These have names beginning `smoothed_w`, such as
`smoothed_wdepressed_14d`.
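
The naming pattern can be sketched in a few lines of Python. This is only an illustration: `weighted_signal_name` is a hypothetical helper, and it assumes that the unweighted counterpart of `smoothed_wdepressed_14d` is named `smoothed_depressed_14d`, so that the weighted name is formed by inserting `w` after the `smoothed_` prefix.

```python
def weighted_signal_name(unweighted: str) -> str:
    """Hypothetical helper: turn 'smoothed_<rest>' into 'smoothed_w<rest>'.

    Assumes the input is an unweighted smoothed signal name; it cannot tell
    whether a name that already starts with 'smoothed_w' is weighted.
    """
    prefix = "smoothed_"
    if not unweighted.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}: {unweighted!r}")
    return prefix + "w" + unweighted[len(prefix):]


name = weighted_signal_name("smoothed_depressed_14d")
```

Always check the resulting name against the signal tables before querying, since this helper only applies a string convention.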

+ ## Limitations
+
+ When interpreting the signals above, it is important to keep in mind several
+ limitations of this survey data.
+
+ * **Survey population.** People are eligible to participate in the survey if
+ they are age 18 or older, they are currently located in the USA, and they are an active user of Facebook. The survey
+ data does not report on children under age 18, and the Facebook adult user
+ population may differ from the United States population generally in important
+ ways. We use our [survey weighting](#survey-weighting) to adjust the estimates
+ to match age and gender demographics by state, but this process doesn't adjust
+ for other demographic biases we may not be aware of.
+ * **Non-response bias.** The survey is voluntary, and people who accept the
+ invitation when it is presented to them on Facebook may be different from
+ those who do not. The [survey weights provided by Facebook](#survey-weighting)
+ attempt to model the probability of response for each user and hence adjust
+ for this, but it is difficult to tell if these weights account for all
+ possible non-response bias.
+ * **Social desirability.** Previous survey research has shown that people's
+ responses to surveys are often biased by what responses they believe are
+ socially desirable or acceptable. For example, if there is widespread
+ pressure to wear masks, respondents who do *not* wear masks may feel pressured
+ to answer that they *do*. This survey is anonymous and online, meaning we
+ expect the social desirability effect to be smaller, but it may still be
+ present.
+ * **False responses.** As with anything on the Internet, a small percentage of
+ users give deliberately incorrect responses. We discard a small number of
+ responses that are obviously false, but do not perform extensive filtering.
+ However, the large size of the study, and our procedure for ensuring that each
+ respondent can only be counted once when they are invited to take the survey,
+ prevent individual respondents from having a large effect on results.
+ * **Repeat invitations.** Individual respondents can be invited by Facebook to
+ take the survey several times. Usually Facebook only re-invites a respondent
+ after one month. Hence estimates of values on a single day are calculated
+ using independent survey responses from unique respondents (or, at least,
+ unique Facebook accounts), whereas estimates from different months may involve
+ the same respondents.
+
+ Whenever possible, you should compare this data to other independent sources. We
+ believe that while these biases may affect point estimates -- that is, they may
+ bias estimates on a specific day up or down -- the biases should not change
+ strongly over time. This means that *changes* in signals, such as increases or
+ decreases, are likely to represent true changes in the underlying population,
+ even if point estimates are biased.
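
The argument about changes versus point estimates can be illustrated with a small numeric sketch. It assumes the simplest possible case, a bias that is additive and constant over time. The values below are made-up illustrative numbers, not survey results, and `day_over_day_changes` is a hypothetical helper.

```python
def day_over_day_changes(values):
    """Return the differences between consecutive daily values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Illustrative numbers only: a "true" percentage on three consecutive days
# and a constant additive bias applied to every daily estimate.
true_pct = [2.0, 2.5, 3.5]
bias = 0.5
observed_pct = [value + bias for value in true_pct]

true_changes = day_over_day_changes(true_pct)
observed_changes = day_over_day_changes(observed_pct)
```

Every observed point estimate is off by the bias, but the constant cancels in each difference, so `observed_changes` matches `true_changes`. A bias that drifts over time would not cancel this way, which is why comparison with independent sources remains important.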

## Survey Weighting
