Commit 17e1d05

Improve fb-survey documentation
Add limitations, mention it's age 18+, link to dashboard.
1 parent 0e59e6b commit 17e1d05

1 file changed (+59, -9 lines changed)

docs/api/covidcast-signals/fb-survey.md

Lines changed: 59 additions & 9 deletions
@@ -19,10 +19,11 @@ grand_parent: COVIDcast Epidata API
 
 This data source is based on symptom surveys run by the Delphi group at Carnegie
 Mellon. Facebook directs a random sample of its users to these surveys, which
-are voluntary. Individual survey responses are held by CMU and are sharable with
-other health researchers under a data use agreement. No individual survey
-responses are shared back to Facebook. See our [surveys
-page](https://covidcast.cmu.edu/surveys.html) for more detail about how the
+are voluntary. Users age 18 or older are eligible to complete the surveys, and
+their survey responses are held by CMU and are sharable with other health
+researchers under a data use agreement. No individual survey responses are
+shared back to Facebook. See our [surveys
+page](https://delphi.cmu.edu/covidcast/surveys/) for more detail about how the
 surveys work and how they are used outside the COVIDcast API.
 
 We produce several sets of signals based on the survey data, listed and
@@ -39,6 +40,10 @@ described in the sections below:
 4. [Mental health indicators](#mental-health-indicators), based on self-reports
 of anxiety, depression, isolation, and worry about COVID
 
+Many of these signals can also be browsed on our [survey
+dashboard](https://delphi.cmu.edu/covidcast/survey-results/) at any selected
+location.
+
 ## Table of Contents
 {: .no_toc .text-delta}
 
@@ -74,17 +79,18 @@ Researchers can [request
 access](https://dataforgood.fb.com/docs/covid-19-symptom-survey-request-for-data-access/)
 to (fully de-identified) individual survey responses for research purposes.
 
-As of mid-August 2020, the average number of Facebook survey responses we
-receive each day is about 74,000, and the total number of survey responses we
-have received is over 9 million.
+As of early March 2021, the average number of Facebook survey responses we
+receive each day is about 40,000, and the total number of survey responses we
+have received is over 17 million.
 
 ## ILI and CLI Indicators
 
 Of primary interest for the API are the symptoms defining a COVID-like illness
 (fever, along with cough, or shortness of breath, or difficulty breathing) or
 influenza-like illness (fever, along with cough or sore throat). Using this
-survey data, we estimate the percentage of people who have a COVID-like illness,
-or influenza-like illness, in a given location, on a given day.
+survey data, we estimate the percentage of people (age 18 or older) who have a
+COVID-like illness, or influenza-like illness, in a given location, on a given
+day.
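The two symptom definitions in this paragraph can be sketched as simple predicates. This is only an illustration: the `Response` class, its field names, and the `percent` helper are hypothetical, and the percentage computed here is a naive unweighted share of responses, which is an assumption and not necessarily how the published signals are estimated.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One hypothetical survey response; field names are illustrative only."""
    fever: bool = False
    cough: bool = False
    shortness_of_breath: bool = False
    difficulty_breathing: bool = False
    sore_throat: bool = False


def has_cli(r: Response) -> bool:
    # COVID-like illness: fever along with cough, shortness of breath,
    # or difficulty breathing.
    return r.fever and (r.cough or r.shortness_of_breath or r.difficulty_breathing)


def has_ili(r: Response) -> bool:
    # Influenza-like illness: fever along with cough or sore throat.
    return r.fever and (r.cough or r.sore_throat)


def percent(responses, predicate) -> float:
    # Naive, unweighted percentage of responses satisfying the predicate.
    if not responses:
        raise ValueError("no responses")
    return 100.0 * sum(predicate(r) for r in responses) / len(responses)


responses = [
    Response(fever=True, cough=True),                # CLI and ILI
    Response(fever=True, sore_throat=True),          # ILI only
    Response(cough=True),                            # neither (no fever)
    Response(fever=True, shortness_of_breath=True),  # CLI only
]
print(percent(responses, has_cli))  # 50.0
print(percent(responses, has_ili))  # 50.0
```

Note that fever is required by both definitions, so a respondent with cough alone counts toward neither.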
 
 | Signals | Description |
 | --- | --- |
@@ -396,6 +402,50 @@ below](#survey-weighting) to be more representative of state demographics, are
 also available. These have names beginning `smoothed_w`, such as
 `smoothed_wdepressed_14d`.
 
+## Limitations
+
+When interpreting the signals above, it is important to keep in mind several
+limitations of this survey data.
+
+* **Survey population.** People are eligible to participate in the survey if
+they are age 18 or older and they are an active user of Facebook. The survey
+data does not report on children under age 18, and the Facebook adult user
+population may differ from the United States population generally in important
+ways. We use our [survey weighting](#survey-weighting) to adjust the estimates
+to match age and gender demographics by state, but this process doesn't adjust
+for other demographic biases we may not be aware of.
+* **Non-response bias.** The survey is voluntary, and people who accept the
+invitation when it is presented to them on Facebook may be different from
+those who do not. The [survey weights provided by Facebook](#survey-weighting)
+attempt to model the probability of response for each user and hence adjust
+for this, but it is difficult to tell if these weights account for all
+possible non-response bias.
+* **Social desirability.** Previous survey research has shown that people's
+responses to surveys are often biased by what responses they believe are
+socially desirable or acceptable. For example, if there is widespread
+pressure to wear masks, respondents who do *not* wear masks may feel pressured
+to answer that they *do*. This survey is anonymous and online, meaning we
+expect the social desirability effect to be smaller, but it may still be
+present.
+* **False responses.** As with anything on the Internet, a small percentage of
+users give deliberately incorrect responses. We discard a small number of
+responses that are obviously false, but do not perform extensive filtering.
+However, the large size of the study, and our procedure for ensuring that each
+respondent can only be counted once when they are invited to take the survey,
+prevent individual respondents from having a large effect on results.
+* **Repeat invitations.** Individual respondents can be invited by Facebook to
+take the survey several times. Usually Facebook only re-invites a respondent
+after one month. Hence estimates of values on a single day are calculated
+using independent survey responses from unique respondents (or, at least,
+unique Facebook accounts), whereas estimates from different months may involve
+the same respondents.
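To make the role of the weights mentioned in the first two bullets concrete, here is a minimal sketch of a weighted percentage, assuming each response carries a single positive weight. The function name and the example weights are hypothetical; the actual weighting procedure is the one described under Survey Weighting, not this sketch.

```python
def weighted_percent(answers, weights):
    """Weighted percentage of True answers: 100 * sum(w_i * y_i) / sum(w_i)."""
    if len(answers) != len(weights):
        raise ValueError("answers and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return 100.0 * sum(w for y, w in zip(answers, weights) if y) / total


# Hypothetical example: the two "yes" respondents belong to a group that is
# over-represented among respondents, so they receive smaller weights.
answers = [True, True, False, False]
weights = [1.0, 1.0, 3.0, 3.0]

print(weighted_percent(answers, [1.0] * 4))  # 50.0 (unweighted)
print(weighted_percent(answers, weights))    # 25.0 (weighted)
```

Weights of this kind can only correct for characteristics they were built from, which is why the bullets above caution that unmodeled biases may remain.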
+
+Whenever possible, you should compare this data to other independent sources. We
+believe that while these biases may affect point estimates -- that is, they may
+bias estimates on a specific day up or down -- the biases should not change
+strongly over time. This means that *changes* in signals, such as increases or
+decreases, are likely to represent true changes in the underlying population,
+even if point estimates are biased.
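The reasoning in this paragraph can be illustrated with a toy calculation, assuming the simplest case of a bias that is additive and exactly constant over time. All numbers below are made up, and real biases need not be additive or perfectly stable.

```python
def changes(series):
    """Day-over-day differences of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical true values of a signal on four consecutive days.
true_values = [1.0, 1.5, 2.5, 2.0]

# Assumption: every daily estimate is shifted by the same constant bias.
bias = 0.75
observed = [v + bias for v in true_values]

print(observed)           # [1.75, 2.25, 3.25, 2.75] -- every point estimate is off
print(changes(observed))  # [0.5, 1.0, -0.5] -- identical to the true changes
```

Under this assumption the bias cancels in every difference, so increases and decreases are recovered exactly even though no single point estimate is correct. If the bias drifts over time, the cancellation is only approximate.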
 
 ## Survey Weighting
 