Commit 7d85e14

Merge pull request #1484 from cmu-delphi/Tina_youtube-survey_doc
Creates Youtube-survey doc page
2 parents d3c2951 + d790345 commit 7d85e14

2 files changed

+177
-1
lines changed

docs/api/covidcast-signals/fb-survey.md

+2-1
@@ -126,7 +126,8 @@ our survey, this means a and (b or c).
126126

127127
COVID-like illness or CLI is not a standard indicator. Through our discussions
128128
with the CDC, we chose to define it as: fever along with cough or shortness of
129-
breath or difficulty breathing.
129+
breath or difficulty breathing. From the list of symptoms from Q1 on
130+
our survey, this means a and (c or d or e).
130131

131132
Symptoms alone are not sufficient to diagnose influenza or coronavirus
132133
infections, and so these ILI and CLI indicators are *not* expected to be
@@ -0,0 +1,175 @@
1+
---
2+
title: Youtube Survey
3+
parent: Inactive Signals
4+
grand_parent: COVIDcast Main Endpoint
5+
---
6+
7+
[//]: # (code at https://github.com/cmu-delphi/covid-19/tree/deeb4dc1e9a30622b415361ef6b99198e77d2a94/youtube)
8+
9+
# Youtube Survey
10+
{: .no_toc}
11+
12+
* **Source name:** `youtube-survey`
13+
* **Earliest issue available:** May 01, 2020
14+
* **Number of data revisions since May 19, 2020:** 0
15+
* **Date of last change:** Never
16+
* **Available for:** state (see [geography coding docs](../covidcast_geography.md))
17+
* **Time type:** day (see [date format docs](../covidcast_times.md))
18+
* **License:** [CC BY-NC](../covidcast_licensing.md#creative-commons-attribution-noncommercial)
19+
20+
## Overview
21+
22+
This data source is based on a short survey about COVID-19-like illness
23+
run by the Delphi group at Carnegie Mellon.
24+
[Youtube directed](https://9to5google.com/2020/04/29/google-covid-19-cmu-research-survey/)
25+
a random sample of its users to these surveys, which were
26+
voluntary. Users age 18 or older were eligible to complete the surveys, and
27+
their survey responses are held by CMU. No individual survey responses are
28+
shared back to Youtube.
29+
30+
This survey was a pared-down version of the
31+
[COVID-19 Trends and Impact Survey (CTIS)](../../symptom-survey/),
32+
collecting data only about COVID-19 symptoms. CTIS is much longer-running
33+
and more detailed, also collecting belief and behavior data. CTIS also reports
34+
demographic-corrected versions of some metrics. See our
35+
[surveys page](https://delphi.cmu.edu/covid19/ctis/) for more detail
36+
about how CTIS works.
37+
38+
The two surveys report some of the same metrics. Although the metrics are nominally the same,
39+
note that values from the same dates differ between the two surveys for
40+
[unknown reasons](#limitations).
41+
42+
As of late April 2020, the number of Youtube survey responses we received each
43+
day was 4-7 thousand. This was not enough coverage to report at finer
44+
geographic levels, so this indicator only reports at the state level. The
45+
survey ran from April 21, 2020 to June 17, 2020, collecting about 159
46+
thousand responses in the United States in that time.
47+
48+
We produce [influenza-like and COVID-like illness indicators](#ili-and-cli-indicators)
49+
based on the survey data.
50+
51+
## Table of Contents
52+
{: .no_toc .text-delta}
53+
54+
1. TOC
55+
{:toc}
56+
57+
## Survey Text and Questions
58+
59+
The survey contains the following 5 questions:
60+
61+
1. In the past 24 hours, have you or anyone in your household experienced any of the following:
62+
- (a) Fever (100 °F or higher)
63+
- (b) Sore throat
64+
- (c) Cough
65+
- (d) Shortness of breath
66+
- (e) Difficulty breathing
67+
2. How many people in your household (including yourself) are sick (fever, along with at least one other symptom from the above list)?
68+
3. How many people are there in your household (including yourself)?
69+
4. What is your current ZIP code?
70+
5. How many additional people in your local community that you know personally are sick (fever, along with at least one other symptom from the above list)?
71+
72+
73+
## ILI and CLI Indicators
74+
75+
We define COVID-like illness (fever, along with cough, or shortness of breath,
76+
or difficulty breathing) or influenza-like illness (fever, along with cough or
77+
sore throat) for use in forecasting and modeling. Using this survey data, we
78+
estimate the percentage of people (age 18 or older) who have a COVID-like
79+
illness, or influenza-like illness, in a given location, on a given day.
80+
81+
| Signals | Description |
82+
| --- | --- |
83+
| `raw_cli` and `smoothed_cli` | Estimated percentage of people with COVID-like illness <br/> **Earliest date available:** 2020-04-21 |
84+
| `raw_ili` and `smoothed_ili` | Estimated percentage of people with influenza-like illness <br/> **Earliest date available:** 2020-04-21 |
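
As a rough illustration of how these signals might be requested, the sketch below builds a query URL for the COVIDcast endpoint. The base URL and parameter names (`data_source`, `signal`, `time_type`, `geo_type`, `time_values`, `geo_value`) are assumptions based on the general COVIDcast API conventions, and `build_query_url` is a hypothetical helper; check the API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base URL of the COVIDcast endpoint; verify against the API docs.
BASE_URL = "https://api.delphi.cmu.edu/epidata/covidcast/"


def build_query_url(signal: str, geo_value: str, start_day: str, end_day: str) -> str:
    """Build a query URL for one youtube-survey signal (hypothetical helper).

    Days are given as YYYYMMDD strings; the range is sent as "start-end".
    """
    params = {
        "data_source": "youtube-survey",
        "signal": signal,          # e.g. "raw_cli", "smoothed_ili"
        "time_type": "day",        # this source only reports daily values
        "geo_type": "state",       # this source only reports at the state level
        "time_values": f"{start_day}-{end_day}",
        "geo_value": geo_value,    # lowercase state abbreviation, e.g. "pa"
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url("smoothed_cli", "pa", "20200501", "20200507")
print(url)
```

The sketch only constructs the URL; fetching it and parsing the response are left out.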
85+
86+
Influenza-like illness or ILI is a standard indicator, and is defined by the CDC
87+
as: fever along with sore throat or cough. From the list of symptoms from Q1 on
88+
our survey, this means a and (b or c).
89+
90+
COVID-like illness or CLI is not a standard indicator. Through our discussions
91+
with the CDC, we chose to define it as: fever along with cough or shortness of
92+
breath or difficulty breathing. From the list of symptoms from Q1 on
93+
our survey, this means a and (c or d or e).
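
The two definitions above can be written as boolean expressions over the Q1 symptom list. This is a minimal sketch: the `Q1Response` class and its field names are hypothetical and are not taken from the survey pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Q1Response:
    """Answers to Q1; the field names are hypothetical."""
    fever: bool                 # (a)
    sore_throat: bool           # (b)
    cough: bool                 # (c)
    shortness_of_breath: bool   # (d)
    difficulty_breathing: bool  # (e)


def is_ili(r: Q1Response) -> bool:
    """Influenza-like illness: a and (b or c)."""
    return r.fever and (r.sore_throat or r.cough)


def is_cli(r: Q1Response) -> bool:
    """COVID-like illness: a and (c or d or e)."""
    return r.fever and (r.cough or r.shortness_of_breath or r.difficulty_breathing)


# Fever with sore throat only: counts as ILI but not CLI.
example = Q1Response(fever=True, sore_throat=True, cough=False,
                     shortness_of_breath=False, difficulty_breathing=False)
print(is_ili(example), is_cli(example))
```

Fever with cough satisfies both definitions, so the two indicators overlap.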
94+
95+
Symptoms alone are not sufficient to diagnose influenza or coronavirus
96+
infections, and so these ILI and CLI indicators are *not* expected to be
97+
unbiased estimates of the true rate of influenza or coronavirus infections.
98+
These symptoms can be caused by many other conditions, and many true infections
99+
can be asymptomatic. Instead, we expect these indicators to be useful for
100+
comparison across the United States and across time, to determine where symptoms
101+
appear to be increasing.
102+
103+
104+
## Estimation
105+
106+
### Estimating Percent ILI and CLI
107+
108+
Estimates are calculated using the
109+
[same method as CTIS](./fb-survey#estimating-percent-ili-and-cli).
110+
However, the Youtube survey estimates are not weighted.
111+
112+
### Smoothing
113+
114+
The smoothed versions of all `youtube-survey` signals (with `smoothed` prefix) are
115+
calculated using seven-day pooling. For example, the estimate reported for June
116+
7 in a specific geographical area is formed by
117+
collecting all surveys completed between June 1 and 7 (inclusive) and using that
118+
data in the estimation procedures described above. Because the smoothed signals combine
119+
information across multiple days, they have larger sample sizes and hence are
120+
available for more locations than the raw signals.
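
The pooling window described above can be sketched as follows. The function names and the `responses_by_day` structure are hypothetical; the sketch only shows which days feed a smoothed estimate, not the estimation itself.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def pooling_window(target: date, days: int = 7) -> Tuple[date, date]:
    """Return the inclusive (start, end) dates pooled for `target`."""
    return target - timedelta(days=days - 1), target


def pool_responses(responses_by_day: Dict[date, List[dict]], target: date) -> List[dict]:
    """Collect every response completed inside the window ending on `target`."""
    start, end = pooling_window(target)
    pooled: List[dict] = []
    for day in sorted(responses_by_day):
        if start <= day <= end:
            pooled.extend(responses_by_day[day])
    return pooled


# The estimate for June 7 pools surveys completed June 1 through June 7.
print(pooling_window(date(2020, 6, 7)))
```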
121+
122+
## Lag and Backfill
123+
124+
This indicator has a lag of 2 days. Reported values can be revised for one
125+
day (corresponding to a lag of 3 days), due to how we receive survey
126+
responses. However, these revisions tend to be associated with minimal changes in
127+
value.
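
To make the timing concrete, the sketch below computes issue dates under one reading of the paragraph above: a value for a given day is first published 2 days later and may be revised once, 3 days after that day. The constant and function names are hypothetical.

```python
from datetime import date, timedelta

LAG_DAYS = 2        # days between the reference date and its first issue
REVISION_DAYS = 1   # additional days during which the value may be revised


def first_issue(reference: date) -> date:
    """Date on which a value for `reference` is first published (assumed reading)."""
    return reference + timedelta(days=LAG_DAYS)


def final_issue(reference: date) -> date:
    """Last date on which the value for `reference` may still change."""
    return reference + timedelta(days=LAG_DAYS + REVISION_DAYS)


ref = date(2020, 6, 7)
print(first_issue(ref), final_issue(ref))
```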
128+
129+
130+
## Limitations
131+
132+
When interpreting the signals above, it is important to keep in mind several
133+
limitations of this survey data.
134+
135+
* **Survey population.** People are eligible to participate in the survey if
136+
they are age 18 or older, they are currently located in the USA, and they are
137+
an active user of Youtube. The survey data does not report on children under
138+
age 18, and the Youtube adult user population may differ from the United
139+
States population generally in important ways. We don't adjust for any
140+
demographic biases.
141+
* **Non-response bias.** The survey is voluntary, and people who accept the
142+
invitation when it is presented to them on Youtube may be different from
143+
those who do not.
144+
* **Social desirability.** Previous survey research has shown that people's
145+
responses to surveys are often biased by what responses they believe are
146+
socially desirable or acceptable. For example, if there is widespread
147+
pressure to wear masks, respondents who do *not* wear masks may feel pressured
148+
to answer that they *do*. This survey is anonymous and online, meaning we
149+
expect the social desirability effect to be smaller, but it may still be
150+
present.
151+
152+
Whenever possible, you should compare this data to other independent sources. We
153+
believe that while these biases may affect point estimates -- that is, they may
154+
bias estimates on a specific day up or down -- the biases should not change
155+
strongly over time. This means that *changes* in signals, such as increases or
156+
decreases, are likely to represent true changes in the underlying population,
157+
even if point estimates are biased.
158+
159+
### Privacy Restrictions
160+
161+
To protect respondent privacy, we discard any estimate that is based on fewer than 100 survey responses. For
162+
signals reported using a 7-day average (those beginning with `smoothed_`), this
163+
means a geographic area must have at least 100 responses in 7 days to be
164+
reported.
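
The threshold rule can be illustrated with a short filter. This is a sketch under assumptions: the record layout (`geo_value`, `value`, `sample_size`) and the function name are hypothetical, and the sample size is taken to be the number of responses behind the estimate (the pooled 7-day count for smoothed signals).

```python
from typing import Dict, List

MIN_RESPONSES = 100  # estimates based on fewer responses are discarded


def apply_privacy_filter(estimates: List[Dict]) -> List[Dict]:
    """Keep only estimates backed by at least MIN_RESPONSES survey responses."""
    return [e for e in estimates if e["sample_size"] >= MIN_RESPONSES]


# Made-up records for illustration only.
estimates = [
    {"geo_value": "pa", "value": 0.8, "sample_size": 240},
    {"geo_value": "wy", "value": 1.1, "sample_size": 37},
    {"geo_value": "vt", "value": 0.5, "sample_size": 100},
]
print([e["geo_value"] for e in apply_privacy_filter(estimates)])
```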
165+
166+
This affects some items more than others. It affects some geographic areas
167+
more than others, particularly areas with smaller populations. This effect is
168+
less pronounced with smoothed signals, since responses are pooled across a
169+
longer time period.
170+
171+
172+
## Source and Licensing
173+
174+
This indicator aggregates responses from a Delphi-run survey that is hosted on the Youtube platform.
175+
The data is licensed as [CC BY-NC](../covidcast_licensing.md#creative-commons-attribution-noncommercial).
