| 1 | +--- |
| 2 | +title: Youtube Survey |
| 3 | +parent: Inactive Signals |
| 4 | +grand_parent: COVIDcast Main Endpoint |
| 5 | +--- |
| 6 | + |
| 7 | +[//]: # (code at https://github.com/cmu-delphi/covid-19/tree/deeb4dc1e9a30622b415361ef6b99198e77d2a94/youtube) |
| 8 | + |
| 9 | +# Youtube Survey |
| 10 | +{: .no_toc} |
| 11 | + |
| 12 | +* **Source name:** `youtube-survey` |
| 13 | +* **Earliest issue available:** May 01, 2020 |
| 14 | +* **Number of data revisions since May 19, 2020:** 0 |
| 15 | +* **Date of last change:** Never |
| 16 | +* **Available for:** state (see [geography coding docs](../covidcast_geography.md)) |
| 17 | +* **Time type:** day (see [date format docs](../covidcast_times.md)) |
| 18 | +* **License:** [CC BY-NC](../covidcast_licensing.md#creative-commons-attribution-noncommercial) |
| 19 | + |
| 20 | +## Overview |
| 21 | + |
| 22 | +This data source is based on a short survey about COVID-19-like illness |
| 23 | +run by the Delphi group at Carnegie Mellon. |
| 24 | +[YouTube directed](https://9to5google.com/2020/04/29/google-covid-19-cmu-research-survey/)
| 25 | +a random sample of its users to the survey, which was
| 26 | +voluntary. Users age 18 or older were eligible to complete the survey, and
| 27 | +their survey responses are held by CMU. No individual survey responses are
| 28 | +shared back to YouTube.
| 29 | + |
| 30 | +This survey was a pared-down version of the |
| 31 | +[COVID-19 Trends and Impact Survey (CTIS)](../../symptom-survey/), |
| 32 | +collecting data only about COVID-19 symptoms. CTIS is much longer-running |
| 33 | +and more detailed, also collecting belief and behavior data. CTIS also reports |
| 34 | +demographic-corrected versions of some metrics. See our |
| 35 | +[surveys page](https://delphi.cmu.edu/covid19/ctis/) for more detail |
| 36 | +about how CTIS works. |
| 37 | + |
| 38 | +The two surveys report some of the same metrics. Although these metrics are
| 39 | +nominally the same, values for the same dates differ between the two surveys for
| 40 | +[unknown reasons](#limitations).
| 41 | + |
| 42 | +As of late April 2020, we received 4,000 to 7,000 YouTube survey responses each
| 43 | +day. This was not enough coverage to report at finer
| 44 | +geographic levels, so this indicator reports only at the state level. The
| 45 | +survey ran from April 21, 2020 to June 17, 2020, collecting about 159,000
| 46 | +responses in the United States in that time.
| 47 | + |
| 48 | +We produce [influenza-like and COVID-like illness indicators](#ili-and-cli-indicators) |
| 49 | +based on the survey data. |
| 50 | + |
| 51 | +## Table of Contents |
| 52 | +{: .no_toc .text-delta} |
| 53 | + |
| 54 | +1. TOC |
| 55 | +{:toc} |
| 56 | + |
| 57 | +## Survey Text and Questions |
| 58 | + |
| 59 | +The survey contains the following 5 questions: |
| 60 | + |
| 61 | +1. In the past 24 hours, have you or anyone in your household experienced any of the following: |
| 62 | + - (a) Fever (100 °F or higher) |
| 63 | + - (b) Sore throat |
| 64 | + - (c) Cough |
| 65 | + - (d) Shortness of breath |
| 66 | + - (e) Difficulty breathing |
| 67 | +2. How many people in your household (including yourself) are sick (fever, along with at least one other symptom from the above list)? |
| 68 | +3. How many people are there in your household (including yourself)? |
| 69 | +4. What is your current ZIP code? |
| 70 | +5. How many additional people in your local community that you know personally are sick (fever, along with at least one other symptom from the above list)? |
| 71 | + |
| 72 | + |
| 73 | +## ILI and CLI Indicators |
| 74 | + |
| 75 | +For use in forecasting and modeling, we define COVID-like illness (CLI) as fever
| 76 | +along with cough, shortness of breath, or difficulty breathing, and
| 77 | +influenza-like illness (ILI) as fever along with cough or sore throat. Using
| 78 | +this survey data, we estimate the percentage of people (age 18 or older) who
| 79 | +have a COVID-like illness, or an influenza-like illness, in a given location on a given day.
| 80 | + |
| 81 | +| Signals | Description | |
| 82 | +| --- | --- | |
| 83 | +| `raw_cli` and `smoothed_cli` | Estimated percentage of people with COVID-like illness <br/> **Earliest date available:** 2020-04-21 | |
| 84 | +| `raw_ili` and `smoothed_ili` | Estimated percentage of people with influenza-like illness <br/> **Earliest date available:** 2020-04-21 | |
| 85 | + |
| 86 | +Influenza-like illness or ILI is a standard indicator, and is defined by the CDC |
| 87 | +as: fever along with sore throat or cough. From the list of symptoms from Q1 on |
| 88 | +our survey, this means a and (b or c). |
| 89 | + |
| 90 | +COVID-like illness or CLI is not a standard indicator. Through our discussions |
| 91 | +with the CDC, we chose to define it as: fever along with cough or shortness of |
| 92 | +breath or difficulty breathing. From the list of symptoms from Q1 on |
| 93 | +our survey, this means a and (c or d or e). |
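
The two definitions above can be sketched as boolean checks over the Q1 answers. This is a minimal illustration, not the pipeline's actual code: the `Symptoms` class and its field names are hypothetical, and because Q1 asks about the respondent or anyone in their household, each record is assumed to describe one reported set of symptoms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symptoms:
    """Answers to Q1 of the survey; the field names are hypothetical."""
    fever: bool                 # (a)
    sore_throat: bool           # (b)
    cough: bool                 # (c)
    shortness_of_breath: bool   # (d)
    difficulty_breathing: bool  # (e)


def is_ili(s: Symptoms) -> bool:
    # Influenza-like illness: a and (b or c)
    return s.fever and (s.sore_throat or s.cough)


def is_cli(s: Symptoms) -> bool:
    # COVID-like illness: a and (c or d or e)
    return s.fever and (
        s.cough or s.shortness_of_breath or s.difficulty_breathing
    )
```

Fever is required by both definitions, and cough counts toward both, so a report of fever with cough satisfies ILI and CLI at once.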
| 94 | + |
| 95 | +Symptoms alone are not sufficient to diagnose influenza or coronavirus |
| 96 | +infections, and so these ILI and CLI indicators are *not* expected to be |
| 97 | +unbiased estimates of the true rate of influenza or coronavirus infections. |
| 98 | +These symptoms can be caused by many other conditions, and many true infections |
| 99 | +can be asymptomatic. Instead, we expect these indicators to be useful for |
| 100 | +comparison across the United States and across time, to determine where symptoms |
| 101 | +appear to be increasing. |
| 102 | + |
| 103 | + |
| 104 | +## Estimation |
| 105 | + |
| 106 | +### Estimating Percent ILI and CLI |
| 107 | + |
| 108 | +Estimates are calculated using the |
| 109 | +[same method as CTIS](./fb-survey#estimating-percent-ili-and-cli). |
| 110 | +Unlike CTIS, however, the YouTube survey estimates are not weighted.
| 111 | + |
| 112 | +### Smoothing |
| 113 | + |
| 114 | +The smoothed versions of all `youtube-survey` signals (those with the `smoothed_` prefix) are
| 115 | +calculated using seven-day pooling. For example, the estimate reported for June
| 116 | +7 in a specific geographical area is formed by |
| 117 | +collecting all surveys completed between June 1 and 7 (inclusive) and using that |
| 118 | +data in the estimation procedures described above. Because the smoothed signals combine |
| 119 | +information across multiple days, they have larger sample sizes and hence are |
| 120 | +available for more locations than the raw signals. |
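
The pooling window described above can be sketched as follows. This is an illustrative example only: the function names and the `(completion_date, response)` tuple layout are assumptions, and the estimation step that would run on the pooled responses is omitted.

```python
from datetime import date, timedelta
from typing import List, Tuple


def pooling_window(target: date, days: int = 7) -> Tuple[date, date]:
    """Return the inclusive (start, end) dates pooled for `target`."""
    return target - timedelta(days=days - 1), target


def pool_responses(
    responses: List[Tuple[date, dict]], target: date
) -> List[dict]:
    """Keep the responses completed inside the pooling window for `target`."""
    start, end = pooling_window(target)
    return [resp for completed, resp in responses if start <= completed <= end]
```

For a target date of June 7, 2020, `pooling_window` returns June 1 through June 7, matching the example in the text.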
| 121 | + |
| 122 | +## Lag and Backfill |
| 123 | + |
| 124 | +This indicator has a lag of 2 days. Reported values can be revised for one
| 125 | +day (corresponding to a lag of 3 days), due to how we receive survey
| 126 | +responses. However, these revisions tend to change values only
| 127 | +minimally.
| 128 | + |
| 129 | + |
| 130 | +## Limitations |
| 131 | + |
| 132 | +When interpreting the signals above, it is important to keep in mind several |
| 133 | +limitations of this survey data. |
| 134 | + |
| 135 | +* **Survey population.** People are eligible to participate in the survey if |
| 136 | + they are age 18 or older, they are currently located in the USA, and they are |
| 137 | + an active user of YouTube. The survey data does not report on children under
| 138 | + age 18, and the YouTube adult user population may differ from the United
| 139 | + States population generally in important ways. We don't adjust for any |
| 140 | + demographic biases. |
| 141 | +* **Non-response bias.** The survey is voluntary, and people who accept the |
| 142 | + invitation when it is presented to them on YouTube may be different from
| 143 | + those who do not. |
| 144 | +* **Social desirability.** Previous survey research has shown that people's |
| 145 | + responses to surveys are often biased by what responses they believe are |
| 146 | + socially desirable or acceptable. For example, if there is widespread
| 147 | + pressure to wear masks, respondents who do *not* wear masks may feel pressured |
| 148 | + to answer that they *do*. This survey is anonymous and online, meaning we |
| 149 | + expect the social desirability effect to be smaller, but it may still be |
| 150 | + present. |
| 151 | + |
| 152 | +Whenever possible, you should compare this data to other independent sources. We |
| 153 | +believe that while these biases may affect point estimates -- that is, they may |
| 154 | +bias estimates on a specific day up or down -- the biases should not change |
| 155 | +strongly over time. This means that *changes* in signals, such as increases or |
| 156 | +decreases, are likely to represent true changes in the underlying population, |
| 157 | +even if point estimates are biased. |
| 158 | + |
| 159 | +### Privacy Restrictions |
| 160 | + |
| 161 | +To protect respondent privacy, we discard any estimate that is based on fewer than 100 survey responses. For |
| 162 | +signals reported using a 7-day average (those beginning with `smoothed_`), this |
| 163 | +means a geographic area must have at least 100 responses in 7 days to be |
| 164 | +reported. |
| 165 | + |
| 166 | +This affects some items more than others. It affects some geographic areas
| 167 | +more than others, particularly areas with smaller populations. This effect is
| 168 | +less pronounced with smoothed signals, since responses are pooled across a
| 169 | +longer time period.
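
The suppression rule can be sketched as a simple filter. This is a hypothetical illustration: the `MIN_RESPONSES` constant, the function name, and the `{state: (value, sample_size)}` layout are assumptions, not the pipeline's actual interface.

```python
from typing import Dict, Tuple

# Estimates based on fewer than this many responses are discarded.
MIN_RESPONSES = 100


def drop_small_samples(
    estimates: Dict[str, Tuple[float, int]]
) -> Dict[str, Tuple[float, int]]:
    """Keep only estimates backed by at least MIN_RESPONSES responses.

    `estimates` maps a state code to (estimated value, sample size).
    For smoothed signals, the sample size is the 7-day pooled count.
    """
    return {
        geo: (value, n)
        for geo, (value, n) in estimates.items()
        if n >= MIN_RESPONSES
    }
```

An estimate built on exactly 100 responses is kept, while one built on 99 is dropped.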
| 170 | + |
| 171 | + |
| 172 | +## Source and Licensing |
| 173 | + |
| 174 | +This indicator aggregates responses from a Delphi-run survey that was hosted on the YouTube platform.
| 175 | +The data is licensed as [CC BY-NC](../covidcast_licensing.md#creative-commons-attribution-noncommercial). |