Commit 46822ce

Merge pull request #407 from cmu-delphi/docs-survey-V2-doses
Document new V2 signal (# vaccine doses)
2 parents 9496806 + fbf7481 commit 46822ce

1 file changed: +20 -19 lines changed

docs/api/covidcast-signals/fb-survey.md

Lines changed: 20 additions & 19 deletions
@@ -189,7 +189,7 @@ $$Y_i$$ denote number of ILI and CLI cases in the household, respectively
 (computed according to the simple strategy described above), and let $$N_i$$
 denote the total number of people in the household, in survey $$i$$, out of
 $$m$$ surveys we collected. Then our estimates of $$p$$ and $$q$$ (see
-the [appendix](#appendix) for motivating details) are:
+the [appendix](#appendix) for motivating details) are:

 $$
 \hat{p} = 100 \cdot \frac{1}{m}\sum_{i=1}^m \frac{X_i}{N_i}
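The unweighted estimate shown in this hunk can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the function name `estimate_percent` and the sample numbers are hypothetical, and the sketch assumes every survey reports a household size of at least 1.

```python
def estimate_percent(cases, household_sizes):
    """Return 100 * (1/m) * sum(X_i / N_i) over m surveys.

    cases[i] is X_i (cases reported in household i) and
    household_sizes[i] is N_i (people in household i).
    """
    if not cases or len(cases) != len(household_sizes):
        raise ValueError("need one household size per case count")
    m = len(cases)
    # Average the per-household fractions, then scale to a percentage.
    return 100.0 * sum(x / n for x, n in zip(cases, household_sizes)) / m


# Hypothetical data: three surveys with ILI counts X_i and household sizes N_i.
p_hat = estimate_percent([1, 0, 2], [4, 2, 4])
```

Each household contributes its own case fraction, so a large household does not count for more than a small one in this average.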
@@ -236,7 +236,7 @@ b = 100 \cdot \frac{y}{n}.
 $$

 We will estimate $$a$$ and $$b$$ across the same 4 aggregation schemes as
-before.
+before.

 For a single survey, let:

@@ -333,6 +333,7 @@ also available. These have names beginning `smoothed_w`, such as
 | `smoothed_vaccine_likely_who` | Estimated percentage of respondents who would be more likely to get a COVID-19 vaccine if it were recommended to them by the World Health Organization, among respondents who have not yet been vaccinated. <br/> **Earliest date available:** 2021-01-20 | V4 |
 | `smoothed_vaccine_likely_govt_health` | Estimated percentage of respondents who would be more likely to get a COVID-19 vaccine if it were recommended to them by government health officials, among respondents who have not yet been vaccinated. <br/> **Earliest date available:** 2021-01-20 | V4 |
 | `smoothed_vaccine_likely_politicians` | Estimated percentage of respondents who would be more likely to get a COVID-19 vaccine if it were recommended to them by politicians, among respondents who have not yet been vaccinated. <br/> **Earliest date available:** 2021-01-20 | V4 |
+| `smoothed_received_2_vaccine_doses` | Estimated percentage of respondents who have received two doses of a COVID-19 vaccine, among respondents who have received either one or two doses of a COVID-19 vaccine. This item was shown to respondents starting in Wave 7. <br/> **Earliest date available:** 2021-02-06 | V2 |

 These indicators are based on questions added in Wave 6 of the survey,
 introduced on December 19, 2020; however, Delphi only enabled item V1 beginning
@@ -409,7 +410,7 @@ our [survey weight documentation page](../../symptom-survey/weights.md).

 As before, for a given aggregation unit (for example, daily-county), let $$X_i$$
 and $$Y_i$$ denote the numbers of ILI and CLI cases in household $$i$$,
-respectively (computed according to the simple strategy above), and let $$N_i$$
+respectively (computed according to the simple strategy above), and let $$N_i$$
 denote the total number of people in the household. Let $$i = 1, \dots, m$$
 denote the surveys started during the time period of interest and reported in a
 ZIP code intersecting the spatial unit of interest.
@@ -424,9 +425,9 @@ population is in each county.)
 Let $$w^{\text{init}}_i=w^{\text{part}}_i w^{\text{geodiv}}_i$$ denote the
 initial weight assigned to this survey. First, we adjust these initial weights
 to reduce sensitivity to any individual survey by "mixing" them with a uniform
-weighting across all relevant surveys. This prevents specific survey respondents
+weighting across all relevant surveys. This prevents specific survey respondents
 with high survey weights having disproportionate influence on the weighted
-estimates.
+estimates.

 Specifically, we select the smallest value of $$a \in [0.05, 1]$$ such that

@@ -438,8 +439,8 @@ for all $$i$$. If such a selection is impossible, then we have insufficient
 survey responses (less than 100), and do not produce an estimate for the given
 aggregation unit.

-Next, we rescale the weights $$w_i$$ over all $$i$$ so that $$\sum_{i=1}^m
-w_i=1$$. Then our adjusted estimates of $$p$$ and $$q$$ are:
+Next, we rescale the weights $$w_i$$ over all $$i$$ so that $$\sum_{i=1}^m
+w_i=1$$. Then our adjusted estimates of $$p$$ and $$q$$ are:

 $$
 \begin{aligned}
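The rescaling step described in this hunk can be sketched in Python as follows. The hunk cuts off before the adjusted formulas, so the weighted average in `weighted_percent` is an assumption (the weighted analogue of the unweighted estimate, $$100 \cdot \sum_i w_i X_i / N_i$$), not a quotation of the documented estimator. Both function names and the sample numbers are hypothetical, and the mixing step that precedes rescaling is not modeled here.

```python
def normalize_weights(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def weighted_percent(cases, household_sizes, weights):
    """Assumed weighted estimate: 100 * sum(w_i * X_i / N_i), with sum(w_i) = 1."""
    if not (len(cases) == len(household_sizes) == len(weights)):
        raise ValueError("inputs must have the same length")
    w = normalize_weights(weights)
    return 100.0 * sum(
        wi * x / n for wi, x, n in zip(w, cases, household_sizes)
    )


# Hypothetical data: the third survey carries twice the weight of the others.
p_hat_w = weighted_percent([1, 0, 2], [4, 2, 4], [1.0, 1.0, 2.0])
```

With equal weights this sketch reduces to the unweighted average of the per-household fractions.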
@@ -503,7 +504,7 @@ and $$V_i$$ denote the indicators that the survey respondent knows someone in
 their community with CLI, including and not including their household,
 respectively, for survey $$i$$, out of $$m$$ surveys collected. Also let
 $$w_i$$ be the self-normalized weight that accompanies survey $$i$$, as
-above. Then our adjusted estimates of $$a$$ and $$b$$ are:
+above. Then our adjusted estimates of $$a$$ and $$b$$ are:

 $$
 \begin{aligned}
@@ -531,13 +532,13 @@ importance sampling estimators.
 Here are some details behind the choice of estimators for [percent ILI and
 percent CLI](#ili-and-cli-indicators).

-Suppose there are $$h$$ households total in the underlying population, and for
-household $$i$$, denote $$\theta_i=N_i/n$$. Then note that the quantities of
-interest, $$p$$ and $$q$$, are
+Suppose there are $$h$$ households total in the underlying population, and for
+household $$i$$, denote $$\theta_i=N_i/n$$. Then note that the quantities of
+interest, $$p$$ and $$q$$, are

 $$
 p = \sum_{i=1}^h \frac{X_i}{N_i} \theta_i
-\quad\text{and}\quad
+\quad\text{and}\quad
 q = \sum_{i=1}^h \frac{Y_i}{N_i} \theta_i.
 $$

@@ -548,17 +549,17 @@ are simply

 $$
 \hat{p} = \frac{1}{m} \sum_{i \in S} \frac{X_i}{N_i}
-\quad\text{and}\quad
+\quad\text{and}\quad
 \hat{q} = \frac{1}{m} \sum_{i \in S} \frac{Y_i}{N_i},
 $$

-which are an equivalent way of writing our previously-defined estimates.
+which are an equivalent way of writing our previously-defined estimates.

 Note that we can again rewrite our quantities of interest as

 $$
-p = \frac{\mu_x}{\mu_n}
-\quad\text{and}\quad
+p = \frac{\mu_x}{\mu_n}
+\quad\text{and}\quad
 q = \frac{\mu_y}{\mu_n},
 $$

@@ -570,11 +571,11 @@ denotes the total number of households in the population.
 Suppose that instead of proportional sampling, we sampled households uniformly,
 resulting in $$S \subseteq \{1,\dots,h\}$$ denote sampled households, with
 $$m=|S|$$. Then the natural estimates of $$p$$ and $$q$$ are instead plug-in
-estimates of the numerators and denominators in the above,
+estimates of the numerators and denominators in the above,

 $$
 \tilde{p} = \frac{\bar{X}}{\bar{N}}
-\quad\text{and}\quad
+\quad\text{and}\quad
 \tilde{q} = \frac{\bar{X}}{\bar{N}}
 $$
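The difference between the two sampling designs discussed in this hunk can be sketched in Python as follows. Under sampling proportional to household size, the text averages per-household fractions; under uniform sampling, it takes the ratio of the sample means. The function names and sample numbers are hypothetical. The same functions would apply to CLI counts $$Y_i$$, assuming $$\tilde{q}$$ is meant to use $$\bar{Y}$$ in its numerator.

```python
def mean_of_ratios(cases, household_sizes):
    """Estimate for size-proportional sampling: (1/m) * sum(X_i / N_i)."""
    m = len(cases)
    return sum(x / n for x, n in zip(cases, household_sizes)) / m


def ratio_of_means(cases, household_sizes):
    """Plug-in estimate for uniform sampling: mean(X) / mean(N)."""
    m = len(cases)
    x_bar = sum(cases) / m
    n_bar = sum(household_sizes) / m
    return x_bar / n_bar


# Hypothetical sample of three households.
cases = [1, 0, 2]
sizes = [4, 2, 4]
p_hat = mean_of_ratios(cases, sizes)
p_tilde = ratio_of_means(cases, sizes)
```

The two estimates coincide when every sampled household has the same size and can differ otherwise, which is why the choice of estimator depends on how households are sampled.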
580581

@@ -597,7 +598,7 @@ evidence:
 household: individuals 18 years or older, who have a Facebook account. Hence
 if we posit that the number of "Facebook adults" scales linearly with the
 household size, which seems to us like a reasonable assumption, then sampling
-would still be proportional to household size. (Notice that this would
+would still be proportional to household size. (Notice that this would
 remain true no matter how small the linear coefficient is, that is, it would
 even be true if Facebook did not have good coverage over the US.)
