
Commit 0bd0f76

Update CTIS documentation to point to ICPSR (#1615)
Now that the CTIS data is archived with ICPSR, we can point there and deprecate pages describing access to microdata via CMU. ICPSR also archives the contingency tables, including a user guide, so we can remove the redundant contingency table documentation and point to ICPSR instead.
1 parent db9c6a9 commit 0bd0f76

7 files changed: +92 -222 lines changed

docs/symptom-survey/collaboration-revision.md

+5
@@ -2,10 +2,15 @@
 title: Collaboration and Survey Revision
 parent: <i>inactive</i> COVID-19 Trends and Impact Survey
 nav_order: 1
+nav_exclude: true
 ---

 # Collaboration and Survey Revision

+<div style="background-color:#f5f6fa; padding: 10px
+30px;"><strong>Update:</strong> CTIS data collection has ended. We are no longer
+revising the survey or hosting collaboration meetings.</div>
+
 Delphi continues to revise the COVID-19 Trends and Impact Survey (CTIS)
 instruments in order to prioritize items that have the greatest utility for the
 response to the COVID-19 pandemic. We conduct revisions in collaboration with

docs/symptom-survey/contingency-tables.md

+19 -159
@@ -8,176 +8,36 @@ nav_order: 4
 {: .no_toc}

 This documentation describes the fine-resolution contingency tables produced by
-grouping [US COVID-19 Trends and Impact Survey (CTIS)](./index.md) individual responses by various
-self-reported demographic features.
+grouping [US COVID-19 Trends and Impact Survey (CTIS)](./index.md) individual
+responses by various self-reported demographic features. The contingency tables
+are publicly available for download as a complete set from the Inter-university
+Consortium for Political Science Research (ICPSR):

-* [Weekly files](https://www.cmu.edu/delphi-web/surveys/weekly-rollup/)
-* [Monthly files](https://www.cmu.edu/delphi-web/surveys/monthly-rollup/)
+* Reinhart, Alex, Mejia, Robin, and Tibshirani, Ryan J. COVID-19 Trends and
+Impact Survey (CTIS), United States, 2020-2022. Inter-university Consortium
+for Political and Social Research [distributor], 2025-02-28.
+<https://doi.org/10.3886/ICPSR39207.v1>
+
+Select the dataset "DS0 Study-Level Files" to download the complete set of
+contingency tables and all survey documentation files, including the codebooks
+and an Aggregate Contingency Table User Guide that describes the data
+processing and file formats, and includes example R code.

 These contingency tables provide granular breakdowns of COVID-related topics
 such as vaccine uptake and acceptance. Compatible tables are also available for
 the [UMD Global CTIS](https://covidmap.umd.edu/) for more than 100 countries and
-territories worldwide, through [UMD's
-website](https://covidmap.umd.edu/umdcsvs/Contingency_Tables/).
+territories worldwide, also [through
+ICPSR](https://www.icpsr.umich.edu/web/ICPSR/studies/39206).

-These tables are more detailed than the [coarse aggregates reported in the COVIDcast Epidata API](../api/covidcast-signals/fb-survey.md), which are grouped
+These tables are more detailed than the [coarse aggregates reported in the
+COVIDcast Epidata API](../api/covidcast-signals/fb-survey.md), which are grouped
 only by geographic region. [Individual response data](survey-files.md) for the
-survey is available, but only to academic or nonprofit researchers who sign a
-Data Use Agreement, whereas these contingency tables are available to the
-general public.
+survey is available, but only to researchers who request restricted data access
+via ICPSR, whereas these contingency tables are available to the general public.

 Please see our survey [credits](index.md#credits) and [citation information](index.md#citing-the-survey)
 for information on how to cite this data if you use it in a publication.

 Our [Data and Sampling Errors](problems.md) documentation lists important
 updates for data users, including corrections to data or updates on data
 processing delays.
-
-## Table of contents
-{: .no_toc .text-delta}
-
-1. TOC
-{:toc}
-
-## Available Data
-
-We currently provide data files at several levels of geographic and temporal
-aggregation. The reason for this is that each file is filtered to only include
-estimates for a particular group if that group includes 100 or more responses.
-Providing several levels of granularity allows us to provide coverage for a
-variety of use cases. For example, users who need the most up-to-date data or
-are interested in time series analysis should use the weekly files, while
-those who want to study groups with smaller sample sizes should use the
-monthly files. Because monthly aggregates include more responses, they have
-lower missingness when grouping by several features at a time.
-
-* [Weekly files](https://www.cmu.edu/delphi-web/surveys/weekly-rollup/)
-* [Monthly files](https://www.cmu.edu/delphi-web/surveys/monthly-rollup/)
-
-Files contain all time periods for a given period type-aggregation
-type combination.
-
-Individual CSVs containing a single [week](https://www.cmu.edu/delphi-web/surveys/weekly/) or [month](https://www.cmu.edu/delphi-web/surveys/monthly/) for a given aggregation type are also available.
-
-### Dates
-
-The included files provide estimates for various metrics of interest over a
-period of either a full epiweek (or [MMWR
-week](https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf), a
-standardized numbering of weeks throughout the year) or a full calendar month.
-
-Note: If a survey item was introduced in the middle of an aggregation period,
-derived indicators will be included in aggregations for that period but will
-only use a partial week or month of data.
-
-### Regions
-
-At the moment, only nation-wide and state groupings are available.
-
-Facebook only invites users to take the survey if they appear, based on
-attributes in their Facebook profiles, to reside in the 50 states or
-Washington, DC. Puerto Rico is sampled separately as part of the
-[international version of the survey](https://covidmap.umd.edu/). If Facebook
-believes a user qualifies for the survey, but the user then replies that they
-live in Puerto Rico or another US territory, we do not include their response
-in the aggregations.
-
-### Privacy
-
-The aggregates are filtered to only include estimates for a particular group
-if that group includes 100 or more responses. Especially in the weekly
-aggregates, many of the state-level groups have been filtered out due to low
-sample size. In such cases, files that group by a single demographic of
-interest will likely provide more coverage.
-
-## File Format
-
-### Naming
-
-"Rollup" files containing all time periods for a given period type-aggregation
-type combination have names of the form:
-
-{period_type}_{geo_type}_{aggregation_type}.csv.gz
-
-Unless noted otherwise, the time period is always a complete month
-(`period_type` = `monthly`) or epiweek (`period_type` = `weekly`). `geo_type` is
-the geographic level responses were aggregated over. `aggregation_type` is a
-concatenated list of other grouping variables used, ordered alphabetically.
-Values for variables used in file naming align with those within files as
-specified in the column section below.
-
-### Columns
-
-Within a CSV, the first few columns store metadata of the aggregation:
-
-| Column | Description |
-| --- | --- |
-| `survey_geo` | Survey geography ("US") |
-| `period_start` | Date (yyyyMMdd) of first day of time period used in aggregation, in the Pacific time zone (UTC - 7) |
-| `period_end` | Date of last day of time period used in aggregation |
-| `period_val` | Month or week number |
-| `geo_type` | Geography type ("state", "nation") |
-| `aggregation_type` | Concatenated list of grouping variables, ordered alphabetically |
-| `country` | Country name ("United States") |
-| `ISO_3` | Three-letter ISO country code ("USA") |
-| `GID_0` | GADM level 0 ID |
-| `state` | State name; "Overall" if aggregation not grouped at the state level |
-| `GID_1` | GADM level 1 ID |
-| `state_fips` | State FIPS code; `NA` if aggregation not grouped at the state level |
-| `county` | County name; "Overall" if aggregation not grouped at the county level |
-| `county_fips` | County FIPS code; `NA` if aggregation not grouped at the county level |
-| `issue_date` | Date on which estimates were generated |
-
-These are followed by the grouping variables used in the aggregation, ordered
-alphabetically, and the indicators. Each indicator reports four columns
-(unrounded):
-
-* `val_<indicator name>`: the main value of interest, e.g., percent, average, or
-count, estimated using the [survey weights](weights.md) to better match state
-demographics
-* `se_<indicator name>`: the standard error of `val_<indicator name>`
-* `sample_size_<indicator name>`: the number of survey responses used to
-calculate `val_<indicator name>`
-* `represented_<indicator name>`: the number of people in the population that
-`val_<indicator name>` represents over all days in the given time period. This
-is the sum of [survey weights](./weights.md) for all survey responses
-used.
-
-All aggregates using the same set of grouping variables appear in a single CSV.
-
-### Missing Values
-
-Grouping variables (including region) will be missing (`NA`) to represent
-respondents who provided one or more responses to survey items used for
-indicators (e.g., vaccine uptake) but who did not provide a response to the
-survey item used for the particular grouping variable. For example, if
-grouping by gender, we would report the groups: male, female, other, and `NA`,
-respondents who did not provide a response to the gender question.
-
-For a given respondent group (25-34 year old healthcare workers in Nebraska,
-e.g.) sample size can vary by indicator because of the survey display logic.
-For example, all respondents are asked if they have received a COVID-19
-vaccination (item V1), but only those who say they *have* are asked how many
-doses they have received (item V2). This means that the sample size for V2 is
-smaller than that for V1. Because indicators are [censored](#privacy)
-individually, it is possible that V1-derived indicators will be reported for a
-given group while V2-derived indicators are not. In this case, the V2-derived
-indicator columns will be marked as missing (`NA`) for that group.
-
-## Indicators
-
-<div style="background-color:#f5f6fa; padding: 10px 30px;"><strong>Indicator
-codebook:</strong> Our <a href="contingency-codebook.csv">contingency table
-codebook (CSV)</a> lists all indicators available in the US contingency tables
-for download, and specifies the survey questions on which they are based. See
-the <a href="coding.html">survey instrument codebook</a> for the full text of
-all questions.</div>
-
-The files contain [weighted estimates](../api/covidcast-signals/fb-survey.md#survey-weighting-and-estimation)
-of the percent of respondents who fulfill one or several criteria. Estimates are
-broken out by state, age, gender, race, ethnicity, occupation, and health
-conditions.
-
-We plan to expand the list of indicators based on research needs; if you have a
-public health or research need for a particular variable not included in our
-current tables please contact us at <[email protected]>.
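
Editor's note: the removed File Format text describes a rollup naming scheme (`{period_type}_{geo_type}_{aggregation_type}.csv.gz`) and four column prefixes per indicator (`val_`, `se_`, `sample_size_`, `represented_`). The following is a minimal Python sketch of how a reader might work with those conventions. It is not taken from the Delphi codebase or from the ICPSR user guide (which ships R examples); the function names, the sample file name, and the indicator name `pct_vaccinated` are hypothetical.

```python
from collections import defaultdict

# The four per-indicator column prefixes listed in the removed documentation.
PREFIXES = ("val_", "se_", "sample_size_", "represented_")


def parse_rollup_name(filename):
    """Split '{period_type}_{geo_type}_{aggregation_type}.csv.gz' into parts.

    Only the first two underscores are treated as separators, on the
    assumption that aggregation_type may itself contain underscores.
    """
    suffix = ".csv.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"not a rollup file name: {filename!r}")
    parts = filename[: -len(suffix)].split("_", 2)
    if len(parts) != 3:
        raise ValueError(f"expected three name components: {filename!r}")
    period_type, geo_type, aggregation_type = parts
    return {
        "period_type": period_type,
        "geo_type": geo_type,
        "aggregation_type": aggregation_type,
    }


def group_indicator_columns(columns):
    """Map each indicator name to its val/se/sample_size/represented columns.

    Columns without one of the four prefixes (metadata and grouping
    variables) are skipped. A grouping variable whose name happens to start
    with one of the prefixes would be misclassified by this simple check.
    """
    grouped = defaultdict(dict)
    for col in columns:
        for prefix in PREFIXES:
            if col.startswith(prefix):
                grouped[col[len(prefix):]][prefix.rstrip("_")] = col
                break
    return dict(grouped)


# Hypothetical file name and column names, for illustration only.
name_parts = parse_rollup_name("weekly_state_age_gender.csv.gz")
indicator_columns = group_indicator_columns([
    "survey_geo",
    "state",
    "val_pct_vaccinated",
    "se_pct_vaccinated",
    "sample_size_pct_vaccinated",
    "represented_pct_vaccinated",
])
```

The file layout in the ICPSR archive may differ from the CMU-hosted files this text described, so check the Aggregate Contingency Table User Guide before relying on these conventions.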

docs/symptom-survey/data-access.md

+17 -27
@@ -20,13 +20,20 @@ characteristics are available for download.
 ## Getting Microdata Access

 De-identified individual survey responses can be made available to researchers
-associated with universities or non-profit organizations who sign a Data Use
-Agreement (DUA). To request access to the data please submit the information
-requested in [Facebook's page on obtaining data access](https://dataforgood.facebook.com/dfg/docs/covid-19-trends-and-impact-survey-request-for-data-access),
-which sets out the basic conditions and provides a form to request access. An
-[international version of CTIS](https://covidmap.umd.edu/) is conducted by the
-University of Maryland (UMD) and access can be requested through the same
-form.
+associated with universities or non-profit organizations who agree to a Data Use
+Agreement (DUA). The microdata is archived by the Inter-university Consortium
+for Political and Social Research (ICPSR) at the University of Michigan:
+
+* Reinhart, Alex, Mejia, Robin, and Tibshirani, Ryan J. COVID-19 Trends and
+Impact Survey (CTIS), United States, 2020-2022. Inter-university Consortium
+for Political and Social Research [distributor], 2025-02-28.
+<https://doi.org/10.3886/ICPSR39207.v1>
+
+Follow the link to view the data description and documentation, and to request
+access to the restricted microdata. The survey documentation, including full
+codebooks and user guides, is available for public download. Microdata access is
+no longer available through direct agreements with Carnegie Mellon University,
+so all access must be requested through ICPSR.

 The United States survey protocol has been reviewed by the Carnegie Mellon
 University Institutional Review Board with IRB ID STUDY2020_00000162.
@@ -44,26 +51,9 @@ Some important notes about obtaining access to the individual survey responses:
 * Part- or full-time employees of Facebook are **not** eligible to receive data
 access, since Delphi's agreement with Facebook to protect the privacy of
 respondents prohibits Facebook employees from receiving any microdata.
-* Because this survey is large and many groups have access, the Data Use
-Agreements are not negotiable.
-
-After you complete the request form, staff from Facebook and CMU will be in
-contact to guide you through the rest of the process. They will provide data use
-agreements for your institution to sign, and will also request a copy of your
-Institutional Review Board approval to verify you have ethical approval to
-conduct the research.
-
-After the DUAs are executed, we will ask you to fill out [this
-form](http://cmu.ca1.qualtrics.com/jfe/form/SV_89aVsYl29Oay4qq) to set up your
-microdata access. This form can be used for new research projects or adding new
-researchers to existing projects.
-
-After completing these forms, credentials for SFTP will be emailed to each
-individual on the team. Please **do not share your credentials** with other
-users. Only one person per research team needs to fill out this survey. You can
-list all relevant team members in one submission. For teams with more than 5
-members, please fill out an additional form(s) to cover your whole team.

 If you have questions about the process, or your IRB needs information
 about the survey for their review, contact us at
-
+<[email protected]>. For all questions about ICPSR's
+restricted data access process, contact ICPSR through the forms or email
+addresses on their website.

docs/symptom-survey/end-of-survey.md

+1 -5
@@ -57,11 +57,7 @@ continue to [request access](./data-access.md) to non-public, non-aggregated
 survey data for their research, and current approved data users will be able to
 continue accessing the non-aggregated data until their current data use
 agreements (DUA) expire. Researchers currently holding a fully executed DUA will
-have the option to extend their DUA after it expires. Though no new data will be
-collected after June 25, 2022, [Meta’s CTIS
-visualizations](https://dataforgood.facebook.com/covid-survey/) will continue to
-be available, and until the end of 2022, [JH CCP’s COVID Behaviors
-dashboard](https://covidbehaviors.org/) will as well.
+have the option to extend their DUA after it expires.


 ## CTIS Impact
