@@ -200,6 +200,87 @@ ggplot(x, aes(x = time_value, y = cases)) +
```
## Examples on Additional Keys in epi_df
+ In the following examples, we show how to create an `epi_df` with additional keys.
- ``` {r child = 'man/rmd/epi_df_example.Rmd'}
+ ### Convert a `tsibble` that has county code as an extra key
+ ``` {r}
+ ex1 <- tibble(
+   geo_value = rep(c("ca", "fl", "pa"), each = 3),
+   # county codes are kept as strings so that leading zeros (e.g. "06059") survive
+   county_code = c("06059", "06061", "06067",
+                   "12111", "12113", "12117",
+                   "42101", "42103", "42105"),
+   time_value = rep(seq(as.Date("2020-06-01"), as.Date("2020-06-03"),
+                        by = "day"), length.out = length(geo_value)),
+   value = 1:length(geo_value) + 0.01 * rnorm(length(geo_value))
+ ) %>%
+   as_tsibble(index = time_value, key = c(geo_value, county_code))
+
+ ex1 <- as_epi_df(x = ex1, geo_type = "state", time_type = "day", as_of = "2020-06-03")
+ ```
+
+ The metadata now includes `county_code` as an extra key.
+ ``` {r}
+ attr(ex1, "metadata")
+ ```
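The county codes in this example are five-digit FIPS codes, and several of them begin with a zero. As a small base-R sketch (the variable names are illustrative only), the snippet below shows why such codes are usually safer as character strings: R parses a literal like `06059` as the number 6059, so the leading zero is lost unless the code is quoted or re-padded.

```r
# Numeric literals drop leading zeros: 06059 is parsed as the number 6059.
fips_numeric <- c(06059, 12111, 42101)

# Storing the codes as character keeps all five digits.
fips_character <- c("06059", "12111", "42101")

# If codes were already read as numbers, sprintf() can re-pad them to
# five digits (this assumes the values are whole numbers).
fips_repadded <- sprintf("%05d", as.integer(fips_numeric))

print(fips_numeric)
print(fips_repadded)
print(identical(fips_repadded, fips_character))
```

Character codes also group and join predictably, since `"06059"` and `"6059"` are different keys.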
+
+
+ ### Dealing with misspecified column names
+
+ An `epi_df` requires columns named `geo_value` and `time_value`; if either is missing, `as_epi_df()` throws an error.
+ ``` {r, error = TRUE}
+ data.frame(
+   state = rep(c("ca", "fl", "pa"), each = 3), # misnamed
+   pol = rep(c("blue", "swing", "swing"), each = 3), # extra key
+   reported_date = rep(seq(as.Date("2020-06-01"), as.Date("2020-06-03"),
+                           by = "day"), length.out = 9), # misnamed
+   # data.frame() cannot refer to its own columns, so the row count (9) is written out
+   value = 1:9 + 0.01 * rnorm(9)
+ ) %>% as_epi_df()
```
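Before converting, it can help to check which required columns are absent. The following is a minimal base-R sketch; `missing_epi_cols()` is a hypothetical helper written for illustration, not a function from `epiprocess`, and the data are a simplified version of the example above.

```r
# Hypothetical helper (not part of epiprocess): report which of the
# required columns are absent from a data frame.
missing_epi_cols <- function(df, required = c("geo_value", "time_value")) {
  setdiff(required, names(df))
}

df <- data.frame(
  state = rep(c("ca", "fl", "pa"), each = 3),
  reported_date = rep(as.Date("2020-06-01") + 0:2, times = 3),
  value = 1:9
)

# Both required columns are missing under their expected names.
before <- missing_epi_cols(df)
print(before)

# Rename the misnamed columns in base R; nothing is missing afterwards.
names(df)[names(df) == "state"] <- "geo_value"
names(df)[names(df) == "reported_date"] <- "time_value"
after <- missing_epi_cols(df)
print(after)
```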
+
+ The columns can be renamed to match the `epi_df` format. Note that the example below also has an additional key, `pol`.
+ ``` {r}
+ ex2 <- tibble(
+   state = rep(c("ca", "fl", "pa"), each = 3), # misnamed
+   pol = rep(c("blue", "swing", "swing"), each = 3), # extra key
+   reported_date = rep(seq(as.Date("2020-06-01"), as.Date("2020-06-03"),
+                           by = "day"), length.out = length(state)), # misnamed
+   value = 1:length(state) + 0.01 * rnorm(length(state))
+ ) %>% data.frame()
+
+ head(ex2)
+
+ ex2 <- ex2 %>% rename(geo_value = state, time_value = reported_date) %>%
+   as_epi_df(geo_type = "state", as_of = "2020-06-03",
+             additional_metadata = c(other_keys = "pol"))
+
+ attr(ex2, "metadata")
+ ```
+
+
+ ### Adding additional keys to an `epi_df` object
+
+ In the examples above, the keys were added while converting objects that were not yet `epi_df` objects. Here we show how to add a key to an existing `epi_df` object.
+
+ We use a subset of data from the `covidcast` library.
+
+ ``` {r}
+ ex3 <- jhu_csse_county_level_subset %>%
+   filter(time_value > "2021-12-01", state_name == "Massachusetts") %>%
+   slice_tail(n = 6)
+
+ attr(ex3, "metadata") # geo_type is county currently
+ ```
+
+ Now we add the state (MA) as a new column and register it as a key in the metadata.
275
+ ``` {r}
276
+
277
+ ex3 <- ex3 %>%
278
+ as_tsibble() %>% # needed to add the additional metadata
279
+ mutate(state = rep("MA",6)) %>%
280
+ as_epi_df(additional_metadata = c(other_keys = "state"))
281
+
282
+ attr(ex3,"metadata")
283
+ ```
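After adding a key, one sanity check is whether the key columns together with `time_value` still identify at most one row each, which is the condition `tsibble` imposes on its key and index. The sketch below uses base R on a small made-up data frame; the county codes and case counts are toy values, not taken from the dataset above, and whether `epi_df` enforces the same condition is an assumption here.

```r
# Toy data (made-up values): two counties, one state, three days each.
df <- data.frame(
  geo_value = rep(c("25001", "25003"), each = 3),
  state = "MA",
  time_value = rep(as.Date("2021-12-29") + 0:2, times = 2),
  cases = c(10, 12, 9, 4, 6, 5)
)

# Columns that are meant to identify a row: geo, extra key, and time.
key_cols <- c("geo_value", "state", "time_value")

# Count rows whose key combination repeats an earlier row.
n_dup <- sum(duplicated(df[, key_cols]))
print(n_dup)

# Appending a copy of the first row introduces exactly one duplicate.
df_bad <- rbind(df, df[1, ])
n_dup_bad <- sum(duplicated(df_bad[, key_cols]))
print(n_dup_bad)
```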
+
+
+