get_test_data returning NA's if there are NA's in the most recent data #267

Open
@dsweber2

Description

This seems like a bug. An example of what I mean:

library(dplyr)
library(epiprocess)
library(epipredict)

jhu <- filter(
  case_death_rate_subset,
  time_value >= "2021-06-04",
  time_value <= "2021-12-31",
  geo_value %in% c("ca", "fl", "tx", "ny", "nj")
)
## Build the recipe on the same columns that get_test_data() will receive
r <- epi_recipe(jhu) %>%
  ## Occasionally, data reporting errors / corrections result in negative
  ## cases / deaths
  step_mutate(case_rate = pmax(case_rate, 0), death_rate = pmax(death_rate, 0)) %>%
  step_epi_lag(case_rate, death_rate, lag = c(0, 7)) %>%
  step_epi_ahead(death_rate, ahead = 7, role = "outcome") %>%
  step_epi_naomit()
geo_values <- jhu$geo_value %>% unique()
## Two extra days on which case_rate is missing for every location
one_day_nas <- tibble(
  geo_value = geo_values,
  time_value = as.Date("2022-01-01"),
  case_rate = NA_real_,
  death_rate = runif(length(geo_values))
)
second_day_nas <- one_day_nas %>%
  mutate(time_value = as.Date("2022-01-02"))
jhu_nad <- jhu %>%
  as_tibble() %>%
  bind_rows(one_day_nas, second_day_nas) %>%
  as_epi_df()
attributes(jhu_nad)$metadata$as_of <- max(jhu_nad$time_value) + 3
## The returned test data contains NAs in case_rate
get_test_data(r, jhu_nad)

The workflow where I actually hit this is unfortunately buried in the guts of exploration tooling. arx_forecaster sort of does the right thing, though it treats the last day with NA data as the last day with data.
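To make the "last day with data" distinction concrete, here is a minimal sketch that compares the latest time_value present in a data set with the latest day on which every predictor is actually observed. It assumes only dplyr; the `toy` data and the `last_complete_day()` helper are hypothetical illustrations, not part of epipredict, and this is not necessarily how get_test_data or arx_forecaster should handle the case.

```r
library(dplyr)

# Toy stand-in for jhu_nad (made-up values): two locations, with case_rate
# missing on the two most recent days.
toy <- tibble(
  geo_value = rep(c("ca", "fl"), each = 4),
  time_value = rep(as.Date("2021-12-30") + 0:3, times = 2),
  case_rate = c(10, 11, NA, NA, 20, 21, NA, NA),
  death_rate = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
)

# Hypothetical helper: for each location, the most recent time_value at
# which every column named in `cols` is non-missing.
last_complete_day <- function(data, cols) {
  data %>%
    filter(if_all(all_of(cols), ~ !is.na(.x))) %>%
    group_by(geo_value) %>%
    summarise(last_complete = max(time_value), .groups = "drop")
}

latest <- last_complete_day(toy, c("case_rate", "death_rate"))

# Latest day present in the data versus latest day with complete predictors
max(toy$time_value)
latest
```

In this toy data the two dates differ by two days, which is the gap that the NA rows in the example above introduce.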
