Commit 5cf2339

fix partial #8
1 parent b630f19 commit 5cf2339

14 files changed

+93
-94
lines changed

_freeze/epipredict/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

_freeze/flatline-forecaster/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

_freeze/forecast-framework/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

_freeze/preprocessing-and-models/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

_freeze/sliding-forecasters/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

_freeze/tidymodels-intro/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

_freeze/tidymodels-regression/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

epipredict.qmd

Lines changed: 16 additions & 16 deletions
@@ -10,21 +10,22 @@ At a high level, our goal with `{epipredict}` is to make running simple machine
 Serving both populations is the main motivation for our efforts, but at the same time, we have tried hard to make it useful.


-## Baseline models
+## Canned forecasters

-We provide a set of basic, easy-to-use forecasters that work out of the box.
-You should be able to do a reasonably limited amount of customization on them. Any serious customization happens with the framework discussed below.
+We provide a set of basic, easy-to-use forecasters that work out of the box:

-For the basic forecasters, we provide:
-
 * Flatline (basic) forecaster
 * Autoregressive forecaster
 * Autoregressive classifier
 * Smooth autoregressive(AR) forecaster

-All the forcasters we provide are built on our framework. So we will use these basic models to illustrate its flexibility.
+These forecasters encapsulate a series of operations (including data preprocessing, model fitting and etc.) all in instant one-liners.
+They are basically alternatives to each other. The main difference is the use of different models. Three forecasters use different regression models and the other one use a classification model.
+
+The operations within canned forecasters all follow our uniform **framework**.
+Although these one-liners allow a reasonably limited amount of customization, to uncover any serious customization you need more knowledge on our framework explained in @sec-framework.

-## Forecasting framework
+## Forecasting framework {#sec-framework}

 At its core, `{epipredict}` is a **framework** for creating custom forecasters.
 By that we mean that we view the process of creating custom forecasters as
@@ -47,8 +48,7 @@ Therefore, if you want something from this -verse, it should "just work" (we hop
 The reason for the overlap is that `{workflows}` _already implements_ the first
 three steps. And it does this very well. However, it is missing the
 postprocessing stage and currently has no plans for such an implementation.
-And this feature is important. The baseline forecaster we provide _requires_
-postprocessing. Anything more complicated (which is nearly everything)
+And this feature is important. All forecasters need post-processing. Anything more complicated (which is nearly everything)
 needs this as well.

 The second omission from `{tidymodels}` is support for panel data. Besides
@@ -64,14 +64,14 @@ into an `epi_df` as described in @sec-additional-keys.

 ## Why doesn't this package already exist?

-- Parts of it actually DO exist. There's a universe called `tidymodels`. It
+- Parts of it actually DO exist. There's a universe called `{tidymodels}`. It
 handles pre-processing, training, and prediction, bound together, through a
-package called workflows. We built `epipredict` on top of that setup. In this
+package called workflows. We built `{epipredict}` on top of that setup. In this
 way, you CAN use almost everything they provide.
 - However, workflows doesn't do post-processing to the extent envisioned here.
-And nothing in `tidymodels` handles panel data.
+And nothing in `{tidymodels}` handles panel data.
 - The tidy-team doesn't have plans to do either of these things. (We checked).
-- There are two packages that do time series built on `tidymodels`, but it's
+- There are two packages that do time series built on `{tidymodels}`, but it's
 "basic" time series: 1-step AR models, exponential smoothing, STL decomposition,
 etc.[^1]

@@ -101,7 +101,7 @@ out <- arx_forecaster(
 )
 ```

-This call produces a warning, which we'll ignore for now. But essentially, it's telling us that our data comes from May 2022 but we're trying to do a forecast for January 2022. The result is likely not an accurate measure of real-time forecast performance, because the data have been revised over time.
+This call produces a warning, which we'll ignore for now. But essentially, it's telling us that our data comes from May 2022 but we're trying to do a forecast for January 2022. The result is likely not an accurate measure of real-time forecast performance, because the data has been revised over time.

 ```{r}
 out
@@ -115,7 +115,7 @@ of what the predictions are for. It contains three main components:
 ```{r}
 str(out$metadata)
 ```
-2. The predictions in a tibble. The columns give the predictions for each location along with additional columns. By default, these are a 90% predictive interval, the `forecast_date` (the date on which the forecast was putatively made) and the `target_date` (the date for which the forecast is being made).
+2. The predictions in a tibble. The columns give the predictions for each location along with additional columns. By default, these are a 90% prediction interval, the `forecast_date` (the date on which the forecast was putatively made) and the `target_date` (the date for which the forecast is being made).
 ```{r}
 out$predictions
 ```
@@ -159,7 +159,7 @@ likely increase the variance of the model, and therefore, may lead to less
 accurate forecasts for the variable of interest.


-Another property of the basic model is the predictive interval. We describe this in more detail in a coming chapter, but it is easy to request multiple quantiles.
+Another property of the basic model is the prediction interval. We describe this in more detail in a coming chapter, but it is easy to request multiple quantiles.

 ```{r differential-levels}
 out_q <- arx_forecaster(jhu, "death_rate", c("case_rate", "death_rate"),

flatline-forecaster.qmd

Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Introducing the flatline forecaster

-The flatline forecaster is a very simple forecasting model intended for `epi_df` data, where the most recent observation is used as the forecast for any future date. In other words, the last observation is propagated forward. Hence, a flat line phenomenon is observed for the point predictions. The predictive intervals are produced from the quantiles of the residuals of such a forecast over all of the training data. By default, these intervals will be obtained separately for each combination of keys (`geo_value` and any additional keys) in the `epi_df`. Thus, the output is a data frame of point (and optionally interval) forecasts at a single unique horizon (`ahead`) for each unique combination of key variables. This forecaster is comparable to the baseline used by the [COVID Forecast Hub](https://covid19forecasthub.org).
+The flatline forecaster is a very simple forecasting model intended for `epi_df` data, where the most recent observation is used as the forecast for any future date. In other words, the last observation is propagated forward. Hence, a flat line phenomenon is observed for the point predictions. The prediction intervals are produced from the quantiles of the residuals of such a forecast over all of the training data. By default, these intervals will be obtained separately for each combination of keys (`geo_value` and any additional keys) in the `epi_df`. Thus, the output is a data frame of point (and optionally interval) forecasts at a single unique horizon (`ahead`) for each unique combination of key variables. This forecaster is comparable to the baseline used by the [COVID Forecast Hub](https://covid19forecasthub.org).

 ## Example of using the flatline forecaster

@@ -55,8 +55,8 @@ five_days_ahead <- flatline_forecaster(
 five_days_ahead
 ```

-We could also specify that we want a 80% predictive interval by changing the
-levels. The default 0.05 and 0.95 levels/quantiles give us 90% predictive
+We could also specify that we want a 80% prediction interval by changing the
+levels. The default 0.05 and 0.95 levels/quantiles give us 90% prediction
 interval.

 ```{r}
@@ -117,15 +117,15 @@ extract_frosting(five_days_ahead$epi_workflow)
 ```


-The post-processing operations in the order that were performed were to create the predictions and the predictive intervals, add the forecast and target dates and bound the predictions at zero.
+The post-processing operations in the order that were performed were to create the predictions and the prediction intervals, add the forecast and target dates and bound the predictions at zero.

 We can also easily examine the predictions themselves.

 ```{r}
 five_days_ahead$predictions
 ```

-The results above show a distributional forecast produced using data through the end of 2021 for the January 5, 2022. A prediction for the death rate per 100K inhabitants along with a 95% predictive interval is available for every state (`geo_value`).
+The results above show a distributional forecast produced using data through the end of 2021 for the January 5, 2022. A prediction for the death rate per 100K inhabitants along with a 95% prediction interval is available for every state (`geo_value`).

 The figure below displays the prediction and prediction interval for three sample states: Arizona, New York, and Florida.

forecast-framework.qmd

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ er <- epi_recipe(jhu) %>%
 ```

 While `{recipes}` provides a function `step_lag()`, it assumes that the data
-have no breaks in the sequence of `time_values`. This is a bit dangerous, so
+has no breaks in the sequence of `time_values`. This is a bit dangerous, so
 we avoid that behaviour. Our `lag/ahead` functions also appropriately adjust the
 amount of data to avoid accidentally dropping recent predictors from the test
 data.
