Changes to `epipredict.qmd` (16 additions & 16 deletions):

At a high level, our goal with `{epipredict}` is to make running simple machine …

Serving both populations is the main motivation for our efforts, but at the same time, we have tried hard to make it useful.

## Canned forecasters

We provide a set of basic, easy-to-use forecasters that work out of the box:

* Flatline (basic) forecaster
* Autoregressive forecaster
* Autoregressive classifier
* Smooth autoregressive (AR) forecaster

These forecasters encapsulate a series of operations (data preprocessing, model fitting, postprocessing, and so on) in single one-liner calls. They are alternatives to one another; the main difference is the model each fits: three use different regression models, while the fourth uses a classification model.

The operations within the canned forecasters all follow our uniform **framework**, so we will also use these basic models to illustrate its flexibility. Although these one-liners allow a limited amount of customization, any serious customization requires the framework explained in @sec-framework.
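To make the idea concrete, here is a minimal sketch of what an autoregressive forecaster does, written in base R. This is illustrative only, not the `{epipredict}` implementation: the simulated series, the horizon, and the lag choices are invented for the example.

```r
# Sketch of an autoregressive forecaster: regress the outcome on its own
# lagged values, then predict `ahead` steps past the end of the series.
set.seed(1)
y <- as.numeric(arima.sim(list(ar = 0.7), n = 120))  # toy series

ahead <- 7            # forecast horizon (assumed for illustration)
lags  <- c(0, 7, 14)  # lagged predictors (assumed for illustration)
n     <- length(y)

# Assemble a training frame: the target is y shifted `ahead` steps forward
idx <- (max(lags) + 1):(n - ahead)
train <- data.frame(target = y[idx + ahead])
for (l in lags) train[[paste0("lag", l)]] <- y[idx - l]

fit <- lm(target ~ ., data = train)

# Forecast: plug in the most recent lagged values
newest <- as.data.frame(as.list(setNames(y[n - lags], paste0("lag", lags))))
predict(fit, newdata = newest)
```

The canned forecasters wrap this kind of pipeline (plus preprocessing and postprocessing) behind a single function call.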

## Forecasting framework {#sec-framework}

At its core, `{epipredict}` is a **framework** for creating custom forecasters. By that we mean that we view the process of creating custom forecasters as …

Therefore, if you want something from this -verse, it should "just work" (we hope).

The reason for the overlap is that `{workflows}` _already implements_ the first three steps. And it does this very well. However, it is missing the postprocessing stage and currently has no plans for such an implementation. And this feature is important: all forecasters need postprocessing, and anything more complicated (which is nearly everything) needs it as well.

The second omission from `{tidymodels}` is support for panel data. Besides …

… into an `epi_df` as described in @sec-additional-keys.
## Why doesn't this package already exist?
- Parts of it actually DO exist. There's a universe called `{tidymodels}`. It handles pre-processing, training, and prediction, bound together, through a package called `{workflows}`. We built `{epipredict}` on top of that setup. In this way, you CAN use almost everything they provide.
- However, `{workflows}` doesn't do post-processing to the extent envisioned here. And nothing in `{tidymodels}` handles panel data.
- The tidy-team doesn't have plans to do either of these things. (We checked.)
- There are two packages that do time series built on `{tidymodels}`, but it's "basic" time series: 1-step AR models, exponential smoothing, STL decomposition, etc.[^1]
…

```{r}
out <- arx_forecaster(
  …
)
```
This call produces a warning, which we'll ignore for now. But essentially, it's telling us that our data comes from May 2022 but we're trying to do a forecast for January 2022. The result is likely not an accurate measure of real-time forecast performance, because the data has been revised over time.

```{r}
out
```
… of what the predictions are for. It contains three main components:

1. The metadata:

```{r}
str(out$metadata)
```
2. The predictions in a tibble. The columns give the predictions for each location along with additional columns. By default, these are a 90% prediction interval, the `forecast_date` (the date on which the forecast was putatively made), and the `target_date` (the date for which the forecast is being made).

```{r}
out$predictions
```
… likely increase the variance of the model and, therefore, may lead to less accurate forecasts for the variable of interest.

Another property of the basic model is the prediction interval. We describe this in more detail in a coming chapter, but it is easy to request multiple quantiles.
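As a rough sketch of the underlying idea (base R only, with made-up residuals and point forecast; not the `{epipredict}` interface, which is covered later): the requested quantile levels of the forecast residuals, shifted by the point forecast, become the interval endpoints.

```r
# Sketch: turning a set of requested quantile levels into prediction
# intervals around a point forecast. All values here are stand-ins.
set.seed(2)
residuals <- rnorm(500)  # stand-in for forecast residuals
point     <- 12.3        # stand-in point forecast

quantile_levels <- c(0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95)
q <- point + quantile(residuals, probs = quantile_levels)

# The 5% and 95% quantiles bound a 90% prediction interval
q[c("5%", "95%")]
```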

Changes to `flatline-forecaster.qmd` (5 additions & 5 deletions):
# Introducing the flatline forecaster
The flatline forecaster is a very simple forecasting model intended for `epi_df` data, where the most recent observation is used as the forecast for any future date. In other words, the last observation is propagated forward, so the point predictions trace out a flat line. The prediction intervals are produced from the quantiles of the residuals of such a forecast over all of the training data. By default, these intervals are obtained separately for each combination of keys (`geo_value` and any additional keys) in the `epi_df`. Thus, the output is a data frame of point (and optionally interval) forecasts at a single unique horizon (`ahead`) for each unique combination of key variables. This forecaster is comparable to the baseline used by the [COVID Forecast Hub](https://covid19forecasthub.org).
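The mechanics can be sketched in a few lines of base R, for a single key and a one-step horizon. This is illustrative only, on a simulated series; the `{epipredict}` implementation additionally handles multiple keys, arbitrary horizons, and the surrounding pre- and post-processing.

```r
# Sketch of the flatline idea: propagate the last observation forward and
# build an interval from the quantiles of the one-step "flatline" residuals.
set.seed(3)
y <- cumsum(rnorm(100))         # toy series for one geo_value

point_forecast <- y[length(y)]  # last observation carried forward

# Residuals of the flatline forecast over the training data:
# y[t] - y[t-1] is the error of "carry the last value forward"
resid <- diff(y)

# 90% prediction interval from the residual quantiles
interval <- point_forecast + quantile(resid, probs = c(0.05, 0.95))
c(lower = interval[[1]], point = point_forecast, upper = interval[[2]])
```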
The post-processing operations, in the order they were performed, were to create the predictions and the prediction intervals, add the forecast and target dates, and bound the predictions at zero.

We can also easily examine the predictions themselves.

```{r}
five_days_ahead$predictions
```
The results above show a distributional forecast produced using data through the end of 2021 for January 5, 2022. A prediction for the death rate per 100K inhabitants, along with a 95% prediction interval, is available for every state (`geo_value`).

The figure below displays the prediction and prediction interval for three sample states: Arizona, New York, and Florida.