Commit adce1f9

doc: update report site template
1 parent 1422683 commit adce1f9

1 file changed

+156
-115
lines changed

reports/template.md

@@ -1,161 +1,202 @@
1-
# Forecast Reports
1+
<style>
2+
/* Some basic styling (a reasonable reading width and dark mode support) */
3+
body {
4+
max-width: 800px;
5+
margin: 2rem auto;
6+
padding: 0 1rem;
7+
font-family: sans-serif;
8+
background: white;
9+
color: black;
10+
}
11+
12+
a:link {
13+
color: blue;
14+
}
15+
16+
a:visited {
17+
color: purple;
18+
}
19+
20+
/* Dark mode support */
21+
@media (prefers-color-scheme: dark) {
22+
body {
23+
background: #121212;
24+
color: #e0e0e0;
25+
}
26+
27+
a:link {
28+
color: #80cbc4;
29+
}
30+
31+
a:visited {
32+
color: #b39ddb; /* light purple for contrast on dark bg */
33+
}
34+
}
35+
</style>
36+
37+
# Delphi Forecast Reports
238

339
[GitHub Repo](https://github.com/cmu-delphi/explorationt-tooling/)
440

5-
## Production Reports
41+
## Overview
642

43+
- The weekly fanplots were used by the team for visual inspections of the forecasts.
44+
- The season reports provide a general analysis of the season's data and forecast performance.
45+
- The backtesting reports provide a detailed analysis of a wide variety of forecasters' performance on the previous season's data.
46+
- A description of the forecaster families explored is provided at the bottom of the page.
747

8-
### Scoring this season
48+
## Weekly Fanplots 2024-2025 Season
949

1050

11-
## Exploration Reports
51+
## 2024-2025 Season Reports
1252

53+
- [Season Summary](season_summary_2025.html) (the notebooks below are linked from here)
54+
- [Covid's Problematic Initial Forecast](first_day_wrong.html)
55+
- [NHSN Revision Behavior](revision_summary_2025.html)
1356
- [An Analysis of Decreasing Behavior in Forecasters](decreasing_forecasters.html)
1457
- [NHSN 2024-2025 Data Analysis](new_data.html)
1558

16-
### Flu
59+
## Backtesting on 2023-2024 Season
60+
61+
- [Exploration Summary](exploration_summary_2024.html)
62+
- Flu
63+
- All forecasters population scale their data, use geo pooling, and train using quantreg.
64+
- These definitions are in the `flu_forecaster_config.R` file.
65+
- [Flu Overall](flu-overall-notebook.html)
66+
- [Flu AR](flu-notebook-scaled_pop_main.html)
67+
- [Flu AR with augmented data](flu-notebook-scaled_pop_data_augmented.html)
68+
- [Flu AR with exogenous features](flu-notebook-scaled_pop_exogenous.html)
69+
- [Flu AR with different seasonal schemes](flu-notebook-scaled_pop_season.html)
70+
- [Flu AR with augmented data and with different seasonal window sizes](flu-notebook-season_window_sizes.html)
71+
- [Flu AR with augmented data, exogenous features, and seasonal windowing](flu-notebook-scaled_pop_season_exogenous.html)
72+
- Simplistic/low data methods:
73+
- [Flu no recent](flu-notebook-no_recent_quant.html)
74+
- [Flu no recent](flu-notebook-no_recent_quant.html)
75+
- [Flu flatline](flu-notebook-flatline.html)
76+
- [Flu climate](flu-notebook-climate_linear.html)
77+
- Covid
78+
- All forecasters population scale their data, use geo pooling, and train using quantreg.
79+
- These definitions are in the `covid_forecaster_config.R` file.
80+
- [Covid AR](covid-notebook-scaled_pop_main.html)
81+
- [Covid AR with seasonal features](covid-notebook-scaled_pop_season.html)
82+
- [Covid AR with exogenous features](covid-notebook-scaled_pop_exogenous.html)
83+
- [Covid Flatline](covid-notebook-flatline_forecaster.html)
84+
- Simplistic/low data methods:
85+
- [Covid no recent](covid-notebook-no_recent_quant.html)
86+
- [Covid flatline](covid-notebook-flatline.html)
87+
- [Covid climate](covid-notebook-climate_linear.html)
88+
89+
## Description of Forecaster Families
90+
91+
The main forecaster families were:
92+
93+
- Autoregressive models (AR)
94+
- with seasonal features
95+
- with exogenous features
96+
- with augmented data
97+
- Climatological
98+
- Linear trend
99+
- No recent outcome
100+
- Flatline
101+
102+
All the AR models had the option of population scaling, seasonal features, exogenous features, and augmented data.
103+
We tried all possible combinations of these features.
104+
All models had the option of using the `linreg`, `quantreg`, or `grf` engine.
105+
We found that `quantreg` gave better results than `linreg` and we had computational issues with `grf`, so we used `quantreg` the rest of the time.
106+
107+
### Autoregressive models (AR)
17108

18-
All forecasters population scale their data, use geo pooling, and train using quantreg.
19-
These definitions are in the `flu_forecaster_config.R` file.
20-
21-
- [Flu Overall](flu-overall-notebook.html)
22-
- [Flu AR](flu-notebook-scaled_pop_main.html)
23-
- [Flu AR with augmented data](flu-notebook-scaled_pop_data_augmented.html)
24-
- [Flu AR with exogenous features](flu-notebook-scaled_pop_exogenous.html)
25-
- [Flu AR with different seasonal schemes](flu-notebook-scaled_pop_season.html)
26-
- [Flu AR with augmented data and with different seasonal window sizes](flu-notebook-season_window_sizes.html)
27-
- [Flu AR with augmented data, exogenous features, and seasonal windowing](flu-notebook-scaled_pop_season_exogenous.html)
28-
29-
Simplistic/low data methods:
30-
31-
- [Flu no recent](flu-notebook-no_recent_quant.html)
32-
- [Flu flatline](flu-notebook-flatline.html)
33-
- [Flu climate](flu-notebook-climate_linear.html)
34-
35-
### Covid
36-
37-
All forecasters population scale their data, use geo pooling, and train using quantreg.
38-
These definitions are in the `covid_forecaster_config.R` file.
39-
40-
- [Covid AR](covid-notebook-scaled_pop_main.html)
41-
- [Covid AR with seasonal features](covid-notebook-scaled_pop_season.html)
42-
- [Covid AR with exogenous features](covid-notebook-scaled_pop_exogenous.html)
43-
- [Covid Flatline](covid-notebook-flatline_forecaster.html)
44-
45-
Simplistic/low data methods:
46-
47-
- [Covid no recent](covid-notebook-no_recent_quant.html)
48-
- [Covid flatline](covid-notebook-flatline.html)
49-
- [Covid climate](covid-notebook-climate_linear.html)
50-
51-
## Descriptions of Forecaster Families
52-
53-
### Training Data Information
54-
55-
(Taken from [David's Org File](https://github.com/cmu-delphi/exploration-tooling/blob/5a6da8d0d0202da6d79a5ee8e702d4654364ce46/forecasters_description.org#flusion).)
56-
57-
Some use just NHSN, while others use historical data from ILI+ and Flusurv+ as
58-
additional rows in training. ILI+ and Flusurv+ have been adjusted so that the
59-
total for the season matches NHSN’s total. Flusurv is taken from epidata, but
60-
ILI+ was constructed by Evan and given to Richard. The testing date range is
61-
roughly the 2023 season, so October 2023 through late April 2024.
62-
63-
### Flu exogenous features
109+
Internal name: `scaled_pop`.
64110

65-
- NSSP
66-
Note that this data set is possibly cheating, as we don't have revisions before April of this year, so it is using the latest data.
67-
If we narrow down to `time_value`s after that, the revision behavior is
111+
A simple autoregressive model, which predicts using
68112

69-
```
70-
Min lag (time to first version):
71-
min median mean max
72-
7 days 7 days 7.7 days 14 days
73-
Fraction of epi_key+time_values with
74-
No revisions:
75-
• 362 out of 954 (37.95%)
76-
Quick revisions (last revision within 3 days of the `time_value`):
77-
• 0 out of 954 (0%)
78-
Few revisions (At most 3 revisions for that `time_value`):
79-
• 946 out of 954 (99.16%)
113+
$$x_{t+k} = ar(x)$$
80114

81-
Fraction of revised epi_key+time_values which have:
82-
Less than 0.1 spread in relative value:
83-
• 329 out of 592 (55.57%)
84-
Spread of more than 0.1015 in actual value (when revised):
85-
• 18 out of 592 (3.04%)
86-
days until within 20% of the latest value:
87-
min median mean max
88-
7 days 7 days 9 days 70 days
89-
```
115+
where $x$ is the target variable and $ar(x)$ is a linear combination of the target variable's past values, which can be scaled according to each state's population or whitened according to another scheme (or both). In practice, lags (0, 7) were quite effective, with (0, 7, 14) and (0, 7, 14, 21) providing no discernible advantage, so we focused on those and our model was
90116

91-
So most days have some revisioning, but with fairly small total changes, with the vast majority of days being within 20% of their eventual value within a week (with some much longer exceptions, apparently).
92-
So the impact of the cheating is likely small but of course hard to know.
117+
$$x_{t+k} = a_0 x_t + a_7 x_{t-7}$$
93118

94-
- Google-Symptoms
95-
This dataset doesn't have revisions, but has a history of suddenly disappearing.
96-
The latest value was used to simulate actually having the data; at worst, it breaks down to being the underlying forecaster.
97-
- NWSS and NWSS_regional
98-
The originating dataset has minimal revisions, but as this is a dataset with quite a lot of processing from the underlying that involves some amount of time travel, it is unclear how much revision behavior it effectively has.
119+
where $k \in \{0, 7, 14, 21, 28\}$ is the forecast horizon.
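
As a minimal sketch of the lag (0, 7) model described above, the following fits the two lag coefficients for one horizon on a single series. It is illustrative only: it uses ordinary least squares with no intercept as a stand-in for the `quantreg` engine, and it skips geo pooling and population scaling. The function names `fit_ar_lags` and `predict_ar` are hypothetical and do not come from the repository.

```python
def fit_ar_lags(series, horizon, lags=(0, 7)):
    """Fit x[t + horizon] ~ b0 * x[t - lags[0]] + b1 * x[t - lags[1]].

    Assumption: plain least squares (no intercept) stands in for the
    quantile regression engine used in the reports.
    """
    lag_a, lag_b = lags
    max_lag = max(lags)
    rows = [
        (series[t - lag_a], series[t - lag_b], series[t + horizon])
        for t in range(max_lag, len(series) - horizon)
    ]
    # Normal equations for two predictors (a 2x2 linear system).
    saa = sum(a * a for a, _, _ in rows)
    sab = sum(a * b for a, b, _ in rows)
    sbb = sum(b * b for _, b, _ in rows)
    say = sum(a * y for a, _, y in rows)
    sby = sum(b * y for _, b, y in rows)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("lagged predictors are collinear; cannot fit")
    b0 = (say * sbb - sby * sab) / det
    b1 = (saa * sby - sab * say) / det
    return b0, b1


def predict_ar(series, coefs, lags=(0, 7)):
    """Forecast from the end of `series` using fitted coefficients."""
    b0, b1 = coefs
    t = len(series) - 1
    return b0 * series[t - lags[0]] + b1 * series[t - lags[1]]
```

One set of coefficients is fit per horizon $k$, so a full forecast would call `fit_ar_lags` once for each horizon of interest.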
99120

100-
### Data Whitening
121+
### Autoregressive models with seasonal features
101122

102-
The data augmented models using ILI+ and FluSurv+ take a few different approaches to data whitening, depending on the `scale_method, center_method, nonlin_method` parameters.
123+
Internal name: `scaled_pop_seasonal`.
103124

104-
TODO: Add descriptions.
125+
We tried a few different ways of incorporating seasonal features:
105126

106-
This is more closely in line with the [RobustScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html#sklearn.preprocessing.RobustScaler) from scikit-learn (using a much wider quantile than the default settings there).
127+
- The approach that performed the best was using a *seasonal training window* that grabbed a window of data (about 4 weeks before and ahead) around the forecast epiweek from the current and previous seasons.
128+
- Two *indicator variables* that roughly correspond to before, during, and after the typical peak (roughly, `before = season_week < 16`, `during = 16 <= season_week <= 20`, and `after = season_week > 20`).
129+
- Taking the first two *principal components* of the full whitened augmented data reshaped as `(epiweek, state_source_season_value)`.
130+
(We found that this was not particularly effective, so we did not use it.
131+
Despite spending a week debugging this, we could not rule out the possibility that it was a bug.
132+
However, we also had mixed results from tests of this feature in very simple synthetic data cases.)
133+
- We also tried using the *climatological median* of the target variable as a feature (see below for definition of "climatological").
134+
- Note that unusually, the last two features are actually led rather than lagged, since we should be predicting using the target's coefficient, rather than the present one.
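
The seasonal training window and the peak indicators from the list above can be sketched roughly as follows. This is a hedged illustration, not the repository's code: it assumes each training row is a dict with a `season_week` key, ignores wrap-around at season boundaries, and treats the peak weeks as the baseline case. The helper names are hypothetical.

```python
def peak_indicators(season_week):
    """Return (before, after) indicators around the typical peak.

    The peak weeks (16 through 20) are the baseline: both indicators are 0.
    """
    before = 1 if season_week < 16 else 0
    after = 1 if season_week > 20 else 0
    return before, after


def seasonal_training_rows(rows, forecast_season_week, half_width=4):
    """Keep rows within `half_width` weeks of the forecast week, in any season.

    Assumption: `rows` is a list of dicts with a "season_week" key, and
    weeks do not wrap around the season boundary.
    """
    return [
        row
        for row in rows
        if abs(row["season_week"] - forecast_season_week) <= half_width
    ]
```

Because the filter looks only at `season_week`, rows from the current and previous seasons are kept together, matching the "current and previous seasons" wording above.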
107135

108-
## Overall comparison
136+
### Autoregressive models with exogenous features
109137

110-
This takes the best mean WIS result from each of the forecaster families below, and puts them in the same notebook for inter-family comparison.
138+
Internal name: `scaled_pop_seasonal`.
111139

112-
## Forecaster Families
140+
These models could opt into the same seasonal features as the `scaled_pop_seasonal` forecaster, but also included exogenous features.
113141

114-
### AR with population scaling
142+
#### Flu exogenous features
115143

116-
Internal name: `scaled_pop`.
144+
- NSSP - we don't have revisions before Spring 2024 for this data, so we used a revision analysis from the data collected after that date to estimate the lag (roughly 7 days) and used that lag to simulate delays.
145+
- Google-Symptoms - this dataset doesn't have revisions, but has a history of suddenly disappearing, resulting in intermittent long update lags.
146+
We did not simulate a lag and just used the latest value for a best case scenario.
147+
The symptom set used was s01, s03, and s04 from [here](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/google-symptoms.html).
148+
- NWSS - the originating dataset has minimal revisions, but it involves quite a lot of processing of the underlying data, including some amount of time travel, so it is unclear how much revision behavior is present.
149+
- NWSS_regional - same as NWSS, just aggregated to the HHS region level.
117150

118-
A simple model, which predicts using
151+
#### Covid exogenous features
119152

120-
$$x_{t+k} = ar(x)$$
153+
- NSSP - same as flu.
154+
- Google-Symptoms - same as flu, though we used a slightly different symptom set (just s04 and s05 from [here](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/google-symptoms.html)).
121155

122-
where $x$ is scaled according to each state’s population.
156+
### Autoregressive models with augmented data
123157

124-
Three versions, two with different engines `quantreg` and `grf`, and the final one with augmented data.
158+
Internal name: `scaled_pop` (with `filter_source = ""`).
125159

126-
### AR with population scaling and seasonal features
160+
This forecaster is still the standard autoregressive model, but with additional training data.
161+
Inspired by UMass-flusion, the additional training data consisted of historical data from ILI+ and Flusurv+, which was brought to a comparable level with NHSN and treated as additional observations of the target variable (hence the name "augmented data").
162+
Flusurv was taken from epidata, but ILI+ was constructed by Evan Ray and given to Richard (Berkeley Summer 2024 intern).
163+
Naturally, this forecaster was only used for flu, as the same data was not available for covid.
127164

128-
Internal name: `scaled_pop_seasonal`.
165+
#### Scaling Parameters (Data Whitening)
129166

130-
There are 2 seasonal features that we're trying here:
167+
The augmented data forecasters took a few different approaches to data whitening (akin to [RobustScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html#sklearn.preprocessing.RobustScaler) from scikit-learn).
131168

132-
1. taking the first 3 PC components from the whitened fused data (so nhsn, ILI+, and Flusurv). (Note that it's 2 for covid).
133-
2. 2 indicators that roughly correspond to before, during and after the typical peak (first is true when `season_week < 16`, the second is true when `season_week > 20`, and the peak is captured by the overall constant).
134-
Note that unusually, these features are actually led rather than lagged, since we should be predicting using the target's coefficient, rather than the present one.
169+
- `scale_method`
170+
- `quantile` - scales the data so that the difference between the 5th and 95th quantiles is 1
171+
- `quantile_upper` - scales the data so that the 95th quantile is 1 (this was used by UMass-flusion)
172+
- `std` - scales the data so that one standard deviation is 1
173+
- `none` - no scaling
174+
- We did not see a significant difference in changing the above parameter, so we used the default `quantile` the rest of the time.
175+
- `center_method`
176+
- `median` - centers the data so that the median is 0
177+
- `mean` - centers the data so that the mean is 0
178+
- `none` - no centering
179+
- We did not see a significant difference in changing the above parameter, so we used the default `median` the rest of the time.
180+
- `nonlin_method`
181+
- `quart_root` - takes the 4th root of the data (and adds 0.01 to avoid negative values)
182+
- `none` - no non-linear transformation
183+
- Of these, `quart_root` gave us the best results, so we used that the rest of the time. There were occasional issues with the epsilon offset causing a positive value to become the floor as the inversion was taken.
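
A rough sketch of the default combination described above (`quart_root`, then `median` centering, then `quantile` scaling) is given below. The order of the three steps and the function names are assumptions made for illustration; the repository may apply them differently.

```python
import statistics

EPSILON = 0.01  # offset added before the 4th root, as described above


def fit_whitener(values):
    """Return (center, scale) computed on the 4th-root-transformed data."""
    transformed = [(v + EPSILON) ** 0.25 for v in values]
    # n=20 gives cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(transformed, n=20, method="inclusive")
    scale = cuts[-1] - cuts[0]  # 95th minus 5th quantile
    if scale == 0:
        raise ValueError("data has no spread between the 5th and 95th quantiles")
    center = statistics.median(transformed)
    return center, scale


def whiten(values, center, scale):
    """Apply the transform: 4th root, center on the median, scale."""
    return [((v + EPSILON) ** 0.25 - center) / scale for v in values]


def unwhiten(whitened, center, scale):
    """Invert `whiten` to return to the original units."""
    return [(z * scale + center) ** 4 - EPSILON for z in whitened]
```

After `whiten`, the median of the fitted data is 0 and the gap between its 5th and 95th quantiles is 1, which is the sense in which this resembles RobustScaler with a wider quantile range.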
135184

136-
### Flusion-like
185+
### Climatological
137186

138-
Roughly designed in line with the flusion model.
187+
This was our term for a forecaster that directly forecast a distribution built from similar weeks from previous seasons (in analogy with baseline weather forecasting).
188+
We found that in some cases it made a reasonable baseline, though when the current season's peak time was significantly different from the seasons in the training data, it was not particularly effective.
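
To make the idea concrete, here is a minimal sketch of a climatological forecast: pool the values observed near the target week in earlier seasons and read quantiles off that pool. The window width, the quantile levels, the `(season, season_week, value)` tuple layout, and the function name are all assumptions for illustration.

```python
import statistics


def climatological_quantiles(history, target_week, window=2,
                             levels=(0.25, 0.5, 0.75)):
    """Quantiles of past values within `window` weeks of `target_week`.

    `history` is a list of (season, season_week, value) tuples from
    previous seasons (a hypothetical layout).
    """
    pool = sorted(
        value
        for _, week, value in history
        if abs(week - target_week) <= window
    )
    if len(pool) < 2:
        raise ValueError("not enough historical values near the target week")
    # n=100 gives 99 cut points; percentile p sits at index p - 1.
    cuts = statistics.quantiles(pool, n=100, method="inclusive")
    return {q: cuts[round(q * 100) - 1] for q in levels}
```

Since the pool depends only on the week of the season, the forecast cannot shift when the current season peaks earlier or later than the training seasons, which is consistent with the weakness noted above.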
139189

140190
### No Recent Outcome
141191

142-
This is the fall-back forecaster, in case we have no data, but are forced to make a prediction.
192+
This was a fall-back forecaster built for the scenario where NHSN data was not going to be reported in time for the start of the forecasting challenge.
143193

144194
A flusion-adjacent model pared down to handle the case of not having the target as a predictor.
145195

146-
$$\bar{x}_{t+k} = f(t_{season}) + p + d + \big\langle y_{t-k}\big\rangle_{k=0:1} + \big\langle y_{t-k}\big\rangle_{t=0:3}$$
147-
148-
where $y$ here is any exogenous variables; initially this will be empty, as nssp is missing some states, so we will have to rewrite these models to handle missing geos (most likely by having a separate model for the case when an exogenous variable is missing).
196+
$$\bar{x}_{t+k} = \big\langle y_{t-k}\big\rangle_{k=0:1} + \big\langle y_{t-k}\big\rangle_{t=0:3}$$
149197

150-
$f$ is either the identity or 2 sine terms, defined so that the first has half a period during the season, and is zero after it, while the second is one period over the season, with zero after
198+
where $y$ here is any set of exogenous variables.
151199

152200
### Flatline
153201

154-
This is what the FluSight-baseline is based on, so they should be identical. However, at the moment, this has scaling issues.
155-
156-
# Covid Forecasts 2024-2025
157-
158-
For now, just AR forecasters with source-pooled data. Forecaster descriptions
159-
are the same as above.
160-
161-
TODO: Get lagged correlations notebook hosted.
202+
A simple "LOCF" (last observation carried forward) forecaster that forecasts the last observed value and uses residuals to create a distributional forecast. This is what the FluSight-baseline is based on, so they should be identical.
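
A minimal sketch of such a flatline forecast follows. It assumes the residuals are the historical `horizon`-step changes of a single series and that they are symmetrized so the median stays at the last observed value; both choices, and the function name, are assumptions for illustration rather than a description of the FluSight-baseline code.

```python
import statistics


def flatline_forecast(series, horizon, levels=(0.25, 0.5, 0.75)):
    """Last observation carried forward, with quantiles from past changes."""
    last = series[-1]
    # Errors a flatline forecast would have made `horizon` steps ahead.
    residuals = [
        series[t + horizon] - series[t]
        for t in range(len(series) - horizon)
    ]
    if not residuals:
        raise ValueError("series is too short for this horizon")
    # Assumption: symmetrize so the forecast median equals the last value.
    symmetric = sorted(residuals + [-r for r in residuals])
    cuts = statistics.quantiles(symmetric, n=100, method="inclusive")
    return {q: last + cuts[round(q * 100) - 1] for q in levels}
```

For example, `flatline_forecast([10, 12, 11, 15, 14, 18, 17], horizon=1)` centers its quantiles on 17, the last observed value.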
