Status: scaffold (v0.1.0-dev0). Pre-registration is locked at OSF; data ingestion and analysis pipelines are under construction per the project dev plan (lab-internal). The Zenodo DOI badge above will be minted at the first tagged release.
Does on-pitch player aggression rise with the day-of-match temperature anomaly at the stadium, and does the same heat signal extend from the pitch to the supporters? ThermoFooty pre-registers and executes a natural-experiment test of the heat-aggression hypothesis on European soccer.
The design exploits the fact that fixtures are scheduled before weather is realised, eliminating the Field-1992 outdoor-opportunity confound that limits modern crime-data designs. The same scheduled-fixture identification underlies every analysis in the project, from the Big-5 European league panel (H1: ~150 000+ matches) to the tournament panel (H6/H6b on Qatar 2022 Stadium 974 as a within-tournament natural control on cooled vs naturally ventilated venues).
The full pre-registered design — primary confirmatory test plus 17 auxiliary hypotheses across three independently BH-FDR-corrected batteries — is locked at OSF (10.17605/OSF.IO/YZVAK) with an AsPredicted one-pager cross-post for the H1 confirmatory test (aspredicted.org/av2un9.pdf).
ThermoFooty is one chapter of a three-track cross-species programme:
- ThermoKourt — Drosophila track. Behavioural-arena heat-aggression assays under controlled thermal manipulation.
- ThermoStrife (Zenodo DOI 10.5281/zenodo.20371612) — human-data track, historical-uprisings analysis. 112-event case-crossover panel 1750–2024 with four-tier weather backfill; headline OR = 1.089 per +1 °C above local same-month baseline.
- ThermoFooty (this repo) — human-data track, soccer panel. Pre-registered natural-experiment test on scheduled fixtures 1970–2026, addressing the small-n + selection-bias critiques of the ThermoStrife historical panel.
The three tracks publish separately but share the conceptual hypothesis and (where appropriate) code: the four-tier weather cascade ThermoFooty uses is vendored verbatim from ThermoStrife v0.1.1, and every statistical estimator routes through reRandomStats v0.2.0+ (case_crossover, model_comparison, dose_response).
| Battery | Hypothesis | Quick description |
|---|---|---|
| PRIMARY | H1 | Per-match red-card-for-violent-conduct odds rise with stadium-day Tmax anomaly. Time-stratified case-crossover conditional logit on Big-5 1970–2026. Single confirmatory test, uncorrected α = 0.05, one-sided. |
| LEAGUE auxiliary (7 tests, BH FDR q = 0.05) | H2 | Crowd-violence arrests (pooled UK Home Office + ZIS-Jahresberichte) rise with the same anomaly exposure. |
| | H3 | Heat coefficient attenuated in closed-roof / cooled stadia. |
| | H4 / H4b | Heat × stakes interaction on player cards / crowd arrests. |
| | H5 | Within-player FE: same player carded more in hot matches. |
| | H0_spec | Aggression-set cards rise faster than non-card fouls (mechanism specificity). |
| | H_league_het | LRT for cross-league slope heterogeneity. |
| DOSE-RESPONSE (4 tests, BH FDR q = 0.05) | H_break_pop / H_break_player | Segmented regression + Davies test + 4PL Hill rescue; population and per-player breakpoints. |
| | H_mobility_transfer / H_mobility_dual | Player-transfer natural experiment on absolute-vs-anomaly exposure. |
| TOURNAMENT (6 tests, BH FDR q = 0.05) | H6 | Cooled-stadia attenuation in pooled tournament panel. |
| | H6b | Qatar 2022 Stadium 974 (naturally ventilated, n=7) vs the seven cooled venues (n=57). |
| | H7 / H7c | Hot-vs-cool host World Cups (Qatar excluded; Qatar as own descriptive category). |
| | H8 / H_omnibus | Tournament-family / tournament-edition heterogeneity LRTs. |
Full specifications are locked in the OSF pre-registration (10.17605/OSF.IO/YZVAK); the lab's internal source draft is mirrored there verbatim.
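For readers unfamiliar with the correction named in the table, here is a minimal sketch of the Benjamini-Hochberg step-up procedure applied to one battery at a time. It is illustrative only: the function name `benjamini_hochberg` is hypothetical, the p-values in the demo are placeholders rather than project results, and the project itself routes its estimators through reRandomStats, whose API is not reproduced here.

```python
from typing import Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> list[bool]:
    """Return one reject/keep flag per p-value, in the input order.

    Step-up rule: find the largest rank k (1-based, ascending p) with
    p_(k) <= (k / m) * q, then reject every hypothesis ranked <= k.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Placeholder p-values (NOT results). Each battery is corrected on its
    # own, so the number of tests m is 7, 4, or 6, never the pooled 17.
    placeholder_battery = [0.02, 0.03, 0.035, 0.9]
    print(benjamini_hochberg(placeholder_battery, q=0.05))
```

Correcting each battery independently means a weak result in one battery does not tighten the threshold in another; the primary H1 test sits outside all three and is uncorrected.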
ThermoFooty/
├── pyproject.toml
├── CITATION.cff
├── environment.yml
├── LICENSE ← MIT
├── README.md
├── data → $THERMOFOOTY_DATA_ROOT ← symlink (gitignored)
├── db/
│ ├── schema.sql ← committed canonical DDL
│ └── migrations/ ← alembic-lite NNNN_<slug>.sql
├── thermofooty/ ← Python package
│ ├── __init__.py
│ ├── constants.py ← Wong palette, paths, type aliases
│ ├── config.py ← THERMOFOOTY_DATA_ROOT env var
│ ├── db/ ← SQLite session, schema-version check
│ ├── sources/ ← football_data_uk, fbref, home_office, zis
│ ├── weather/ ← vendored cascade from ThermoStrife v0.1.1
│ ├── lookup.py ← (stadium, date) → AnomalyFetch
│ ├── panel.py ← analysis_panel materialiser
│ ├── inference.py ← thin wrapper around reRandomStats
│ └── viz.py ← Wong-palette figures
├── scripts/ ← ingestion + analysis CLI scripts
├── tests/
├── docs/ ← Sphinx docs
└── .github/workflows/ ← tests + docs + release + network-tests
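The tree above mentions numbered `NNNN_<slug>.sql` migrations and a schema-version check. One plausible way such a scheme can work is sketched below using only the standard library. This is an assumption, not the repository's implementation: the function names are hypothetical, and storing the version in SQLite's `PRAGMA user_version` is a choice made for the example.

```python
import re
import sqlite3
from pathlib import Path

# Matches files such as 0001_init.sql; the allowed slug characters are an assumption.
MIGRATION_RE = re.compile(r"^(\d{4})_[A-Za-z0-9_]+\.sql$")


def pending_migrations(migrations_dir: Path, current_version: int) -> list[tuple[int, Path]]:
    """List (number, path) pairs above current_version, in ascending order."""
    found = []
    for path in migrations_dir.iterdir():
        match = MIGRATION_RE.match(path.name)
        if match and int(match.group(1)) > current_version:
            found.append((int(match.group(1)), path))
    return sorted(found)


def apply_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> int:
    """Apply pending migrations in order and return the resulting version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for number, path in pending_migrations(migrations_dir, version):
        conn.executescript(path.read_text(encoding="utf-8"))
        # PRAGMA does not accept bound parameters; number is a validated int.
        conn.execute(f"PRAGMA user_version = {number:d}")
        conn.commit()
        version = number
    return version
```

Calling `apply_migrations` twice on the same database is a no-op the second time, because every file at or below the stored version is skipped.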
All data lives off-repo under $THERMOFOOTY_DATA_ROOT/, exposed via
the gitignored data/ symlink. Set the env var to wherever your fast
storage lives (an external NVMe, a network mount, the HPC scratch
directory, …).
Each entry is tagged with the data family it belongs to: [db] canonical store · [match] football events + lineups · [crowd] crowd-violence reports · [stadium] geometry + metadata · [weather] temperature sources · [derived] materialised analysis tables · [ops] logs & housekeeping.
$THERMOFOOTY_DATA_ROOT/
├── db/
│ └── thermofooty.sqlite ← [db] canonical SQLite (built from db/schema.sql)
├── raw/
│ ├── football_data_uk/ ← [match] season-per-CSV downloads
│ ├── fbref_html/ ← [match] scraped match-report HTML cache
│ ├── home_office_pdfs/ ← [crowd] UK arrests bulletins
│ ├── zis_jahresberichte/ ← [crowd] Bundespolizei annual reports
│ ├── stadia/ ← [stadium] coordinate CSVs, lineup overrides
│ └── observatories/hadcet/ ← [weather] HadCET daily totals files
├── cache/
│ ├── meteostat/ ← [weather] parquet per (station, year-month)
│ ├── era5/ ← [weather] parquet per (cell, year-month)
│ ├── twentycr/ ← [weather] parquet per (cell, year)
│ └── fbref_parsed/ ← [match] parsed JSON per match (dedupe key)
├── derived/
│ └── analysis_panel.parquet ← [derived] materialised join per ingestion pass
└── logs/ ← [ops] ingestion + analysis logs
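If you want the directory skeleton in place before the first ingestion run, a short script along these lines would create it. This is a convenience sketch, not part of the package: `scaffold_data_root` is a hypothetical name, and the ingestion scripts may well create these directories themselves.

```python
from pathlib import Path

# Directory names copied from the layout above; files are not created.
DATA_SUBDIRS = (
    "db",
    "raw/football_data_uk",
    "raw/fbref_html",
    "raw/home_office_pdfs",
    "raw/zis_jahresberichte",
    "raw/stadia",
    "raw/observatories/hadcet",
    "cache/meteostat",
    "cache/era5",
    "cache/twentycr",
    "cache/fbref_parsed",
    "derived",
    "logs",
)


def scaffold_data_root(data_root: Path) -> list[Path]:
    """Create every expected subdirectory; existing ones are left untouched."""
    created = []
    for relative in DATA_SUBDIRS:
        target = data_root / relative
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Because `mkdir` is called with `exist_ok=True`, the function is safe to re-run against a data root that is already populated.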
Two supported routes — pick one. The conda route is the reproducible
default; the pip route is lighter if you already have a Python ≥ 3.11
on $PATH.
environment.yml pins Python 3.11 + every dependency the cascade,
ingestion, and inference layers need, plus rerandomstats from the
locked v0.2.0 tag. One command bootstraps the whole stack:
git clone https://github.com/zerotonin/ThermoFooty.git
cd ThermoFooty
conda env create -f environment.yml # creates the `thermofooty` env
conda activate thermofooty
pip install -e . --no-deps # adds ThermoFooty itself in editable mode
# Point `data/` at wherever your fast storage lives.
export THERMOFOOTY_DATA_ROOT=/path/to/your/ThermoFooty
ln -sf "$THERMOFOOTY_DATA_ROOT" data
To refresh after a dependency bump: conda env update -f environment.yml --prune.
git clone https://github.com/zerotonin/ThermoFooty.git
cd ThermoFooty
pip install -e ".[all]"
# Point `data/` at wherever your fast storage lives.
export THERMOFOOTY_DATA_ROOT=/path/to/your/ThermoFooty
ln -sf "$THERMOFOOTY_DATA_ROOT" data
Prefer not to rely on the env var being exported in every shell?
Copy the committed template and fill it in — local_paths.json is
gitignored so absolute paths never leak into a commit:
cp local_paths.template.json local_paths.json
# edit local_paths.json -> set data_root to your absolute path
Resolution order, first hit wins: env var → local_paths.json →
in-repo data/ symlink.
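The resolution order can be pictured with the following sketch. It assumes `local_paths.json` is a flat JSON object with a `data_root` key, as the template comment suggests. The function name `resolve_data_root` and the error raised when nothing is found are hypothetical; the real logic lives in `thermofooty/config.py` and may differ.

```python
import json
import os
from pathlib import Path

ENV_VAR = "THERMOFOOTY_DATA_ROOT"


def resolve_data_root(repo_root: Path) -> Path:
    """Return the data root; the first source that yields a path wins."""
    # 1. Environment variable.
    env_value = os.environ.get(ENV_VAR)
    if env_value:
        return Path(env_value).expanduser()

    # 2. Gitignored local_paths.json next to the repository root.
    local_paths = repo_root / "local_paths.json"
    if local_paths.is_file():
        config = json.loads(local_paths.read_text(encoding="utf-8"))
        data_root = config.get("data_root")
        if data_root:
            return Path(data_root).expanduser()

    # 3. In-repo data/ symlink, followed to its target.
    symlink = repo_root / "data"
    if symlink.exists():
        return symlink.resolve()

    raise FileNotFoundError(
        f"No data root found: set {ENV_VAR}, fill in local_paths.json, "
        "or create the data/ symlink."
    )
```

Putting the environment variable first lets a single shell session or HPC job override the per-checkout setting without editing any file.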
Python ≥ 3.11 required (meteostat 2.x dropped 3.10). For the ERA5
fallback tier you additionally need a free
Copernicus CDS API key
in ~/.cdsapirc (gitignored).
If you use ThermoFooty in published work, please cite both the software (version DOI to appear on first GitHub Release) and the underlying OSF pre-registration:
Geurten, B. R. H. (2026). ThermoFooty: heat as an acute trigger of on-pitch aggression — pre-registered natural-experiment test on European soccer. OSF. https://doi.org/10.17605/OSF.IO/YZVAK
Full citation metadata in CITATION.cff. Companion citations for
ThermoStrife and
reRandomStats are listed
in the same file under references.
Bart R. H. Geurten — Department of Zoology, University of Otago, Dunedin, New Zealand. ORCID 0000-0002-1816-3241.
MIT — see LICENSE.