Skip to content

Commit 03fce79

Browse files
igerberclaude
andcommitted
prep_dgp: guarantee all (group, partition) cells in generate_ddd_panel_data
The R1 review identified that independent marginal sampling of `group` and `partition` could leave one of the four (group, partition) cells empty under valid inputs — e.g., `n_units=4, group_frac=partition_frac=0.25` rounds to `n_group_1=1` and `n_p1_g1=0`, so the (1, 1) cell collapses before `TripleDifference.fit`'s 2x2x2 cell-presence check. Switches to stratified allocation: assign `group` to its requested fraction, then within each group stratum, draw `partition=1` at `partition_frac`. Adds a targeted `ValueError` when the rounded cell counts would leave any (group, partition) cell empty (with the four cell counts in the error message so users can pick a feasible config). Adds two regression tests: - `test_infeasible_cell_counts_raise` exercises both the `n_units=4` small-marginal case and an `n_units=10, group_frac=0.1` case. - `test_smallest_feasible_config_populates_all_cells` verifies the smallest feasible config (`n_units=4, fracs=0.5`) yields one unit per cell and that `TripleDifference.fit(..., time="post")` succeeds on it (the contract the docstring advertises). Updates the `group_frac` / `partition_frac` docstring entries to describe the stratified allocation guarantee, and the `[Unreleased]` CHANGELOG entry to mention the cell-coverage invariant. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent b23347b commit 03fce79

3 files changed

Lines changed: 96 additions & 8 deletions

File tree

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
88
## [Unreleased]
99

1010
### Added
11-
- **`generate_ddd_panel_data` — panel-structured DGP for Triple-Difference power analysis** (`diff_diff/prep_dgp.py`). New public function exported from `diff_diff` and `diff_diff.prep` for panel DDD simulations. Cross-sectional `generate_ddd_data` remains available unchanged. Produces a balanced panel of `n_units × n_periods` with two unit-level binary dimensions (`group`, `partition`) and a derived `post = 1[period >= treatment_period]` indicator; columns: `unit, period, outcome, group, partition, post, treated, true_effect` (+ `x1, x2` when `add_covariates=True`). DDD-CPT identification holds because the `group * partition` interaction enters as a unit-level (time-invariant) term, leaving the triple-interaction `treatment_effect * group * partition * post` as the sole source of differential group × partition trend. Compatible with `TripleDifference.fit(..., time="post")` directly — users get panel-realistic unit fixed effects and within-unit serial correlation while the binary 2×2×2 estimator surface is unchanged. Validates `1 <= treatment_period < n_periods`, `group_frac` and `partition_frac` strictly in `(0, 1)`, and `n_units >= 4`. Deterministic recovery (`noise_sd=0`) matches `treatment_effect` to ~1e-15 (covered by `tests/test_prep.py::TestGenerateDddPanelData`, 14 tests). `power.simulate_power` is NOT yet auto-routed to the panel DGP for `TripleDifference` (the existing `_ddd_dgp_kwargs` registry entry still ignores `n_periods` and the existing `_check_ddd_dgp_compat` warning still fires on non-default kwargs) — that wiring is tracked as a follow-up in TODO.md.
11+
- **`generate_ddd_panel_data` — panel-structured DGP for Triple-Difference power analysis** (`diff_diff/prep_dgp.py`). New public function exported from `diff_diff` and `diff_diff.prep` for panel DDD simulations. Cross-sectional `generate_ddd_data` remains available unchanged. Produces a balanced panel of `n_units × n_periods` with two unit-level binary dimensions (`group`, `partition`) and a derived `post = 1[period >= treatment_period]` indicator; columns: `unit, period, outcome, group, partition, post, treated, true_effect` (+ `x1, x2` when `add_covariates=True`). DDD-CPT identification holds because the `group * partition` interaction enters as a unit-level (time-invariant) term, leaving the triple-interaction `treatment_effect * group * partition * post` as the sole source of differential group × partition trend. Compatible with `TripleDifference.fit(..., time="post")` directly — users get panel-realistic unit fixed effects and within-unit serial correlation while the binary 2×2×2 estimator surface is unchanged. **Stratified allocation:** the partition split is drawn stratified-by-group at the requested `partition_frac` so every `(group, partition)` cell receives at least one unit; a targeted `ValueError` is raised at fit-time when the rounded cell counts (`n_units`, `group_frac`, `partition_frac`) would leave any cell empty. This guarantees the 2x2x2 DDD surface is populated for any valid input — independent marginal sampling (the cross-sectional `generate_ddd_data` convention) could collapse cells when marginals are small (e.g., `n_units=4, group_frac=partition_frac=0.25`). Validates `1 <= treatment_period < n_periods`, `group_frac` and `partition_frac` strictly in `(0, 1)`, and `n_units >= 4`. Deterministic recovery (`noise_sd=0`) matches `treatment_effect` to ~1e-15 (covered by `tests/test_prep.py::TestGenerateDddPanelData`, 16 tests including infeasible-config rejection and smallest-feasible-config round-trip through `TripleDifference.fit`). `power.simulate_power` is NOT yet auto-routed to the panel DGP for `TripleDifference` (the existing `_ddd_dgp_kwargs` registry entry still ignores `n_periods` and the existing `_check_ddd_dgp_compat` warning still fires on non-default kwargs) — that wiring is tracked as a follow-up in TODO.md.
1212
- **`SpilloverDiD` — ring-indicator spillover-aware DiD (Butts 2021).** New standalone estimator at `diff_diff/spillover.py` implementing two-stage Gardner methodology with ring-indicator covariates that identify direct effect on treated (`tau_total`) alongside per-ring spillover effects on near-control units (`delta_j`). Documented synthesis of ingredients (no single published software covers the exact recipe — `did2s` implements Gardner two-stage without rings; the Butts ring estimator has no R/Stata package): Butts (2021) Section 5 / Table 2 identification, Gardner (2022) two-stage residualize-then-fit, and the Conley spatial-HAC vcov shipped in 3.3.3. Handles both panel non-staggered (Equations 5/6/8) and Section 5 staggered timing in one estimator — non-staggered is the special case where all treated units share an onset time. **API:** `SpilloverDiD(rings=[0, 50, 100, 200], conley_coords=("lat","lon"), ...).fit(data, outcome="y", unit="unit", time="t", treatment="D")` (binary D auto-converted to `first_treat`) or `.fit(..., first_treat="first_treat")` (Gardner convention). Result: `SpilloverDiDResults(DiDResults)` with `.att` = `tau_total`, `.spillover_effects` (per-ring `pd.DataFrame` with `coef`/`se`/`t_stat`/`p_value`/`ci_low`/`ci_high`), `.ring_breakpoints`, `.d_bar`, `.n_units_ever_in_ring`, `.n_far_away_obs`, `.is_staggered`. `.coefficients` exposes all `(1+K)` stage-2 entries (`"treatment"` + `"_spillover_<ring_label>"`) plus an `"ATT"` alias keyed to vcov columns. **Methodology spec (committed):** stage-2 regressor is the time-varying `(1 - D_it) * Ring_{it,j}` form (paper page 12's `S_it = S_i * 1{t >= t_treat}` notation; Section 5 Table 2's `S^k_{it}` / `Ring^k_{it,j}`). Reading the literal unit-static `(1 - D_it) * S_i` from Equation 5 is algebraically rank-deficient under TWFE (`(1-D_it) * S_i = S_i - D_it`, with `S_i` absorbed by `mu_i`, leaving `-D_it`); only the time-varying form supports the paper's identification (Proposition 2.3). Stage-1 subsample uses Butts' STRICTER `Omega_0 = {D_it = 0 AND S_it = 0}` (untreated AND unexposed), not TwoStageDiD's `{D_it = 0}` alone — this prevents spillover-contaminated near-controls in pre/post periods from biasing the time FE. **Gardner identity (non-staggered):** a 20-seed deterministic regression test pins `SpilloverDiD.att` against a direct single-stage TWFE ring regression on the full sample (`y ~ mu_i + lambda_t + tau * D_it + sum_j delta_j * (1 - D_it) * Ring_{it,j}`) at `atol=1e-10` — empirically bit-identical, so the reported non-staggered `tau_total` IS the Butts Eqs. 4-6 estimator. **Identification-check policy (period strict, unit warn-and-drop, plus connectivity):** every period must have at least one Omega_0 row (hard `ValueError` — dropping a period removes all units' cross-time identification). Units lacking Omega_0 rows (e.g. baseline-treated units with `D_it = 1` at every observed `t`) are warned-and-dropped: their unit FE is NaN, residualization writes NaN on their rows, and the downstream finite-mask path excludes them from stage 2 — mirrors `TwoStageDiD`'s always-treated convention. Additionally, the supported-units bipartite graph (units linked by shared Omega_0 periods) must form a single connected component; `K > 1` components raise `ValueError` because the FE solver would return only component-specific constants and residualization would silently mix them across components (defense-in-depth — under absorbing treatment the disconnected case may be unreachable through the upstream validators, but the check future-proofs Wave B follow-ups). **Public API restrictions (Wave B MVP):** `covariates=` raises `NotImplementedError` because Gardner-style two-stage requires covariate effects estimated on the untreated-and-unexposed subsample at stage 1 (appending raw covariates only at stage 2 silently biases `tau_total` / `delta_j` on panels with time-varying covariates); non-absorbing / reversible treatment patterns (e.g. `[0, 1, 0]`) raise `ValueError` rather than being silently coerced into "treated from first 1 onward"; non-constant `first_treat` values across rows of the same unit raise `ValueError`; `conley_coords` is required on every fit path (not just `vcov_type="conley"`) because ring construction always uses it. **Far-away control identification:** uses CURRENT-period untreated status (`D_it = 0`) rather than never-treated-only, so all-eventually-treated staggered designs (no never-treated units) can identify the counterfactual via not-yet-treated far-away rows. **Variance (Wave B MVP):** stage-2 OLS variance via `solve_ols` (HC1 / Conley / cluster paths all flow through). The Gardner GMM first-stage uncertainty correction is NOT applied at stage 2 in this PR (documented limitation; planned follow-up extends `two_stage.py::_compute_gmm_variance` to accept a Conley kernel matrix in place of HC1's identity at the influence-function outer-product step). **Deferred features (planned follow-ups):** `event_study=True` per-event-time × ring coefficients (Butts Table 2), `survey_design=` integration, `ring_method="count"` (count-of-treated-in-ring), data-driven `d_bar` selection (Butts 2021b / Butts 2023 JUE Insight), Gardner GMM first-stage correction at stage 2, sparse staggered ring-distance path. **Tests:** `tests/test_spillover.py` (157 tests across ring-construction primitives, validators, fit integration, raw-data invariant, identification MC — non-staggered DGP at 50 seeds + 200-seed `@pytest.mark.slow` variant recovers both `tau_total` and `delta_1`; staggered DGP at 30 seeds anchors both `tau_total` and `delta_1` — Conley plumbing (verifies `solve_ols` is called with `vcov_type="conley"` + Conley kwargs, no silent HC1 fallback), Gardner identity bit-identity, coefficients-vs-vcov alignment, warn-and-drop, rank_deficient_action validation, Omega_0 bipartite-graph connectivity, anticipation behavior on both fit paths). DGP factories `tests/_dgp_utils.py::generate_butts_nonstaggered_dgp` / `generate_butts_staggered_dgp` satisfy Butts Assumptions 1/3/5/7 by construction.
1313
- **`ChaisemartinDHaultfoeuille.predict_het` × `placebo`: R-parity on both global and per-path surfaces.** R-verified — `did_multiplegt_dyn(predict_het, placebo)` emits heterogeneity OLS results on backward (placebo) horizons via R's `DIDmultiplegtDYN:::did_multiplegt_main` placebo block (`effect = matrix(-i, ...)` rbind site); the same block runs per-by_level under `did_multiplegt_dyn(by_path, predict_het, placebo)`, so both global `res$results$predict_het` and per-by_level `res$by_level_i$results$predict_het` slots emit backward rows. R's predict_het syntax with `placebo > 0` requires the `c(-1)` sentinel in the horizon vector to trigger "compute heterogeneity for ALL forward (1..effects) AND ALL placebo (1..placebo) positions" — passing positive-only horizons errors with "specified numbers in predict_het that exceed the number of placebos". Python mirrors via `_compute_heterogeneity_test(..., placebo=L_max)` (set automatically from `self.placebo` at both global and per-path call sites in `fit()`) — the function iterates forward (1..L_max) and backward (-1..-L_max) horizons in a single loop with an explicit `out_idx < 0` eligibility guard for backward horizons whose `F_g` is too small (would otherwise silently misread `N_mat` via numpy negative indexing). `results.heterogeneity_effects` uses negative-int keys for backward horizons; `path_heterogeneity_effects` does the same per path. Placebo rows in `to_dataframe(level="by_path")` have non-NaN `het_*` columns when `placebo=True` and `heterogeneity=` are both set. **Survey gate (warn + skip):** `survey_design + placebo + heterogeneity` emits a `UserWarning` at fit-time and falls back to forward-horizon-only heterogeneity on both surfaces — the Binder TSL cell-period allocator's REGISTRY justification is tied to **post-period** attribution; backward-horizon attribution puts ψ_g mass on a pre-period cell, a separate library-extension claim that needs its own derivation. Forward-horizon `predict_het + survey_design` continues to work unchanged on both global and per-path surfaces. The function-level `_compute_heterogeneity_test` keeps a per-iteration `NotImplementedError` backstop for direct callers that bypass fit(). Pre-period allocator derivation deferred to a follow-up methodology PR (tracked in TODO.md). R parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneityWithPlacebo` (scenario 23, `multi_path_reversible_predict_het_with_placebo_global`, `placebo=2, effects=3, no by_path`) and `::TestDCDHDynRParityByPathHeterogeneityWithPlacebo` (scenario 22, same DGP plus `by_path=3`); pinned at `BETA_RTOL=1e-6` / `SE_RTOL=1e-5` for `beta` / `se` / `t_stat` / `n_obs` and `INFERENCE_RTOL=1e-4` for `p_value` / `conf_int` across 3 paths × (3 forward + 2 placebo) = 15 horizons + 1 global × 5 horizons. Cross-surface invariants regression-tested at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPredictHetPlacebo` (placebo het column population, survey-gate warn+skip behavior, forward+survey anti-regression, `out_idx<0` eligibility guard, single-path telescope `path_heterogeneity_effects[(only_path,)] == heterogeneity_effects` bit-exactly, summary rendering, direct-call `NotImplementedError` backstop). Closes TODO #422.
1414

diff_diff/prep_dgp.py

Lines changed: 42 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1188,10 +1188,16 @@ def generate_ddd_panel_data(
11881188
Period (0-indexed) at which ``post`` switches from 0 to 1.
11891189
Must satisfy ``1 <= treatment_period < n_periods``.
11901190
group_frac : float, default=0.5
1191-
Fraction of units with ``group=1``. Must be in ``(0, 1)``.
1191+
Fraction of units with ``group=1``. Must be in ``(0, 1)``. The
1192+
partition split is then drawn stratified-by-group at the requested
1193+
``partition_frac`` so every (group, partition) cell receives at
1194+
least one unit; a ``ValueError`` is raised when the rounded cell
1195+
counts would leave any cell empty.
11921196
partition_frac : float, default=0.5
1193-
Fraction of units with ``partition=1`` (assigned independently of
1194-
``group``). Must be in ``(0, 1)``.
1197+
Fraction of units with ``partition=1`` within each ``group``
1198+
stratum. Must be in ``(0, 1)``. The stratified allocation is what
1199+
makes TripleDifference.fit's 2x2x2 surface populated for any valid
1200+
``(n_units, group_frac, partition_frac)``.
11951201
treatment_effect : float, default=2.0
11961202
True ATT for the triple-interaction cell (group=1, partition=1,
11971203
post=1).
@@ -1270,16 +1276,45 @@ def generate_ddd_panel_data(
12701276
f"got {n_units}."
12711277
)
12721278

1279+
# Stratified allocation that guarantees every (group, partition) cell receives
1280+
# at least one unit. Independent marginal sampling could collapse a cell when
1281+
# rounded marginals are small (e.g., n_units=4, group_frac=partition_frac=0.25
1282+
# yields n_group_1=1 with no room for both partition values within group=1).
1283+
# TripleDifference.fit requires all 8 G×P×T cells, so we validate first.
1284+
n_group_1 = int(round(n_units * group_frac))
1285+
n_group_0 = n_units - n_group_1
1286+
n_p1_g0 = int(round(n_group_0 * partition_frac))
1287+
n_p1_g1 = int(round(n_group_1 * partition_frac))
1288+
n_p0_g0 = n_group_0 - n_p1_g0
1289+
n_p0_g1 = n_group_1 - n_p1_g1
1290+
cell_counts = {
1291+
(0, 0): n_p0_g0,
1292+
(0, 1): n_p1_g0,
1293+
(1, 0): n_p0_g1,
1294+
(1, 1): n_p1_g1,
1295+
}
1296+
if min(cell_counts.values()) < 1:
1297+
raise ValueError(
1298+
f"generate_ddd_panel_data requires every (group, partition) cell to "
1299+
f"contain at least one unit so TripleDifference.fit's 2x2x2 surface is "
1300+
f"populated. With n_units={n_units}, group_frac={group_frac}, "
1301+
f"partition_frac={partition_frac}, the rounded cell counts are "
1302+
f"(group, partition): (0,0)={n_p0_g0}, (0,1)={n_p1_g0}, "
1303+
f"(1,0)={n_p0_g1}, (1,1)={n_p1_g1}. Increase n_units or move "
1304+
f"group_frac/partition_frac closer to 0.5 so each cell receives "
1305+
f">=1 unit."
1306+
)
1307+
12731308
rng = np.random.default_rng(seed)
12741309

1275-
n_group_1 = max(1, min(n_units - 1, int(round(n_units * group_frac))))
1276-
n_partition_1 = max(1, min(n_units - 1, int(round(n_units * partition_frac))))
12771310
group = np.zeros(n_units, dtype=int)
12781311
group_idx = rng.choice(n_units, size=n_group_1, replace=False)
12791312
group[group_idx] = 1
12801313
partition = np.zeros(n_units, dtype=int)
1281-
partition_idx = rng.choice(n_units, size=n_partition_1, replace=False)
1282-
partition[partition_idx] = 1
1314+
g0_units = np.where(group == 0)[0]
1315+
g1_units = np.where(group == 1)[0]
1316+
partition[rng.choice(g0_units, size=n_p1_g0, replace=False)] = 1
1317+
partition[rng.choice(g1_units, size=n_p1_g1, replace=False)] = 1
12831318

12841319
unit_fe = rng.normal(0.0, unit_fe_sd, size=n_units)
12851320

tests/test_prep.py

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1234,6 +1234,59 @@ def test_n_units_validation(self):
12341234
with pytest.raises(ValueError, match="n_units"):
12351235
generate_ddd_panel_data(n_units=3, seed=42)
12361236

1237+
def test_infeasible_cell_counts_raise(self):
1238+
"""Configs that round to empty (group, partition) cells fail fast."""
1239+
from diff_diff.prep import generate_ddd_panel_data
1240+
1241+
# n_units=4, group_frac=0.25 → n_group_1=1; partition_frac=0.25 inside
1242+
# the 1-unit group=1 stratum rounds to 0, leaving the (1, 1) cell empty.
1243+
with pytest.raises(ValueError, match=r"every \(group, partition\) cell"):
1244+
generate_ddd_panel_data(
1245+
n_units=4,
1246+
group_frac=0.25,
1247+
partition_frac=0.25,
1248+
seed=42,
1249+
)
1250+
# n_units=10, group_frac=0.1 → n_group_1=1 again; same failure mode for
1251+
# the (1, 1) cell.
1252+
with pytest.raises(ValueError, match=r"every \(group, partition\) cell"):
1253+
generate_ddd_panel_data(
1254+
n_units=10,
1255+
group_frac=0.1,
1256+
partition_frac=0.5,
1257+
seed=42,
1258+
)
1259+
1260+
def test_smallest_feasible_config_populates_all_cells(self):
1261+
"""n_units=4 with fracs=0.5 yields one unit per (group, partition) cell."""
1262+
from diff_diff import TripleDifference
1263+
from diff_diff.prep import generate_ddd_panel_data
1264+
1265+
data = generate_ddd_panel_data(
1266+
n_units=4,
1267+
n_periods=4,
1268+
treatment_period=2,
1269+
group_frac=0.5,
1270+
partition_frac=0.5,
1271+
noise_sd=0.0,
1272+
seed=42,
1273+
)
1274+
# All four (group, partition) cells must receive exactly one unit.
1275+
unit_cells = data.groupby("unit")[["group", "partition"]].first()
1276+
cell_counts = unit_cells.groupby(["group", "partition"]).size()
1277+
assert len(cell_counts) == 4
1278+
assert (cell_counts == 1).all()
1279+
# TripleDifference.fit (which requires all 8 G×P×T cells) must succeed
1280+
# on the smallest feasible config — validates the public contract
1281+
# advertised in the docstring's compatibility paragraph.
1282+
TripleDifference().fit(
1283+
data,
1284+
outcome="outcome",
1285+
group="group",
1286+
partition="partition",
1287+
time="post",
1288+
)
1289+
12371290
def test_reproducibility(self):
12381291
"""Same seed produces identical DataFrames."""
12391292
from diff_diff.prep import generate_ddd_panel_data

0 commit comments

Comments
 (0)