Commit 536aeb5

igerber and claude committed
synthetic-control: address CI codex R1 — document equal-weight predictor restriction (P1)
Add a **Note:** in REGISTRY §SyntheticControl (mirrored in docs/api/synthetic_control.rst) that predictor rows support only EQUAL-WEIGHT linear combinations — mean (k_s=1/T0), sum (k_s=1), and per-period outcome lags (identity) — and NOT ADH (2010) §2.3's general arbitrary-weight form Ȳ = Σ_s k_s Y_is (nor non-linear ops like median). The supported set still spans standard Synth::dataprep predictors.op + special.predictors usage; arbitrary-weight K_m is a deferred extension. Documents the restriction introduced when median was dropped, so it is no longer an undocumented methodology deviation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 8a70fa6 commit 536aeb5

2 files changed

Lines changed: 5 additions & 1 deletion

docs/api/synthetic_control.rst

Lines changed: 4 additions & 1 deletion
@@ -113,7 +113,10 @@ derivative-free Powell polish. ``v_method="custom"`` takes a user-supplied ``cus
 efficiency-only choice. Predictor/outcome aggregation also **fails closed** on any
 non-finite cell, whereas R ``dataprep`` uses ``na.rm=TRUE`` — restrict
 ``predictor_window`` / ``special_predictors`` periods to where a variable is observed.
-See ``docs/methodology/REGISTRY.md`` §SyntheticControl for all deviation labels.
+Predictor rows support only **equal-weight** linear combinations (``mean``, ``sum``,
+per-period lags); ADH (2010) §2.3's general weighted form ``Σ_s k_s Y_is`` with
+arbitrary ``k_s`` (and non-linear ops such as ``median``) is not accepted in this
+release. See ``docs/methodology/REGISTRY.md`` §SyntheticControl for all deviation labels.
 
 Example Usage
 -------------
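
To make the restriction in the hunk above concrete, here is a minimal sketch of the supported equal-weight aggregations written as explicit `k_s` weight vectors. The helper name `equal_weight_predictor` and its signature are hypothetical and are not the library's API; only the `mean` / `sum` weights and the fail-closed behaviour on non-finite cells are taken from the documented text.

```python
import numpy as np


def equal_weight_predictor(y_pre, op):
    """Aggregate one unit's pre-period values as sum_s k_s * Y_s.

    Hypothetical illustration only, not the library's implementation.
    Supports just the equal-weight cases: mean (k_s = 1/T0) and sum (k_s = 1).
    A per-period lag is the same thing over a one-period window (single k_s = 1).
    """
    y = np.asarray(y_pre, dtype=float)
    if y.size == 0:
        raise ValueError("empty predictor window")
    if not np.all(np.isfinite(y)):
        # Fail closed: no na.rm-style dropping of missing cells.
        raise ValueError("non-finite cell in predictor window")

    t0 = y.size
    if op == "mean":
        k = np.full(t0, 1.0 / t0)
    elif op == "sum":
        k = np.ones(t0)
    else:
        # Arbitrary k_s vectors and non-linear ops (e.g. median) are rejected.
        raise ValueError(f"unsupported op: {op!r}")
    return float(k @ y)


window = [2.0, 4.0]
print(equal_weight_predictor(window, "mean"))  # 3.0
print(equal_weight_predictor(window, "sum"))   # 6.0
print(equal_weight_predictor([5.0], "mean"))   # 5.0, a one-period lag
```

Passing `"median"` or a window containing `NaN` raises `ValueError` in this sketch, which is the behaviour the note describes for unsupported aggregations and partially missing windows.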

docs/methodology/REGISTRY.md

Lines changed: 1 addition & 0 deletions
@@ -1981,6 +1981,7 @@ Classic synthetic control (donor/unit weights only) for a single treated unit, d
 - **Note:** `V` is parametrized on the unit simplex via a softmax of an unconstrained vector (trace-normalization is identification-fixing, not a constraint loss); the multistart Nelder-Mead + derivative-free Powell polish approximates R's best-of-`optimx` behavior over the non-smooth outer objective.
 - **Note:** The 1×SD poor-fit threshold is a defensive implementation choice matching the `SyntheticDiD` convention; ADH 2010 gives only the qualitative guidance "do not use SCM when the fit is poor" (no numeric cutoff).
 - **Deviation from R:** `standardize="none"` disables predictor standardization entirely; R `Synth` always scales by the predictor SD. Provided for diagnostics; changes the geometry of the `V` objective.
+- **Note:** predictor rows support only **equal-weight** linear combinations of pre-period values — `mean` (`k_s = 1/T0`), `sum` (`k_s = 1`), and per-period outcome lags (identity, a single `k_s = 1`). ADH (2010) §2.3 defines the general form `Ȳ_i^{K_m} = Σ_s k_s Y_is` with *arbitrary* weights `k_s`; this release does NOT accept user-supplied non-uniform `K_m` weight vectors (and `median` and other non-linear aggregations are intentionally excluded). The supported set still spans the standard `Synth::dataprep` `predictors.op` + `special.predictors` usage; arbitrary-weight `K_m` is a deferred extension.
 - **Deviation from R:** predictor/outcome **aggregation fails closed on any non-finite (NaN/inf) cell**, whereas R `Synth::dataprep` hardwires `na.rm=TRUE` (aggregating over the observed cells of a partially-missing window). The fail-closed contract is deliberate: na-dropping silently aggregates different period subsets across units, yielding incomparable predictors with no warning. The analyst must restrict `predictor_window` / `special_predictors` / `pre_period_outcomes` periods (and the outcome panel) to where each variable is observed; both partially- and fully-missing windows raise `ValueError`. Only the row *ordering* matches `dataprep`, not the missing-data handling.
 
 **Reference implementation:** authors' `Synth` package for R/MATLAB/Stata (`Synth::synth`). **R-parity anchor:** the Basque Country study (Abadie-Gardeazabal 2003, `data("basque")`) — published synthetic = region 10 (Cataluña) 0.851 + region 14 (Madrid) 0.149, `loss.v` 0.0089. Two-tier test (`tests/test_methodology_synthetic_control.py`): Tier-1 feeds R's `solution.v` via `custom_v` → donor weights match to atol 1e-3 (deterministic); Tier-2 checks the nested fit in a band.
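
The first context note in this hunk says `V` is parametrized on the unit simplex via a softmax of an unconstrained vector. A minimal sketch of that mapping follows, assuming `V` is represented by its diagonal; the function name `softmax_simplex` is hypothetical and the library's actual code may differ.

```python
import numpy as np


def softmax_simplex(theta):
    """Map an unconstrained vector to positive weights that sum to 1.

    Hypothetical helper: illustrates the softmax parametrization only.
    Subtracting the max is a standard numerical-stability step and does
    not change the result.
    """
    theta = np.asarray(theta, dtype=float)
    z = np.exp(theta - theta.max())
    return z / z.sum()


theta = np.array([0.0, 1.0, -2.0])  # free parameters seen by the optimizer
v_diag = softmax_simplex(theta)     # diagonal of V, trace equal to 1
print(v_diag, v_diag.sum())
```

Because the output always lies on the simplex, a derivative-free optimizer such as Nelder-Mead can search over `theta` without any explicit constraint handling, which is consistent with the note's remark that trace normalization fixes identification rather than acting as a constraint.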