Commit 2b4cb8d

igerber and claude committed
tutorial: address CI codex P3 (round 2) — drop unseen-evidence appeal in holdout note
The holdout paragraph still referenced 'a separate sweep, not shown' which appeals to evidence the reader can't verify. Reframe purely as a mechanism/hypothesis to test on your own design, explicitly noting this notebook does not measure it. Markdown-only change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent a68f1b5 commit 2b4cb8d

1 file changed

Lines changed: 2 additions & 16 deletions

docs/tutorials/24_staggered_vs_collapsed_power.ipynb

@@ -799,21 +799,7 @@
 "cell_type": "markdown",
 "id": "9dca90b2",
 "metadata": {},
-"source": [
-"**Holdout size.** Geo experiments usually hold back only a few states. We fix the holdout at\n",
-"10 states here and don't sweep it, but the direction is worth knowing: a small control group\n",
-"hurts the *2×2's* standard error too, so shrinking the holdout doesn't widen CS's relative\n",
-"power gap — if anything it narrows it (a separate sweep, not shown in this notebook, bears\n",
-"this out). Treat this as design intuition, not a demonstrated rule.\n",
-"\n",
-"**A 50-state caveat: few clusters.** Our 2×2 helper already clusters by state\n",
-"(`cluster=\"unit\"`), and with ~50 states (only ~10 controls) cluster-robust SEs lean on\n",
-"large-sample approximations that are shaky at this scale. For a real 50-state test, prefer\n",
-"wild-cluster bootstrap or small-sample corrections: `DifferenceInDifferences` supports\n",
-"`inference=\"wild_bootstrap\"` (it resamples at the cluster level), and `CallawaySantAnna`\n",
-"supports a multiplier bootstrap via `n_bootstrap=`. See the estimator docstrings for the\n",
-"exact requirements."
-]
+"source": "**Holdout size.** Geo experiments usually hold back only a few states. We hold this fixed at\n10 and don't vary it. One mechanism to keep in mind if you do vary it on your own design: a\nsmall control group inflates the *2×2's* standard error too — not only CS's — so a smaller\nholdout won't necessarily widen the CS-vs-2×2 power gap. This notebook doesn't measure that,\nso treat it as a hypothesis to test, not a result.\n\n**A 50-state caveat: few clusters.** Our 2×2 helper already clusters by state\n(`cluster=\"unit\"`), and with ~50 states (only ~10 controls) cluster-robust SEs lean on\nlarge-sample approximations that are shaky at this scale. For a real 50-state test, prefer\nwild-cluster bootstrap or small-sample corrections: `DifferenceInDifferences` supports\n`inference=\"wild_bootstrap\"` (it resamples at the cluster level), and `CallawaySantAnna`\nsupports a multiplier bootstrap via `n_bootstrap=`. See the estimator docstrings for the\nexact requirements."
 },
 {
 "cell_type": "markdown",
@@ -877,4 +863,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 5
-}
+}
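
The mechanism named in the revised holdout note can be illustrated with a stylized calculation. This is a minimal sketch, not the notebook's 2×2 helper or `CallawaySantAnna`: it assumes a plain difference in group means with equal, independent unit-level noise, for which the standard error scales as sqrt(1/n_treated + 1/n_control). The function name `diff_in_means_se` and the holdout sizes are hypothetical, chosen only for illustration.

```python
import math


def diff_in_means_se(n_treated: int, n_control: int, sigma: float = 1.0) -> float:
    """SE of a difference in group means, assuming equal independent noise per unit."""
    return sigma * math.sqrt(1.0 / n_treated + 1.0 / n_control)


TOTAL_STATES = 50  # matches the ~50-state setting discussed in the note

# Hypothetical holdout sizes; the notebook itself fixes the holdout at 10.
ses = {}
for n_control in (25, 10, 5, 2):
    n_treated = TOTAL_STATES - n_control
    ses[n_control] = diff_in_means_se(n_treated, n_control)
    print(f"holdout={n_control:2d}  SE in units of sigma={ses[n_control]:.3f}")
```

Under these assumptions the standard error grows as the holdout shrinks, because the 1/n_control term dominates. The sketch says nothing about how the CS estimator responds to the same change, so it does not settle the direction of the CS-vs-2×2 power gap; that still has to be measured on a real design.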
