|
2 | 2 | "cells": [ |
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | | - "id": "bc1e6c9b", |
| 5 | + "id": "269c802e", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | 8 | "# Tutorial 24: Staggered Rollout or a Simple 2×2? A Power-Analysis Decision Guide\n", |
|
34 | 34 | { |
35 | 35 | "cell_type": "code", |
36 | 36 | "execution_count": 1, |
37 | | - "id": "000ed3c8", |
| 37 | + "id": "0c4f03f8", |
38 | 38 | "metadata": { |
39 | 39 | "execution": { |
40 | | - "iopub.execute_input": "2026-05-31T16:40:39.280684Z", |
41 | | - "iopub.status.busy": "2026-05-31T16:40:39.280504Z", |
42 | | - "iopub.status.idle": "2026-05-31T16:40:40.308671Z", |
43 | | - "shell.execute_reply": "2026-05-31T16:40:40.308369Z" |
| 40 | + "iopub.execute_input": "2026-05-31T16:59:51.198041Z", |
| 41 | + "iopub.status.busy": "2026-05-31T16:59:51.197962Z", |
| 42 | + "iopub.status.idle": "2026-05-31T16:59:52.322294Z", |
| 43 | + "shell.execute_reply": "2026-05-31T16:59:52.321996Z" |
44 | 44 | } |
45 | 45 | }, |
46 | 46 | "outputs": [], |
|
114 | 114 | }, |
115 | 115 | { |
116 | 116 | "cell_type": "markdown", |
117 | | - "id": "a2db376f", |
| 117 | + "id": "8400e7f2", |
118 | 118 | "metadata": {}, |
119 | 119 | "source": [ |
120 | 120 | "## The scenario\n", |
|
131 | 131 | { |
132 | 132 | "cell_type": "code", |
133 | 133 | "execution_count": 2, |
134 | | - "id": "9d68d0d4", |
| 134 | + "id": "08de7530", |
135 | 135 | "metadata": { |
136 | 136 | "execution": { |
137 | | - "iopub.execute_input": "2026-05-31T16:40:40.310130Z", |
138 | | - "iopub.status.busy": "2026-05-31T16:40:40.309978Z", |
139 | | - "iopub.status.idle": "2026-05-31T16:40:40.418677Z", |
140 | | - "shell.execute_reply": "2026-05-31T16:40:40.418393Z" |
| 137 | + "iopub.execute_input": "2026-05-31T16:59:52.323568Z", |
| 138 | + "iopub.status.busy": "2026-05-31T16:59:52.323415Z", |
| 139 | + "iopub.status.idle": "2026-05-31T16:59:52.434068Z", |
| 140 | + "shell.execute_reply": "2026-05-31T16:59:52.433793Z" |
141 | 141 | } |
142 | 142 | }, |
143 | 143 | "outputs": [ |
|
272 | 272 | }, |
273 | 273 | { |
274 | 274 | "cell_type": "markdown", |
275 | | - "id": "5c179b0a", |
| 275 | + "id": "99001403", |
276 | 276 | "metadata": {}, |
277 | 277 | "source": [ |
278 | 278 | "## 1. \"Simplifying\" silently changes the question\n", |
|
293 | 293 | { |
294 | 294 | "cell_type": "code", |
295 | 295 | "execution_count": 3, |
296 | | - "id": "e8b4ca62", |
| 296 | + "id": "8496ad42", |
297 | 297 | "metadata": { |
298 | 298 | "execution": { |
299 | | - "iopub.execute_input": "2026-05-31T16:40:40.419755Z", |
300 | | - "iopub.status.busy": "2026-05-31T16:40:40.419678Z", |
301 | | - "iopub.status.idle": "2026-05-31T16:40:40.440839Z", |
302 | | - "shell.execute_reply": "2026-05-31T16:40:40.440602Z" |
| 299 | + "iopub.execute_input": "2026-05-31T16:59:52.435188Z", |
| 300 | + "iopub.status.busy": "2026-05-31T16:59:52.435105Z", |
| 301 | + "iopub.status.idle": "2026-05-31T16:59:52.459409Z", |
| 302 | + "shell.execute_reply": "2026-05-31T16:59:52.459137Z" |
303 | 303 | } |
304 | 304 | }, |
305 | 305 | "outputs": [ |
|
334 | 334 | { |
335 | 335 | "cell_type": "code", |
336 | 336 | "execution_count": 4, |
337 | | - "id": "92b7f831", |
| 337 | + "id": "77f02fec", |
338 | 338 | "metadata": { |
339 | 339 | "execution": { |
340 | | - "iopub.execute_input": "2026-05-31T16:40:40.441827Z", |
341 | | - "iopub.status.busy": "2026-05-31T16:40:40.441747Z", |
342 | | - "iopub.status.idle": "2026-05-31T16:40:40.505346Z", |
343 | | - "shell.execute_reply": "2026-05-31T16:40:40.505071Z" |
| 340 | + "iopub.execute_input": "2026-05-31T16:59:52.460449Z", |
| 341 | + "iopub.status.busy": "2026-05-31T16:59:52.460368Z", |
| 342 | + "iopub.status.idle": "2026-05-31T16:59:52.527876Z", |
| 343 | + "shell.execute_reply": "2026-05-31T16:59:52.527624Z" |
344 | 344 | } |
345 | 345 | }, |
346 | 346 | "outputs": [ |
|
381 | 381 | }, |
382 | 382 | { |
383 | 383 | "cell_type": "markdown", |
384 | | - "id": "794e7433", |
| 384 | + "id": "58d4791c", |
385 | 385 | "metadata": {}, |
386 | 386 | "source": [ |
387 | 387 | "The slower the rollout, the more dilution. Let's sweep rollout speed and, across many\n", |
|
392 | 392 | { |
393 | 393 | "cell_type": "code", |
394 | 394 | "execution_count": 5, |
395 | | - "id": "4521cf5d", |
| 395 | + "id": "cd8bb89e", |
396 | 396 | "metadata": { |
397 | 397 | "execution": { |
398 | | - "iopub.execute_input": "2026-05-31T16:40:40.506364Z", |
399 | | - "iopub.status.busy": "2026-05-31T16:40:40.506293Z", |
400 | | - "iopub.status.idle": "2026-05-31T16:40:48.949713Z", |
401 | | - "shell.execute_reply": "2026-05-31T16:40:48.949465Z" |
| 398 | + "iopub.execute_input": "2026-05-31T16:59:52.528877Z", |
| 399 | + "iopub.status.busy": "2026-05-31T16:59:52.528803Z", |
| 400 | + "iopub.status.idle": "2026-05-31T17:00:01.603356Z", |
| 401 | + "shell.execute_reply": "2026-05-31T17:00:01.603100Z" |
402 | 402 | } |
403 | 403 | }, |
404 | 404 | "outputs": [ |
|
502 | 502 | }, |
503 | 503 | { |
504 | 504 | "cell_type": "markdown", |
505 | | - "id": "3a247dcd", |
| 505 | + "id": "43a88906", |
506 | 506 | "metadata": {}, |
507 | 507 | "source": [ |
508 | 508 | "Why does the slow rollout dilute so much? The 2×2 averages over **all 16 post-weeks**, but\n", |
|
517 | 517 | }, |
518 | 518 | { |
519 | 519 | "cell_type": "markdown", |
520 | | - "id": "2341d75c", |
| 520 | + "id": "c6236105", |
521 | 521 | "metadata": {}, |
522 | 522 | "source": [ |
523 | 523 | "## 2. So does CS cost you power? (the headline)\n", |
|
534 | 534 | { |
535 | 535 | "cell_type": "code", |
536 | 536 | "execution_count": 6, |
537 | | - "id": "52f3e96c", |
| 537 | + "id": "823fbfb7", |
538 | 538 | "metadata": { |
539 | 539 | "execution": { |
540 | | - "iopub.execute_input": "2026-05-31T16:40:48.950751Z", |
541 | | - "iopub.status.busy": "2026-05-31T16:40:48.950663Z", |
542 | | - "iopub.status.idle": "2026-05-31T16:40:51.826627Z", |
543 | | - "shell.execute_reply": "2026-05-31T16:40:51.826336Z" |
| 540 | + "iopub.execute_input": "2026-05-31T17:00:01.604532Z", |
| 541 | + "iopub.status.busy": "2026-05-31T17:00:01.604442Z", |
| 542 | + "iopub.status.idle": "2026-05-31T17:00:04.651956Z", |
| 543 | + "shell.execute_reply": "2026-05-31T17:00:04.651680Z" |
544 | 544 | } |
545 | 545 | }, |
546 | 546 | "outputs": [ |
|
570 | 570 | { |
571 | 571 | "cell_type": "code", |
572 | 572 | "execution_count": 7, |
573 | | - "id": "fb4e7123", |
| 573 | + "id": "5f8e0c4b", |
574 | 574 | "metadata": { |
575 | 575 | "execution": { |
576 | | - "iopub.execute_input": "2026-05-31T16:40:51.827731Z", |
577 | | - "iopub.status.busy": "2026-05-31T16:40:51.827661Z", |
578 | | - "iopub.status.idle": "2026-05-31T16:41:42.127057Z", |
579 | | - "shell.execute_reply": "2026-05-31T16:41:42.126798Z" |
| 576 | + "iopub.execute_input": "2026-05-31T17:00:04.653058Z", |
| 577 | + "iopub.status.busy": "2026-05-31T17:00:04.652982Z", |
| 578 | + "iopub.status.idle": "2026-05-31T17:00:56.428034Z", |
| 579 | + "shell.execute_reply": "2026-05-31T17:00:56.427769Z" |
580 | 580 | } |
581 | 581 | }, |
582 | 582 | "outputs": [ |
|
706 | 706 | }, |
707 | 707 | { |
708 | 708 | "cell_type": "markdown", |
709 | | - "id": "cd29c79d", |
| 709 | + "id": "86537377", |
710 | 710 | "metadata": {}, |
711 | 711 | "source": [ |
712 | 712 | "There's the answer to \"how does the MDE change as the rollout gets more staggered?\" Reading\n", |
|
734 | 734 | }, |
735 | 735 | { |
736 | 736 | "cell_type": "markdown", |
737 | | - "id": "c66efe44", |
| 737 | + "id": "f59b0cbf", |
738 | 738 | "metadata": {}, |
739 | 739 | "source": [ |
740 | 740 | "## A couple of refinements\n", |
|
751 | 751 | { |
752 | 752 | "cell_type": "code", |
753 | 753 | "execution_count": 8, |
754 | | - "id": "1901e49c", |
| 754 | + "id": "9a902d30", |
755 | 755 | "metadata": { |
756 | 756 | "execution": { |
757 | | - "iopub.execute_input": "2026-05-31T16:41:42.128160Z", |
758 | | - "iopub.status.busy": "2026-05-31T16:41:42.128089Z", |
759 | | - "iopub.status.idle": "2026-05-31T16:41:42.978026Z", |
760 | | - "shell.execute_reply": "2026-05-31T16:41:42.977753Z" |
| 757 | + "iopub.execute_input": "2026-05-31T17:00:56.429298Z", |
| 758 | + "iopub.status.busy": "2026-05-31T17:00:56.429210Z", |
| 759 | + "iopub.status.idle": "2026-05-31T17:00:57.301865Z", |
| 760 | + "shell.execute_reply": "2026-05-31T17:00:57.301603Z" |
761 | 761 | } |
762 | 762 | }, |
763 | 763 | "outputs": [ |
|
797 | 797 | }, |
798 | 798 | { |
799 | 799 | "cell_type": "markdown", |
800 | | - "id": "7d119ff3", |
| 800 | + "id": "9dca90b2", |
801 | 801 | "metadata": {}, |
802 | 802 | "source": [ |
803 | | - "**Holdout size.** Geo experiments usually hold back only a few states, and that's actually\n", |
804 | | - "forgiving to CS: a small control group hurts the 2×2's standard error too, so CS's\n", |
805 | | - "*relative* power cost is smallest exactly when the holdout is small.\n", |
| 803 | + "**Holdout size.** Geo experiments usually hold back only a few states. We fix the holdout at\n", |
| 804 | + "10 states here and don't sweep it, but the direction is worth knowing: a small control group\n", |
| 805 | + "hurts the *2×2's* standard error too, so shrinking the holdout doesn't widen CS's relative\n", |
| 806 | + "power gap — if anything it narrows it (a separate sweep, not shown in this notebook, bears\n", |
| 807 | + "this out). Treat this as design intuition, not a demonstrated rule.\n", |
806 | 808 | "\n", |
807 | 809 | "**A 50-state caveat: few clusters.** Our 2×2 helper already clusters by state\n", |
808 | 810 | "(`cluster=\"unit\"`), and with ~50 states (only ~10 controls) cluster-robust SEs lean on\n", |
|
815 | 817 | }, |
816 | 818 | { |
817 | 819 | "cell_type": "markdown", |
818 | | - "id": "5535009f", |
| 820 | + "id": "9b987d86", |
819 | 821 | "metadata": {}, |
820 | 822 | "source": [ |
821 | 823 | "## Decision guide\n", |
|
836 | 838 | }, |
837 | 839 | { |
838 | 840 | "cell_type": "markdown", |
839 | | - "id": "f2937f6d", |
| 841 | + "id": "52d8c4b6", |
840 | 842 | "metadata": {}, |
841 | 843 | "source": [ |
842 | 844 | "## Run this on your own design\n", |
|