forked from OHDSI/ShinyDeploy
-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathMyArticle.Rmd
707 lines (602 loc) · 35.1 KB
/
MyArticle.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
---
params:
databaseId: "CCAE"
targetId: 1308842
comparatorId: 40226742
outcomeId: 2
indicationId: "Hypertension"
root: "."
primary: 1
ot: 1
itt: 2
matchOt: 3
matchItt: 4
title: "Acute myocardial infarction risk in new-users of valsartan versus olmesartan for hypertension in the JMDC database"
abstract: "We conduct a large-scale study on the incidence of acute myocardial infarction among new users of valsartan and olmesartan from 2006 to 2017 in the JMDC database. Outcomes of interest are estimates of the hazard ratio (HR) for incident events between comparable new users under on-treatment and intent-to-treat risk window assumptions. We identify 7316 valsartan and 9744 olmesartan patients for the on-treatment design, totaling 9093 and 9673 patient-years of observation, and 11 and 16 events respectively. We control for measured confounding using propensity score trimming and stratification or matching based on an expansive propensity score model that includes all measured patient features before treatment initiation. We account for unmeasured confounding using negative and positive controls to estimate and adjust for residual systematic bias in the study design and data source, providing calibrated confidence intervals and p-values. In terms of acute myocardial infarction, valsartan has a similar risk as compared to olmesartan [HR: 0.88, 95% confidence interval (CI) 0.39 - 2.01]."
save: NULL
load: NULL
title: "`r params$title`"
# Corresponding author: Martijn J. Schuemie, Janssen R&D, 1125 Trenton Harbourton Road, Titusville, NJ, 08560, Phone: +31 631793897, [email protected]
author:
- name: Martijn J. Schuemie
affiliation: a,b,c
- name: Patrick B. Ryan
affiliation: a,b,d
- name: Seng Chan You
affiliation: a,e
- name: Nicole Pratt
affiliation: a,f
- name: David Madigan
affiliation: a,g
- name: George Hripcsak
affiliation: a,d
- name: Marc A. Suchard
affiliation: a,c,h,i
address:
- code: a
address: Observational Health Data Sciences and Informatics, New York, NY, USA
- code: b
address: Janssen Research & Development, Titusville, NJ, USA
- code: c
address: Department of Biostatistics, University of California, Los Angeles, CA
- code: d
address: Department of Biomedical Informatics, Columbia University, New York, NY
- code: e
address: Department of Biomedical Informatics, Ajou University, Suwon, South Korea
- code: f
address: Sansom Institute, University of South Australia, Adelaide SA, Australia
- code: g
address: Department of Statistics, Columbia University, New York, NY
- code: h
address: Department of Biomathematics, University of California, Los Angeles, CA
- code: i
address: Department of Human Genetics, University of California, Los Angeles, CA
lead_author_surname: Schuemie et al.
# Place DOI URL or CRAN Package URL here
# doi: "https://cran.r-project.org/package=YourPackage"
# Abstract
abstract: "`r params$abstract`"
# Optional: Acknowledgements
# acknowledgements: |
# This template package builds upon, and extends, the work of the excellent
# [rticles](https://cran.r-project.org/package=rticles) package, and both packages rely on the
# [PNAS LaTeX](http://www.pnas.org/site/authors/latex.xhtml) macros. Both these sources are
# gratefully acknowledged as this work would not have been possible without them. Our extensions
# are under the same respective licensing term
# ([GPL-3](https://www.gnu.org/licenses/gpl-3.0.en.html) and
# [LPPL (>= 1.3)](https://www.latex-project.org/lppl/)).
# Optional: One or more keywords
keywords:
- new-user cohort design
- comparative effectiveness
- drug safety
# Paper size for the document, values of letterpaper and a4paper
papersize: letter
# Font size of the document, values of 9pt (default), 10pt, 11pt and 12pt
fontsize: 9pt
# Optional: Force one-column layout, default is two-column
#one_column: true
# Optional: Enables lineno mode, but only if one_column mode is also true
#lineno: true
# Optional: Enable one-sided layout, default is two-sided
#one_sided: true
# Optional: Enable section numbering, default is unnumbered
#numbersections: true
# Optional: Specify the depth of section number, default is 5
#secnumdepth: 5
# Optional: Skip inserting final break between acknowledgements, default is false
skip_final_break: true
# Optional: Bibliography
# bibliography: pinp
# Optional: Enable a 'Draft' watermark on the document
watermark: false
# Customize footer, eg by referencing the vignette
footer_contents: OHDSI version 1.0
# Produce a pinp document
output:
pdf_document:
template: blank_template.tex
citation_package: natbib
pnas_type: pnasresearcharticle
bibliography: ohdsi.bib
#csl: pnas.csl
#output: rticles::pnas_article
author_declaration: MJS and PBR are employees and share-holders of Janssen Research. MAS receives contract support from Janssen Research.
# Required: Vignette metadata for inclusion in a package.
vignette: >
%\VignetteIndexEntry{YourPackage-vignetteentry}
%\VignetteKeywords{YourPackage, r, anotherkeyword}
%\VignettePackage{YourPackage}
%\VignetteEngine{knitr::rmarkdown}
---
```{r, setup, echo=FALSE, message=FALSE,comment=FALSE,warning=FALSE}
library(DatabaseConnector)
library(knitr)
library(xtable)
library(ggplot2)
# library(kableExtra)
source("DataPulls.R")
source("PlotsAndTables.R")
options(knitr.kable.NA = '')
extraCriteria <- list(
depression = "Further, we exclude patients with diagnoses of bipolar disorder or schizophrenia on or prior to their index date."
)
extraCovariates <- list(
depression = "Prior number of depression treatments (1, 2, 3, 4, 5 or higher)"
)
negativeControls <- list(
hypertension = 76
)
totalOutcomes <- list(
hypertension = 57
)
totalExposures <- list(
hypertension = "56 different anti-hypertensive medications, representing 15 different drug-class levels and 7 major drug class levels",
stuff = "at least 2500 patients in both target and comparator cohors"
)
abbr <- read.table(header = TRUE, sep = ",", stringsAsFactors = FALSE, text = "
name,shortName
ACE inhibitors,ACEIs
Aldosterone antagonist diuretics,AADs
Alpha-1 blockers,A1Bs
Angiotensin receptor blockers (ARBs),ARBs
Beta blockers - cardioselective,cBBs
Beta blockers - cardioselective and vasodilatory,cvBBs
Beta blockers - combined alpha and beta receptor,aBBs
Beta blockers - intrinsic sympathomimetic activity,sBBs
Beta blockers - noncardioselective,nBBs
Dihydropyridine calcium channel blockers (dCCB),dCCBs
Direct vasodilators,DVs
Loop diuretics,LDs
Nodihydropyridine calcium channel blockers (ndCCB),nCCBs
Potassium sparing diurectics,KSDs
Thiazide or thiazide-like diuretics,TZs")
abbreviations <- abbr$shortName
names(abbreviations) <- abbr$name
```
```{r, loadData, echo=FALSE, message=FALSE, comment=FALSE, warning=FALSE, results='hide'}
if (is.null(params$load)) {
connectionDetails <- createConnectionDetails(dbms = "postgresql",
server = paste(Sys.getenv("shinydbServer"),
Sys.getenv("shinydbDatabase"),
sep = "/"),
port = Sys.getenv("shinydbPort"),
user = Sys.getenv("shinydbUser"),
password = Sys.getenv("shinydbPw"),
schema = Sys.getenv("shinydbSchema"))
connection <- connect(connectionDetails)
targetName <- getExposureName(connection = connection, exposureId = params$targetId)
comparatorName <- getExposureName(connection = connection, exposureId = params$comparatorId)
outcomeName <- getOutcomeName(connection = connection, outcomeId = params$outcomeId)
analyses <- getAnalyses(connection = connection)
databaseDetails <- getDatabaseDetails(connection = connection,
databaseId = params$databaseId)
studyPeriod <- getStudyPeriod(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
databaseId = params$databaseId)
mainResults <- getMainResults(connection = connection,
targetIds = params$targetId,
comparatorIds = params$comparatorId,
outcomeIds = params$outcomeId,
databaseIds = params$databaseId,
analysisIds = c(1, 2, 3, 4))
controlResults <- getControlResults(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
analysisId = 1,
databaseId = params$databaseId)
attrition <- getAttrition(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
outcomeId = params$outcomeId,
analysisId = 1,
databaseId = params$databaseId)
followUpDist <- getCmFollowUpDist(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
outcomeId = params$outcomeId,
databaseId = params$databaseId,
analysisId = 1)
balance <- getCovariateBalance(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
databaseId = params$databaseId,
analysisId = 2)
popCharacteristics <- getCovariateBalance(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
databaseId = params$databaseId,
analysisId = 1,
outcomeId = params$outcomeId)
ps <- getPs(connection = connection,
targetIds = params$targetId,
comparatorIds = params$comparatorId,
databaseId = params$databaseId)
kaplanMeier <- getKaplanMeier(connection = connection,
targetId = params$targetId,
comparatorId = params$comparatorId,
outcomeId = params$outcomeId,
databaseId = params$databaseId,
analysisId = 2)
DatabaseConnector::disconnect(connection)
originalTargetName <- targetName
originalComparatorName <- comparatorName
if (params$targetId < 5000) {
names <- unlist(strsplit(targetName, " & "))
newName <- paste(
sapply(names, function(name) {
if (name %in% names(abbreviations)) {
name <- abbreviations[name]
}
name
}), collapse = " & ")
targetName <- newName
}
if (params$comparatorId < 5000) {
names <- unlist(strsplit(comparatorName, " & "))
newName <- paste(
sapply(names, function(name) {
if (name %in% names(abbreviations)) {
name <- abbreviations[name]
}
name
}), collapse = " & ")
comparatorName <- newName
}
if (!is.null(params$save)) {
save(targetName, comparatorName, outcomeName, analyses, databaseDetails,
studyPeriod, mainResults, controlResults,
attrition, followUpDist, balance, popCharacteristics, ps, kaplanMeier,
file = params$save)
}
if (!is.null(params$save)) {
save(targetName, comparatorName, outcomeName, analyses, databaseDetails,
studyPeriod, mainResults, controlResults,
attrition, followUpDist, balance, popCharacteristics, ps, kaplanMeier,
originalTargetName, originalComparatorName,
file = params$save)
}
} else {
load(params$load)
}
targetName <- uncapitalize(targetName)
comparatorName <- uncapitalize(comparatorName)
outcomeName <- uncapitalize(outcomeName)
indicationName <- uncapitalize(params$indicationId)
databaseName <- databaseDetails$databaseName
minYear <- substr(studyPeriod$minDate, 1, 4)
maxYear <- substr(studyPeriod$maxDate, 1, 4)
coverage <- getCoverage(controlResults)
```
\dropcap{T}he Large-scale Evidence Generation and Evaluation in a Network of Databases (LEGEND) project aims to generate reliable evidence on the effects of medical interventions using observational healthcare data to support clinical decision making.
LEGEND follows ten guiding principles (see [Supporting Information](#suppinfo)); chief among these stand that we generate evidence at large-scale to achieve completeness and faciliate analysis of the overall distribution of effect size estimates across treatments and outcomes.
We also generate evidence consistently by applying a systematic approach across all research questions and disseminate evidence regardless on the estimates effects to avoid publication bias. These aims help overcome the questionable reliable of observational research \citep{schuemie2018improving}.
This LEGEND document reports the risk of `r outcomeName` between new users of `r targetName` and `r comparatorName` treated for `r params$indicationId`.
Worldwide, hypertension stands as a leading cause of mortality, with an increasing prevalence over the last two decades \citep{forouzanfar2017global}.
The 2017 American College of Cardiology (ACC) / American Heart Association (AHA) clinical practice guidelines define hypertension based on averaged blood pressure (BP) measured in a healthcare setting;
systolic BP between 130 - 139 mmHg or diastolic BP between 80 - 89 mmHg characterize stage 1 hypertension, and systolic BP $>$ 140 mmHg or diastolic BP $>$ 90 mmHg mark stage 2 hypertension \citep{whelton20182017}.
Elevated BPs contribute to approximately half of all stroke and ischemic heart disease deaths and 13\% of all forms of deaths globally \citep{who2009global}.
While antihypertensive therapies carry well-established benefits in reducing BP and the risk of major cardiovascular events, the health benefits and drug safety concerns of any one class of antihypertensive drugs relative to other classes as first-line therapy remains debatable.
Part of this debate arises from a paucity of large randomized controlled trials and observational studies providing head-to-head comparisons between all pairs of individual drugs and drug classes.
One notable counter-example is the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT) \citep{allhat2002major} that randomized 33,357 patients to receive either
amlodipine (a dihydropyridine calcium channel blocker),
chlorthalidone (a thiazide or thiazide-like diuertic) or
lisinopril (an angiotensin converting enzyme inhibitor).
However, ALLHAT employed only a single drug representative per class and did not include several other important antihypertensive classes, such as angiotensin receptor blockers and beta-blockers.
Reboussin et al.~provide a systematic review of randomized controlled trials examining comparative benefits and harms of various antihypertensives as first-line therapy \citep{reboussin2017systematic}.
Further observational study can help refine first-line therapy recommendations in terms of both treatment effecacy and relative drug safety, such as those from the 2017 ACC/AHA guidelines and the 2018 European Society of Cardiology (ESC) and European Society of Hypertension (ESH) Guidelines for the management of arterial hypertension\citep{williams20182018}.
# Methods
## Data source
We conduct a new-user cohort study comparing new users of `r targetName` with new users of `r comparatorName` in the `r databaseName` (`r params$databaseId`) database encoded in the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) version 5
\citep{hripcsak2015observational,overhage2012validation,ryan2013empirical}.
`r sub(paste0(".*\\(",databaseDetails$databaseId,"\\)"), paste0("The ", databaseDetails$databaseId), databaseDetails$description)`
The study period spans from `r studyPeriod$minDate` to `r studyPeriod$maxDate`.
## Study design
This study follows a retrospective, observational, comparative cohort design \citep{ryan2013empirical}
.
We include patients who are first time users of `r targetName` or `r comparatorName`, and who have a diagnosis of `r indicationName` on or prior to treatment initation.
We require that patients have continuous observation in the database for at least 365 days prior to treatment initiation.
We exclude patients with prior `r outcomeName` events and less than 1 day at risk. `r if (indicationName %in% names(extraCriteria)) { extraCriteria[indicationName] }` Links to full cohort details, include concept codes, are provided in the [Supporting Information](#suppinfo).
The outcome of interest is `r outcomeName`.
We begin the outcome risk window 1 day after treatment initation and consider two design choices to define the window end.
First, we end the outcome time-at-risk window at first cessation of continuous drug exposure, analogous to an on-treatment design and, second, we end the outcome time-at-risk window when the patient is no longer observable in the database, analogous to an intent-to-treat design.
Continuous drug exposures are constructed from the available longitudinal data by considering sequential prescriptions that have fewer than 30 days gap between prescriptions.
## Statistical analysis
We conduct our cohort study using the open-source OHDSI CohortMethod R package \citep{schuemie2018cohortmethod}
, with large-scale analytics achieved through the Cyclops R package \citep{suchard2013massive}
.
We use propensity scores (PSs) -- estimates of treatment exposure probability conditional on pre-treatment baseline features in the one year prior to treatment initiation -- to control for potential measured confoudning and improve balance between the target (`r targetName`) and comparator (`r comparatorName`) cohorts \citep{rosenbaum1983central}
.
We use an expansive PS model that includes all available patient demographics, drug, condition and procedure covariates generated through the FeatureExtraction R package \citep{schuemie2018featureextration}
instead of a prespecified set of investigator-selected confounders.
We perform PS stratification or variable-ratio matching and then estimate comparative `r targetName`-vs-`r comparatorName` hazard ratios (HRs) using a Cox proportional hazards model.
Detailed covariate and methods informations are provided in the
[Supporting Information](#suppinfo).
We present PS and covariate balance metrics to assess successful confounding control, and provide HR estimates and Kaplan-Meier survival plots for the outcome of `r outcomeName`.
Residual study bias from unmeasured and systematic sources can exist in observational studies after controlling for measured confounding \citep{schuemie2014interpreting,schuemie2016robust}
.
To estimate such residual bias, we conduct negative control outcome experiments with `r length(unique(controlResults$outcomeName))` negative control outcomes
identified through a data-rich algorithm \citep{voss2017accuracy}.
We fit the negative control estimates to an empirical null distribution that characterizes the study residual bias and is an important artifact from which to assess the study design \citep{schuemie2018improving}
.
Using the empirical null distribution and synthetic positive controls \citep{schuemie2018empirical}
, we additionally calibrate all HR estimates, their 95\% confidence intervals (CIs) and the $p$-value to reject the null hypothesis of no differential effect (HR = 1).
Empirical calibration serves as an important diagnostic tool to evaluate if residual systematic error is sufficient to cast doubt on the accuracy of the unknown effect estimate.
# Results
## Population characteristics
```{r, attrition_plot, echo=FALSE, cache=FALSE, warning=FALSE, fig.align="center", fig.width=6, fig.height=10, out.width = '90%', fig.cap=paste("\\textbf{Attrition diagram for selecting new-users of", targetName, "and", comparatorName, "from the", params$databaseId, "database.\\label{fig:attrition}}")}
drawAttritionDiagram(attrition, targetName, comparatorName)
```
Figure \ref{fig:attrition} diagrams the inclusion of study subjects from the `r params$databaseId` database under the on-treatment with stratification design.
We augment these counts with cohort sizes we identify for the remaining designs in Table \ref{tab:power}.
This table also reports total patient follow-up time, numbers of `r outcomeName` events these patients experience and unadjusted incidence rates.
Table \ref{tab:demographics} compares base-line characteristics between patient cohorts.
\begin{table*}
\caption{\textbf{Patient cohorts.}
Target (T) cohort is `r targetName` new-users. Comparative (C) cohort is `r comparatorName` new-users.
We report total number of patients, follow-up time (in years), number of `r outcomeName` events, and event incidence rate (IR) per 1,000 patient years (PY) in patient cohorts, as well as the their minimum detectable relative risk (MDRR).
Note that the IR does not account for any stratification or matching.
}\label{tab:power}
\vspace*{-0.5em}
\centering{
\rowcolors{2}{gray!6}{white}
\begin{tabular}{lrrrrrrrrr}
\hiderowcolors
\toprule
&
\multicolumn{2}{c}{Patients} &
\multicolumn{2}{c}{PYs} &
\multicolumn{2}{c}{Events} &
\multicolumn{2}{c}{IR} &
\\
\cmidrule(lr){2-3} \cmidrule(lr){4-5} \cmidrule(lr){6-7} \cmidrule(lr){8-9}
\multicolumn{1}{c}{Design} &
\multicolumn{1}{c}{T} & \multicolumn{1}{c}{C} &
\multicolumn{1}{c}{T} & \multicolumn{1}{c}{C} &
\multicolumn{1}{c}{T} & \multicolumn{1}{c}{C} &
\multicolumn{1}{c}{T} & \multicolumn{1}{c}{C} &
\multicolumn{1}{c}{MDRR} \\
\midrule
\showrowcolors
```{r, outcomes, echo=FALSE, results="asis", cache=FALSE, warning=FALSE}
table <- preparePowerTable(mainResults, analyses)
print(xtable(table, format = "latex"),
include.rownames = FALSE,
include.colnames = FALSE,
hline.after = NULL,
only.contents = TRUE,
add.to.row = list(pos = list(nrow(table)), command = c("\\bottomrule")),
sanitize.text.function = identity)
```
\end{tabular}
}
\end{table*}
\begin{table}
\caption{\textbf{Patient demographics.} We report the standardized difference of population means (StdDiff) before and after stratification for selected base-line patient characteristics.}\label{tab:demographics}
\vspace*{-0.5em}
\centerline{
\resizebox{0.5\textwidth}{!}{
\rowcolors{2}{gray!6}{white}
\begin{tabular}{lrrrrrr}
\hiderowcolors
\toprule
& \multicolumn{3}{c}{Before stratification}
& \multicolumn{3}{c}{After stratification} \\
\cmidrule(lr){2-4} \cmidrule(lr){5-7}
\multicolumn{1}{c}{Characteristic}
& \multicolumn{1}{c}{T (\%)}
& \multicolumn{1}{c}{C (\%)}
& \multicolumn{1}{c}{StdDiff}
& \multicolumn{1}{c}{T (\%)}
& \multicolumn{1}{c}{C (\%)}
& \multicolumn{1}{c}{StdDiff} \\
\midrule
\showrowcolors
```{r, features, echo=FALSE, results="asis", cache=FALSE, warning=FALSE}
table <- prepareTable1(balance, pathToCsv = file.path(params$root, "Table1Specs.csv"))
table <- table[3:nrow(table),]
print(xtable(table, format = "latex", align = c("l","l","r","r","r","r","r","r")),
include.rownames = FALSE,
include.colnames = FALSE,
hline.after = NULL,
only.contents = TRUE,
add.to.row = list(pos = list(nrow(table)), command = c("\\bottomrule")),
sanitize.text.function = identity)
```
\end{tabular}
}
}
\end{table}
## Patient characteristics balance
```{r, make_ps, echo=FALSE, warning=FALSE, fig.align='center', fig.width=5, fig.height=5, out.width='70%', fig.cap=paste("\\textbf{Preference score distribution for", targetName, "and", comparatorName, "new-users.}", "The preference score is a transformation of the propensity score that adjusts for prevalence differences between populations. A higher overlap indicates that subjects in the two populations are more similar in terms of their predicted probability of receiving one treatment over the other.\\label{fig:ps}")}
plotPs(ps, targetName, comparatorName)
```
Figure \ref{fig:ps} plots the preference score distributions, re-scalings of PS estimates to adjust for differential treatment prevalences, for patients treated with `r targetName` and `r comparatorName`.
We assess characteristics balance achieved through PS adjustment by comparing all characteristics' standardized difference (StdDiff) between treatment group means before and after PS trimming and stratification (Table \ref{tab:demographics}).
Figure \ref{fig:balance} plots StdDiff for all `r nrow(balance)` base-line patient features that serve as input for the PS model.
Before stratification, `r sum(na.omit(balance$beforeMatchingStdDiff) > 0.1)` features have a StdDiff $> 0.1$. After stratification, the count is `r sum(na.omit(balance$afterMatchingStdDiff) > 0.1)`.
<!-- (TODO) Add judgement statement about measured confounding control using something like: `r judgePropensityScore(1.1,1.1)` -->
```{r, balance, echo=FALSE, warning=FALSE, warning=FALSE, fig.align='center', fig.width=5, fig.height=5, out.width='70%', fig.cap="\\textbf{Patient characteristics balance before and after stratification.} As a rule-of-thum, all values $<0.1$ is generall considered well-balance \\citep{austin2009balance}."}
plotCovariateBalanceScatterPlot(balance,
beforeLabel = "Before stratification",
afterLabel = "After stratification")
```
## Outcome assessment
\begin{table*}
\caption{Time-at-risk distributions as percentiles in the target and comparator cohorts after stratification.}
\label{tab:fu}
\centering{
\rowcolors{2}{gray!6}{white}
\begin{tabular}{crrrrrrr}
\hiderowcolors
\toprule
&
\multicolumn{1}{c}{min} &
\multicolumn{1}{c}{10\%} &
\multicolumn{1}{c}{25\%} &
\multicolumn{1}{c}{50\%} &
\multicolumn{1}{c}{75\%} &
\multicolumn{1}{c}{90\%} &
\multicolumn{1}{c}{max} \\
\midrule
\showrowcolors
```{r, fu, echo=FALSE, results='asis', warning=FALSE}
table <- prepareFollowUpDistTable(followUpDist)
table$Cohort <- c(targetName, comparatorName)
print(xtable(table, format = "latex"),
include.rownames = FALSE,
include.colnames = FALSE,
hline.after = NULL,
only.contents = TRUE,
add.to.row = list(pos = list(nrow(table)), command = c("\\bottomrule")),
sanitize.text.function = identity)
```
\end{tabular}
}
\end{table*}
Table \ref{tab:fu} details the time to first `r outcomeName` or censoring distributions for patients in the `r targetName` and `r comparatorName` cohorts.
We report in Table \ref{tab:hr} estimated HRs comparing `r targetName` to `r comparatorName` for the on-treatment and intent-to-treat designs with stratification or matching.
Figure \ref{fig:km} plots Kaplan-Meier survival curves for patients under the intent-to-treat design.
```{r, km, echo=FALSE, results='hide', fig.width=6, fig.height=6, fig.align='center', out.width='80%', fig.cap=paste0("\\textbf{Kaplan Meier plot of ", outcomeName, "-free survival.} This plot is adjusted for the propensity score stratification; the ", targetName, " curve shows the actual observed survival. The ", comparatorName, " curve applies reweighting to approximate the counterfactual of what ", targetName, " survival would look like had the ", targetName, " cohort been exposed to ", comparatorName, " instead. The shaded area denotes the 95\\% CI.")}
plotKm <- plotKaplanMeier(kaplanMeier, targetName, comparatorName)
```
\begin{table*}
\caption{
Hazard ratio (HR) estimates and their confidence intervals (CIs) and $p$-value to reject the null hypothesis of no difference (HR = 1) under various designs.
}
\label{tab:hr}
\centering{
\rowcolors{2}{gray!6}{white}
\begin{tabular}{lrrrr}
\hiderowcolors
\toprule
& \multicolumn{2}{c}{Uncalibrated} & \multicolumn{2}{c}{Calibrated} \\
\cmidrule(lr){2-3} \cmidrule(lr){4-5}
\multicolumn{1}{c}{Design}
& \multicolumn{1}{c}{HR (95\% CI)} & \multicolumn{1}{c}{$p$}
& \multicolumn{1}{c}{HR (95\% CI)} & \multicolumn{1}{c}{$p$} \\
\midrule
\showrowcolors
```{r, result_table, echo=FALSE, results='asis'}
table <- mainResults
table$hr <- sprintf("%.2f (%.2f - %.2f)", mainResults$rr, mainResults$ci95Lb, mainResults$ci95Ub)
table$p <- sprintf("%.2f", table$p)
table$calHr <- sprintf("%.2f (%.2f - %.2f)", mainResults$calibratedRr, mainResults$calibratedCi95Lb, mainResults$calibratedCi95Ub)
table$calibratedP <- sprintf("%.2f", table$calibratedP)
table <- merge(table, analyses)
table <- table[, c("description", "hr", "p", "calHr", "calibratedP")]
print(xtable(table),
include.rownames = FALSE,
include.colnames = FALSE,
hline.after = NULL,
only.contents = TRUE,
add.to.row = list(pos = list(nrow(table)), command = c("\\bottomrule")),
sanitize.text.function = identity)
```
\end{tabular}
}
\end{table*}
## Residual systematic error
In the absense of bias, we expect 95\% of negative and positive control estimate 95\% confidence intervals to include their presumed HR. In the case of negative controls, the presumed HR = 1. Figure \ref{fig:negatives} describes the negative and positive control estimates under the on-treatment with PS stratification design.
Before calibration, negative and positive controls demonstrate `r judgeCoverage(coverage[coverage$group == "Uncalibrated", "coverage"])` coverage. After calibration, controls demonstrate `r judgeCoverage(coverage[coverage$group == "Calibrated", "coverage"])` coverage.
```{r, make_error, echo=FALSE, warning=FALSE}
plot <- plotScatter(controlResults)
suppressMessages(ggsave(paste0(opts_chunk$get("fig.path"), "error.pdf"), plot,
width = 14, height = 4, units = "in"))
```
\begin{figure*}
\centerline{
\includegraphics[width=1.0\textwidth]{`r opts_chunk$get("fig.path")`error}
}
\caption{
\textbf{Evaluation of effect estimation between `r targetName` and `r comparatorName` new-users}. The top plots HRs and their corresponding standard errors before calibration for each negative and synthetic positive control. The bottom plots the same estimates after calibration.
}
\label{fig:negatives}
\end{figure*}
# Conclusions
We find that `r targetName` has a `r judgeHazardRatio(mainResults[params$primary,"calibratedCi95Lb"], mainResults[params$primary,"calibratedCi95Ub"])` risk of `r outcomeName` as compared to `r comparatorName` within the population that the `r params$databaseId` represents.
# Supporting Information {#suppinfo}
Here we enumerate the guiding principles of LEGEND and provide linking details on study cohorts and design.
## LEGEND principles
\begin{enumerate}[noitemsep]
\item Evidence will be generated at large-scale.
\item Dissemination of the evidence will not depend on the estimated effects.
\item Evidence will be generated using a pre-specified analysis design.
\item Evidence will be generated by consistently applying a systematic approach across all research questions.
\item Evidence will be generated using best-practices.
\item Evidence generation process will be empirically evaluated by including control research questions where the true effect size is known.
\item Evidence will be generated using open source software that is freely available to all.
\item LEGEND will not be used to evaluate methods.
\item Evidence will be generated in a network of heterogeneous databases.
\item No patient-level data will be shared between sites in the network, only aggregated data.
\end{enumerate}
## Study cohorts
Please see the LEGEND `r indicationName` Study protocol (https://github.com/OHDSI/Legend/tree/master/Documents) for complete specification of the `r targetName`, `r comparatorName` and `r outcomeName` cohorts using ATLAS (http://www.ohdsi.org/web/atlas).
## Negative controls
We selected negative controls using a process similar to that outlined by \cite{voss2017accuracy}.
We first construct a list of all conditions that satisfy the following criteria with respect to all drug exposures in the LEGEND `r indicationName` study:
\begin{itemize}[noitemsep]
\item No Medline abstract where the MeSH terms suggests a drug-condition association \citep{winnenburg2015leveraging},
\item No mention of the drug-condition pair on a US product label in the ``Adverse Drug Reactions'' or ``Postmarketing'' section \citep{duke2013consistency},
\item No US spontaneous reports suggesting that the pair is in an adverse event relationship \citep{evans2001use,banda2016curated},
\item OMOP vocabulary does not suggest that the drug is indicated for the condition,
\item Vocabulary conditional concepts are usable (i.e., not too broad, not suggestive of an adverse event relationship, no pregnancy related), and
\item Exact condition concept itself is used in patient level data.
\end{itemize}
```{r, echo=FALSE}
negatives <- controlResults[controlResults$effectSize == 1.0 & !is.na(controlResults$rr), ]
```
We optimize remaining condition concepts, such that parent concepts remove children as defined by the OMOP vocabulary and perform manual review to exclude any pairs that may still be in a causal relationship or too similar to the study outcome.
For `r indicationName`, this process led to a candidate list of `r negativeControls[indicationName]` negative controls for which table can be found in study protocol (https://github.com/OHDSI/Legend/tree/master/Documents).
In the comparison of `r targetName` and `r comparatorName` in the `r params$databaseId` database, `r nrow(negatives)` negative controls had sufficient outcomes to return estimable HRs. We list these conditions in Table \ref{tab:negatives}
\begin{table}
\caption{Negative controls employed in the comparison of `r targetName` and `r comparatorName` in the `r params$databaseId` database.}
\label{tab:negatives}
\centering{
\rowcolors{2}{gray!6}{white}
\begin{tabular}{l}
\hiderowcolors
\toprule
\multicolumn{1}{c}{Condition} \\
\midrule
\showrowcolors
```{r, echo=FALSE, results='asis'}
table <- data.frame(outcomeName = sort(negatives[,"outcomeName"]))
print(xtable(table),
include.rownames = FALSE,
include.colnames = FALSE,
hline.after = NULL,
only.contents = TRUE,
add.to.row = list(pos = list(nrow(table)), command = c("\\bottomrule")),
sanitize.text.function = identity)
```
\end{tabular}
}
\end{table}
## Covariate sets
\begin{itemize}[noitemsep]
\item Demographics (age in 5-year bands, gender, index year, index month)
\item Conditions (condition occurrence in lookback window)
\begin{itemize}[noitemsep]
\item in 365 days prior to index date
\item in 30 days prior to index date
\end{itemize}
\item Condition aggregation
\begin{itemize}[noitemsep]
\item SMOMED
\end{itemize}
\item Drugs (drug occurrence in lookback window)
\begin{itemize}[noitemsep]
\item in 365 days prior to index date
\item in 30 days prior to index date
\end{itemize}
\item Drug aggregation
\begin{itemize}[noitemsep]
\item Ingredient
\item ATC class
\end{itemize}
\item Risk Scores (Charlson comorbidity index)
`r if (indicationName %in% names(extraCovariates)) { paste("\\item", extraCovariates[indicationName]) }`
\end{itemize}
We exclude all covariates that occur in fewer than 0.1\% of patients within the target and comparator cohorts prior to model fitting for computational efficiency.