---
title: "Proportion tests"
---

## Two-sample proportion test

For sample size estimation based on power, we need the following:

- proportion in one group
- proportion in other group
- power of test
- significance level of test
- direction of test: one or two-sided

For power estimation based on sample size, we need the following:

- proportion in one group
- proportion in other group
- sample size in each group
- significance level of test
- direction of test: one or two-sided


### Example: sample size

We wish to plan an experiment to test if there is a difference in the proportion of male and female college undergraduate students who floss daily. Our null hypothesis is no difference in the proportion that answer yes. We want to sample enough students to detect a difference of at least 5 percentage points (0.05). Assume the following:

- proportion of one group is 0.30
- proportion of the other group is 0.25
- power of 0.9
- significance level of 0.05
- two-sided test

Using base R [@R]:

```{r}
power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05, power = 0.9,
                alternative = "two.sided")
```

We need to observe 1674 students in each group.
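As a rough check on where this number comes from, the sketch below applies the standard normal-approximation formula for comparing two proportions, using a pooled variance under the null hypothesis and an unpooled variance under the alternative. The variable names are ours, and this is an illustration of the usual textbook formula rather than a replacement for `power.prop.test()`.

```{r}
# Inputs from the example above
p1 <- 0.30
p2 <- 0.25
alpha <- 0.05
power <- 0.90

# Normal quantiles for a two-sided test and for the target power
z_alpha <- qnorm(1 - alpha / 2)
z_beta <- qnorm(power)

# Standard deviation terms: pooled (null) and unpooled (alternative)
p_bar <- (p1 + p2) / 2
sd_null <- sqrt(2 * p_bar * (1 - p_bar))
sd_alt <- sqrt(p1 * (1 - p1) + p2 * (1 - p2))

# Approximate sample size per group, rounded up to a whole subject
n_approx <- ((z_alpha * sd_null + z_beta * sd_alt) / (p1 - p2))^2
ceiling(n_approx)
```

Rounding up is the conservative choice, since rounding down would leave the power slightly below the target.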

Notice the sample size decreases if we assume proportions closer to 0 or 1, even though the difference between them is the same. For example, assume 0.05 in one group and 0.10 in the other.

```{r}
power.prop.test(p1 = 0.05, p2 = 0.10, sig.level = 0.05, power = 0.9,
                alternative = "two.sided")
```

Now we need to observe 582 in each group.

Using the pwr package [@pwr], we need to express the difference in proportions as an effect size using `ES.h()`:

```{r}
library(pwr)
pwr.2p.test(h = ES.h(p1 = 0.05, p2 = 0.10), sig.level = 0.05, power = 0.9,
            alternative = "two.sided")
```

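We need to observe 568 in each group. The result does not match the base R function because the two functions measure the effect differently: `pwr.2p.test()` uses the arcsine-transformed effect size returned by `ES.h()` (Cohen's h), whereas `power.prop.test()` works directly with the difference in proportions.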
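To make the effect size concrete, here is a minimal base R sketch that computes Cohen's h by hand as the difference of arcsine-transformed proportions, then plugs it into the usual normal-approximation formula for two equal groups. The variable names are ours, and the formula ignores the negligible contribution of the opposite tail, so treat it as an approximation of what pwr reports.

```{r}
p1 <- 0.05
p2 <- 0.10

# Cohen's h: difference of arcsine-transformed proportions
h <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# Normal-approximation sample size per group for a two-sided test
z_alpha <- qnorm(1 - 0.05 / 2)
z_beta <- qnorm(0.90)
n_h <- 2 * ((z_alpha + z_beta) / h)^2

c(h = h, n = ceiling(n_h))
```

The sign of h depends only on the order of the two proportions, and it drops out of the sample size because h is squared.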

### Example: power

We wish to plan an experiment to test if there is a difference in the proportion of male and female college undergraduate students who floss daily. Our null hypothesis is no difference in the proportion that answer yes. We want to detect a difference of at least 5 percentage points (0.05). What is the power of our experiment if we know in advance we will be able to sample 300 males and 300 females? Assume the following:

- proportion of one group is 0.30
- proportion of the other group is 0.25
- sample size per group of 300
- significance level of 0.05
- two-sided test

Using base R [@R]:

```{r}
power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05, n = 300,
                alternative = "two.sided")
```

The power of this test will only be about 0.28 if our assumptions are true.

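The `pwr.2p.test()` function from the pwr package returns nearly the same answer (about 0.28). The order of the two proportions in `ES.h()` does not matter for a two-sided test.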

```{r}
pwr.2p.test(h = ES.h(p1 = 0.25, p2 = 0.30), sig.level = 0.05, n = 300,
            alternative = "two.sided")
```
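When the available sample size is uncertain, it can help to see how power changes across several candidate values. The sketch below reuses `power.prop.test()` under the same assumptions. The grid of sample sizes is hypothetical and chosen only for illustration.

```{r}
# Candidate per-group sample sizes (hypothetical values for illustration)
n_grid <- c(300, 600, 1000, 1500, 2000)

# Power at each sample size under the same assumptions as above
power_grid <- sapply(n_grid, function(n) {
  power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05, n = n,
                  alternative = "two.sided")$power
})

data.frame(n = n_grid, power = round(power_grid, 2))
```

Power rises with sample size and exceeds 0.9 only once each group is larger than the 1674 found in the sample size example.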


## Difference in proportions

For sample size estimation based on precision, we need the following:

- proportion in one group
- proportion in other group
- desired width of confidence interval

### Example

We wish to plan an experiment to estimate the difference in the proportion of male and female college undergraduate students who floss daily. If there truly is a difference of 0.05 in the population, we would like to estimate it within 0.025. That implies a confidence interval with a width of 0.05. Assume the following:

- proportion of one group is 0.30
- proportion of the other group is 0.25
- confidence interval width of 0.05

Using the `prec_riskdiff()` function from the presize package [@presize], we can calculate this as follows. (The Newcombe method is the default method.)


```{r}
library(presize)
prec_riskdiff(p1 = 0.30, p2 = 0.25, conf.width = 0.05, method = "newcombe")
```

To estimate a difference in proportions with this precision, assuming the group proportions are 0.30 and 0.25, we need to sample 2442 subjects in each group.
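For intuition about the size of this number, the sketch below solves the simpler Wald interval for the per-group sample size. This is not the Newcombe method used above, so the result differs slightly. It assumes equal group sizes and a 95% confidence level, and the variable names are ours.

```{r}
p1 <- 0.30
p2 <- 0.25
conf_width <- 0.05
z <- qnorm(0.975)  # assumes a 95% confidence level

# Wald interval width = 2 * z * sqrt((p1*(1-p1) + p2*(1-p2)) / n); solve for n
n_wald <- (2 * z / conf_width)^2 * (p1 * (1 - p1) + p2 * (1 - p2))
ceiling(n_wald)
```

The Wald answer lands within a few subjects of the Newcombe result here. Because the width shrinks with the square root of n, halving the width requires roughly four times as many subjects.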

Once again, assumed proportions closer to 0 or 1 require smaller samples:

```{r}
prec_riskdiff(p1 = 0.05, p2 = 0.10, conf.width = 0.05, method = "newcombe")
```

Wider confidence intervals also require smaller sample sizes. For example, to estimate a difference in proportions within 0.05 (i.e., a confidence interval width of 0.1):

```{r}
prec_riskdiff(p1 = 0.05, p2 = 0.10, conf.width = 0.1, method = "newcombe")
```