-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path10_species_dominance.rmd
345 lines (279 loc) · 11.9 KB
/
10_species_dominance.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
---
title: Species dominance
bibliography: [reference/a972.bib]
---
```{r setup, include=FALSE}
knitr::knit_hooks$set(time_it = time_it)
knitr::opts_chunk$set(echo = TRUE, warning = TRUE, message = FALSE, time_it = TRUE, cache.extra = "wombat")
```
```{r 10-species-dominance-1}
library(tidyverse)
```
I'm interested in creating a measure of species dominance. I will select the 4 largest trees per plot and then determine what proportion of these are from each species. I will only consider and report on RW and DF. I could augment this by showing the average size of the dominant tree by species.
For this two function, the four highest ranking trees (height, growth, or diameter) were selected from each plot these were aggregated at the treatment level to determine the proportion of dominant trees in each species.
If I want to do hypothesis testing, I need to have two datasets, one of the raw data (counts of trees per plot) and one of the average proportion. I may want to do this with all of my metrics. The final aggregation may be able to be done with a single function, depending on how data were aggregated for each metric. Qmd is the only metric where the mean is not straight forward, as it is the square root of the mean of squared diameters.
For this to make sense for heights, we will need to predict heights for missing trees because the largest trees are missing heights in many cases.
Although I refer to proportions here, this is of 100 TPH, so multiplying by 100 gives TPH.
```{r 10-species-dominance-2}
spp_dom_raw <- function(data = d_l, metrics = NULL, years = NULL) {
if(is.null(years)) years <- c("08", "13", "18")
if(is.null(metrics)) metrics <- c("dbh", "ht_p", "ba_inc2")
raw <- data %>%
filter(
spp %in% c("SESE3", "PSMEM"),
live | status %in% c(15, 16),
year %in% years
) %>%
select(year, treatment, spp, plot, all_of(metrics)) %>%
pivot_longer(cols = all_of(metrics), names_to = "measure") %>%
group_by(measure, year, treatment, plot) %>%
slice_max(order_by = value, n = 8, with_ties = FALSE) %>%
ungroup()
missing <- raw$measure[raw$value == 0 | is.na(raw$value)]
if(length(missing) > 0) {
message <- "Measures missing values (entries removed):"
warning(paste(message, toString(unique(missing))))
}
filter(raw, value != 0)
}
spp_dom_cnt <- function(...) {
spp_dom_raw(...) %>%
count(measure, year, treatment, plot, spp) %>%
ungroup() %>%
complete(nesting(treatment, year, plot), spp, measure, fill = list(n = 0))
}
spp_dom_pct <- function(...) {
spp_dom_cnt(...) %>%
group_by(measure, treatment, year, spp) %>%
summarise(n = sum(n)) %>%
mutate(freq = n / sum(n)) %>%
arrange(measure, treatment, spp, year) %>%
ungroup()
}
```
## Dominance figure
```{r 10-species-dominance-3, fig.height=3.5, fig.width=3.5}
dominance_labels <- as_labeller(
c(
ba_inc2 = "BAI",
dbh = "DBH",
ht_p = "Height"
),
label_parsed
)
make_dominance_fig <- function(...) {
spp_dom_pct(...) %>%
relevel_treatment() %>%
mutate(
measure = factor(measure, levels = c("dbh", "ht_p", "ba_inc2")),
year = case_when(
year == 13 & measure == "ba_inc2" ~2010.5,
year == 18 & measure == "ba_inc2" ~2015.5,
year == "init" ~2003,
year == "08" ~2008,
year == "13" ~2013,
year == "18" ~2018
)
) %>%
ggplot(aes(year, freq, fill = spp)) +
geom_bar(position = "stack", stat = "identity", width = 4.5) +
facet_grid(
vars(measure),
vars(treatment),
labeller = labeller(measure = dominance_labels)
) +
theme_bw() +
theme(
panel.spacing.x = unit(1.3, "mm"),
legend.position = "bottom",
legend.title = element_blank(),
legend.key.size = unit(4, "mm"),
legend.box.spacing = element_blank(),
panel.grid = element_blank(),
# strip.background.y = element_blank(),
strip.placement = "outside",
axis.text.x = element_text(angle = 90, vjust = 0.25)
) +
# scale_x_continuous(breaks = yr_breaks) +
scale_y_continuous(breaks = c(0, .5, 1)) +
scale_x_continuous(
breaks = c(2003, 2008, 2013, 2018),
labels = c("Pre-trt", "2008", "2013", "2018")) +
scale_fill_manual("spp",
values = c(SESE3 = "black", PSMEM = "#a0a0a0"),
labels = c("Redwood", "Douglas-fir")
) +
labs(y = "Proportion", x = "Year")
}
make_dominance_fig(years = c("init", "08", "13", "18"))
ggsave(
filename = "figs/dominance_fig.pdf",
device = cairo_pdf,
width = 8.84,
height = 9,
units = "cm"
)
ggsave(
filename = "figs/dominance_fig.jpg",
width = 8.84,
height = 9,
units = "cm"
)
```
Now I need to display the average values for these dominant species, perhaps along with their standard deviations. I think I'll just do this for each treatment in the most recent measurement period.
```{r 10-species-dominance-4}
spp_dom_raw() %>%
group_by(year, treatment, measure) %>%
summarise(average = mean(value)) %>%
arrange(measure, treatment, year) %>%
pivot_wider(names_from = treatment, values_from = average) %>%
relocate(measure) %>%
kbl2(
caption = "Average metrics for top 100 TPH (ranked by the respective metric)",
digits = 2)
```
## Species dominance significance test
A Wilcoxon rank-sum test could be used to test if one treatment results in changes that are significantly greater than another treatment. Unfortunately, it seems like none of the treatments that show an average increase in the number of dominant SESE are significant at the alpha = 0.05 level, using a 1-sided test. One-sided permutation tests shows even larger p-values, and these I would expect to be more accurate given the number of zeros in the data.
To start, I will only test significance for dbh ranked dominance, to see if treatment effect is significant there. first I get a list of differences from 2008 to 2018 for each plot in a treatment, this represents the treatments effect on redwood dominance. Then I compare the treatments to see is ones with a larger average effect, are significantly larger than those with a smaller average effect: 1-sided non-parametric tests. Wilcox rank-sum has lower p-values, than the permutation test. The data has a lot of zeros, and very few data points, none of the one-sided differences are significant. It is too soon to tell if redwood dominance has been affected.
```{r 10-species-dominance-5}
# This is the permutation calculation
new_diff <- function(d_null) {
new_samp <- sample(d_null)
mean(new_samp[5:8]) - mean(new_samp[1:4])
}
perm_test <- function(x, y, M = 1e5) {
obs_diff <- mean(x) - mean(y)
d_null <- c(x, y)
perm_diff <- replicate(M, new_diff(d_null))
sum(perm_diff >= obs_diff) / M
}
# Takes one of my dominance metrics: dbh, ht, ba_inc2
non_para_dom_test <- function(metric) {
# only 2013 and 2018 available for basal area increment
if(metric == "ba_inc2") years <- c("13", "18") else years <- c("08", "18")
# summarize differences between 2018 and 2013 for each treatment
# (plot differences)
W1 <- spp_dom_cnt(metrics = metric, years = years) %>%
filter(spp == "SESE3") %>%
group_by(plot) %>%
mutate(diff = n - lag(n)) %>%
drop_na() %>%
ungroup() %>%
select(treatment, measure, diff)
W <- with(W1, split(diff, treatment))
# Get all combinations of treatments, where average change of treatment 1
# is greater than the average for treatment 2
combos <- W1 %>%
group_by(treatment, measure) %>%
nest_by() %>%
mutate(avg = map_dbl(data, mean)) %>%
expand_grid(a = ., b = .) %>%
mutate(diff = a$avg - b$avg) %>%
arrange(desc(diff)) %>%
filter(diff > 0) %>%
transmute(treat.a = a$treatment, treat.b = b$treatment)
print(paste0("Test for ", metric, ":"))
apply(combos, 1, function(x) {
wilout <- wilcox.test(W[[ x[1] ]], W[[ x[2] ]], alternative = "g")
permout <- perm_test(W[[ x[1] ]], W[[ x[2] ]])
list(test = paste(x[1], ">", x[2]), Wilcox.P = wilout$p.value, Perm.P = permout)
}) %>%
bind_rows()
}
```
```{r 10-species-dominance-6, warning=FALSE, cache=TRUE}
non_para_dom_test("dbh")
non_para_dom_test("ba_inc2")
non_para_dom_test("ht")
```
# Hui's dominance metric
I first saw this metric in @motzSamplingMeasuresTree2010, where it appears to have been missapplied. There, it was given as:
$$ U_i = \frac{1}{n} \sum^{n}_{j=1} 1 (DBH_i > DBH_j) $$
But they applied it to all trees, in which case, I assume that, leaving aside the issue to ties, the average would have to be 0.5.
Here I will calculate the species dominance metric separately for redwood and Douglas-fir components.
A stand composed of a single species should have an average Ui of 0.5
A species with more shade tolerance might have a lower Ui, because there are more trees in the understory, brining down average Ui.
Ui for redwood decreases in the control. This would suggest that the number of trees of other species out-competing (by diameter growth) redwoods is increasing, assuming the 4 nearest trees are in competition with any given tree. Because this metric is expressed as an average, it doesn't account for the density of the species.
```{r 10-species-dominance-7}
# Use live trees
dd <- d_l |>
filter(
live,
year != "init"
) |>
mutate(year = factor(year, ordered = FALSE)) |>
droplevels()
# calculate Ui for one observation, given observation number, dbh list and neighbors list
Ui_tree <- function(x, dbh, k_indices) {
focus_dbh <- dbh[x]
neighbors_dbh <- dbh[ k_indices[[x]] ]
mean(focus_dbh > neighbors_dbh)
}
# Ui for all observations
calc_Ui <- function(x, y, dbh) {
xy <- data.frame(x = x, y = y)
k_indices <- nearest(xy, k = 4)
sapply( 1:nrow(xy), Ui_tree, dbh, k_indices)
}
Ui_plot <- dd |>
group_by(treatment, year, plot) |>
mutate(val = calc_Ui(x, y, dbh)) |>
group_by(treatment, year, plot, spp) |>
summarize(val = mean(val)) |>
filter(spp %in% c("PSMEM", "SESE3"))
lmer(val ~ treatment * year * spp + (1 | plot), data = Ui_plot) |>
emmeans( ~ treatment + year + spp) |> as.data.frame() |>
relevel_treatment() |>
mutate(
spp = factor(spp, levels = c("SESE3", "PSMEM"))
) |>
ggplot(aes(year, emmean, color = spp, group = spp)) +
geom_line(size = 1, position = my_dodge) +
geom_point(position = my_dodge) +
geom_errorbar(
aes(ymin = emmean - SE, ymax = emmean + SE),
width = .75,
position = my_dodge
) +
theme_bw() +
facet_wrap( ~ treatment)
```
First, I'll show the average number of trees and standard deviation of each species in the top 100 TPH.
```{r 10-species-dominance-8}
dd |>
group_by(treatment, year, plot) |>
slice_max(n = 8, order_by = dbh, with_ties = TRUE) |>
group_by(treatment, year, spp) |>
count(plot) |>
summarize(avg_cnt = mean(n), sd = sd(n)) |>
filter(spp %in% c("SESE3", "PSMEM")) |>
color_groups(digits = 2, caption = "Average count and sd of number of DF or RW trees in the to 100 TPH per treatment/year")
```
Here I plot the average Ui of RW or DF trees that are in the top 100 TPH.
```{r 10-species-dominance-9}
Ui_plot2 <- dd |>
group_by(treatment, year, plot) |>
mutate(val = calc_Ui(x, y, dbh)) |>
slice_max(n = 8, order_by = dbh, with_ties = TRUE) |>
group_by(treatment, year, plot, spp) |>
summarize(val = mean(val)) |>
filter(spp %in% c("PSMEM", "SESE3"))
Ui_mod2 <- lmer(val ~ treatment * year * spp + (1 | plot), data = Ui_plot2)
emmeans(Ui_mod2, pairwise ~ spp | treatment + year)$contrasts
emmeans(Ui_mod2, ~ treatment + year + spp) |>
as.data.frame() |>
relevel_treatment() |>
mutate(
spp = factor(spp, levels = c("SESE3", "PSMEM"))
) |>
ggplot(aes(year, emmean, color = spp, group = spp)) +
geom_line(size = 1, position = my_dodge) +
geom_point(position = my_dodge) +
geom_errorbar(
aes(ymin = emmean - SE, ymax = emmean + SE),
width = .75,
position = my_dodge
) +
theme_bw() +
facet_wrap( ~ treatment)
```