-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathtmo_survey_analysis.Rmd
317 lines (266 loc) · 11.4 KB
/
tmo_survey_analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
---
title: "TMO Survery Results"
author: "Frank Bertsch"
date: "October 11, 2016"
output: html_document
---
This document outlines the results from the TMO (telemetry.mozilla.org) survey which went out to all of firefox-dev in late September. Our goal is to determine user satisfaction (where our users are internal to Mozilla), identify difficulties, and develop next steps for our platform.
```{r setup, include=FALSE}
library(dplyr)
library(ggplot2)
library(tidyr)
knitr::opts_chunk$set(echo = TRUE)
dat <- read.csv('tmo_survey_results.csv', stringsAsFactors = F) %>%
rename(has_used = Have.you.used.any.of.the.various.telemetry.mozilla.org.sites.at.some.point.during.the.last.6.months.,
user_type = Are.you.a.Mozilla.employee.or.a.volunteer.,
role = What.is.your.role.within.the.Mozilla.community.,
what_prevented = What.has.prevented.you.from.using.TMO.,
knows_help = Do.you.know.who.to.ask.for.help.or.where.the.documentation.for.TMO.lives.,
knows_help_2 = Do.you.know.who.to.ask.for.help.or.where.the.documentation.for.TMO.lives..1,
what_cause_to_use = What.would.make.you.more.likely.to.use.TMO.in.the.future.,
frequency = How.frequently.do.you.use.TMO.or.one.of.the.related.sites.,
products = What.features.of.TMO.have.you.used.,
questions = What.question.were.you.trying.to.answer.when.using.TMO.,
satisfied = Overall..were.you.satisfied.with.your.experience.,
timely_response = If.you.ran.into.problems..were.you.able.to.get.help.in.a.satisfactory.amount.of.time.,
like = What.do.you.like.about.about.TMO.,
dislike = What.do.you.dislike.about.TMO.,
comments = Do.you.have.any.other.comments.,
followup = If.you.would.like.someone.to.follow.up.with.you.directly..please.provide.your.email.address.) %>%
mutate(knows_help = mapply(function(a,b) if(a == "") b else a, knows_help, knows_help_2)) %>%
select(-knows_help_2)
```
#Baseline Results
First compare number of uses of data products. Volunteer usage is lower in the survey.
```{r, echo=FALSE}
ggplot(dat, aes(x = has_used, fill=user_type)) +
geom_bar(stat="count") +
ggtitle("Total Usage")
```
Next break up usage per product, by role. Here we can see the engineers use TMO (histograms, evolutions), and do some custom analysis on [as]tmo, while the managers mainly use stmo.
```{r, echo=FALSE}
dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
filter(role %in% c("Engineering", "Product/Project Management")) %>%
ggplot(aes(x = products)) +
geom_bar(stat="count") +
facet_wrap(~ role, ncol=2) +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
ggtitle("Usage per product - role")
```
Why haven't people used these tools? Might be because they don't know enough about the platform.
```{r, echo=FALSE}
ggplot(dat, aes(x = has_used, fill=knows_help)) +
geom_bar(stat="count") +
ggtitle("Comparison of those who use the platform and those who don't") +
xlab("Have they used the platform?")
```
No matter what platform the user uses, it seems equally likely they don't know how to get help.
```{r, echo=FALSE}
dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
ggplot(aes(x = products, fill = knows_help)) +
geom_bar(stat="count") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1))
```
How often do people use our platform?
```{r, echo=FALSE}
dat %>%
mutate(frequency = strsplit(frequency, ", ")) %>%
unnest(frequency) %>%
ggplot(aes(x = frequency)) +
geom_bar(stat="count") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
ggtitle("Histogram of frequency of use of data platform (any product)")
```
How many products do they use? It seems most people branch out a bit to find what they need.
```{r, echo=FALSE}
dat %>%
mutate(products = strsplit(products, ", ")) %>%
mutate(product_use_count = sapply(products, length)) %>%
mutate(product_use_count = sapply(product_use_count, function(p) if (p == 0) {"0"} else if (p == 1) {"1"} else {"2+"})) %>%
ggplot(aes(x = product_use_count)) +
geom_bar(stat="count") +
ggtitle("Histogram of number of products used per person") + xlab("Number of Products Used")
```
What is the frequency of use across products? The heaviest consistent users are custom dashboards and stmo, while histograms/evolutions are used with less frequency. We need to determine if this is because it's difficult to get what you're looking for using histograms/evolutions (and thus avoided), or if there's simply not a need to use histograms/evolutions as often.
```{r, echo=FALSE}
dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
mutate(frequency = strsplit(frequency, ", ")) %>%
unnest(frequency) %>%
mutate(frequency = factor(frequency, levels = c("Daily", "Once or twice a week", "Once or twice a month", "Sporadically"))) %>%
ggplot(aes(x = frequency)) +
geom_bar(stat="count") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
facet_wrap(~ products, ncol = 3) +
xlab("Frequency of Use")
```
How satisfied were users overall?
```{r, echo=FALSE}
dat %>% filter(satisfied != "", !is.na(satisfied)) %>%
ggplot(aes(x = satisfied)) +
geom_bar() +
ggtitle("Were you satisfied?")
```
How satisfied are people with our products? Seems custom dashboards and stmo are the least satisifed, but part of that might be because they get used more often, and these heavy users expect more. There is also not much support for custom dashboards.
```{r, echo=FALSE}
dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
group_by(products) %>%
summarize(fraction_satisfied = sum(grepl("Yes", satisfied)) / length(satisfied)) %>%
ungroup() %>%
arrange(fraction_satisfied) %>%
ggplot(aes(x = products, y = fraction_satisfied)) +
geom_bar(stat="identity") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
ylab("Fraction Satisfied")
```
What questions are people answering?
```{r, echo=FALSE}
dat %>%
mutate(questions = strsplit(questions, ", ")) %>%
unnest(questions) %>%
group_by(questions) %>%
summarize(count = length(questions)) %>%
filter(count > 1) %>%
ggplot(aes(x = questions, y = count)) +
geom_bar(stat="identity") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1))
```
Are they using different features per question?
Interesting that in every group, histograms are the most used product. Metrics data has stmo tied for second, with atmo fourth. Regression alerting focuses on TMO, with automated alerting coming in third, which makes sense.
```{r, echo=FALSE}
allowed_questions <- dat %>%
mutate(questions = strsplit(questions, ", ")) %>%
unnest(questions) %>%
group_by(questions) %>%
summarize(count = length(questions)) %>%
filter(count > 1) %>%
select(questions)
allowed_products <- dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
group_by(products) %>%
summarize(count = length(products)) %>%
filter(count > 1) %>%
select(products)
dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
mutate(questions = strsplit(questions, ", ")) %>%
unnest(questions) %>%
filter(sapply(questions, grepl, x=allowed_questions), sapply(products, grepl, x=allowed_products)) %>%
ggplot(aes(x = products)) +
geom_bar(stat="count") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
facet_wrap( ~ questions, ncol = 3)
```
The same data, just flipped products and questions, to see what questions people are asking per product:
```{r, echo=FALSE}
#see http://stackoverflow.com/questions/13297155/add-floating-axis-labels-in-facet-wrap-plot
library(grid)
facetAdjust <- function(x, pos = c("up", "down"),
newpage = is.null(vp), vp = NULL)
{
# part of print.ggplot
ggplot2:::set_last_plot(x)
if(newpage)
grid.newpage()
pos <- match.arg(pos)
p <- ggplot_build(x)
gtable <- ggplot_gtable(p)
# finding dimensions
dims <- apply(p$panel$layout[2:3], 2, max)
nrow <- dims[1]
ncol <- dims[2]
# number of panels in the plot
panels <- sum(grepl("panel", names(gtable$grobs)))
space <- ncol * nrow
# missing panels
n <- space - panels
# checking whether modifications are needed
if(panels != space){
# indices of panels to fix
idx <- (space - ncol - n + 1):(space - ncol)
# copying x-axis of the last existing panel to the chosen panels
# in the row above
gtable$grobs[paste0("axis_b",idx)] <- list(gtable$grobs[[paste0("axis_b",panels)]])
if(pos == "down"){
# if pos == down then shifting labels down to the same level as
# the x-axis of last panel
rows <- grep(paste0("axis_b\\-[", idx[1], "-", idx[n], "]"),
gtable$layout$name)
lastAxis <- grep(paste0("axis_b\\-", panels), gtable$layout$name)
gtable$layout[rows, c("t","b")] <- gtable$layout[lastAxis, c("t")]
}
}
# again part of print.ggplot, plotting adjusted version
if(is.null(vp)){
grid.draw(gtable)
}
else{
if (is.character(vp))
seekViewport(vp)
else pushViewport(vp)
grid.draw(gtable)
upViewport()
}
invisible(p)
}
d <- dat %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
mutate(questions = strsplit(questions, ", ")) %>%
unnest(questions) %>%
filter(sapply(questions, grepl, x=allowed_questions), sapply(products, grepl, x=allowed_products)) %>%
ggplot(aes(x = questions)) +
geom_bar(stat="count") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
facet_wrap( ~ products, ncol = 3)
facetAdjust(d)
```
Did people get timely responses when they needed help?
```{r, echo=FALSE}
dat %>%
filter(timely_response != "") %>%
ggplot(aes(x = timely_response)) +
geom_bar(stat = "count")
```
#Fill-in answers
I tried to tease from each like and dislike what their key points were. I have a histogram of those here. Note a few things: People are happy to have access to the data, TMO is confusing to use and limited (they often want to do MOAR!), and people want better documentation (which we know). Next steps is take each of these responses and tease apart bugs and feature requests, and see if they seem like things we should work on. I've started doing that in the original spreadsheet.
```{r, echo=FALSE}
dat %>%
mutate(likes = strsplit(like_short, ", ")) %>%
unnest(likes) %>%
ggplot(aes(x = likes)) +
geom_bar(stat = "count") +
ggtitle("What do you like about TMO?") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1))
```
```{r, echo=FALSE}
dat %>%
mutate(dislikes = strsplit(dislike_short, ", ")) %>%
unnest(dislikes) %>%
ggplot(aes(x = dislikes)) +
geom_bar(stat = "count") +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1)) +
ggtitle("What do you dislike?")
```
This plot is trying to tease out dislikes per product. It seems like 1) documentation is a big issue for STMO, and 2) histogram users find TMO confusing.
```{r, echo=FALSE, fig.height=10}
dat %>%
mutate(dislikes = strsplit(dislike_short, ", ")) %>%
unnest(dislikes) %>%
mutate(products = strsplit(products, ", ")) %>%
unnest(products) %>%
filter(sapply(products, grepl, allowed_products)) %>%
ggplot(aes(x = dislikes)) +
geom_bar(stat = "count") +
facet_wrap(~ products, ncol = 1) +
theme(axis.text.x = element_text(angle = 330, hjust = 0.1))
```