---
title: "Setup test"
author: "Kenneth Benoit and Paul Nulty"
date: "18 October 2015"
output: html_document
---
#### Preliminaries: Installation
First, you need to have **quanteda** installed. You can do this from inside RStudio through the Tools > Install Packages... menu, or simply by running
```{r, eval = FALSE}
install.packages("quanteda", dependencies = TRUE)
```
(Optional) You can install some additional corpus data from **quantedaData** using
```{r, eval=FALSE}
## the devtools package is required to install quantedaData from GitHub
devtools::install_github("kbenoit/quantedaData")
```
Note that on **Windows platforms**, it is also recommended that you install the [RTools suite](https://cran.r-project.org/bin/windows/Rtools/), and for **OS X**, that you install [XCode](https://itunes.apple.com/gb/app/xcode/id497799835?mt=12) from the App Store.
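Before moving on, it can help to confirm which of the packages mentioned above are actually installed. The following is a minimal sketch using base R's `requireNamespace()`; the helper name `isInstalled` is hypothetical and not part of quanteda.

```{r}
# Hypothetical helper: TRUE if the package's namespace can be loaded
isInstalled <- function(pkg) requireNamespace(pkg, quietly = TRUE)

# Packages mentioned in the installation steps above
pkgs <- c("quanteda", "quantedaData", "devtools")

# Named logical vector: TRUE for each package that is available
status <- vapply(pkgs, isInstalled, logical(1))
status
```

Any package reported as `FALSE` still needs to be installed; only **quanteda** is required for the rest of this file.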
#### Test your setup
Run the rest of this file to test your setup. You must have quanteda installed in order for this next step to succeed.
```{r}
library(quanteda)  # unlike require(), this stops with an error if quanteda is missing
```
Now summarize some texts in the Irish 2010 budget speech corpus:
```{r}
summary(ie2010Corpus)
```
Create a document-feature matrix from this corpus, removing English stop words (plus "will") and stemming the remaining words:
```{r}
ieDfm <- dfm(ie2010Corpus, ignoredFeatures = c(stopwords("english"), "will"), stem = TRUE)
```
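To make concrete what a document-feature matrix holds, here is a rough base-R sketch that builds one by hand. It does not use quanteda; the toy texts and the three-word stop list are assumptions for illustration only, and no stemming is applied.

```{r}
# Toy texts (hypothetical, not taken from the budget corpus)
texts <- c(doc1 = "the budget will cut taxes",
           doc2 = "the budget will raise taxes and spending")

# Tiny stand-in stop word list (assumption; a real list is much longer)
stops <- c("the", "will", "and")

# Lowercase, split on whitespace, and drop the stop words
tokens <- lapply(strsplit(tolower(texts), "\\s+"),
                 function(x) x[!x %in% stops])

# One row per document, one column per distinct remaining word
features <- sort(unique(unlist(tokens)))
toyDfm <- t(sapply(tokens, function(x) table(factor(x, levels = features))))
toyDfm
```

Each cell counts how often a feature occurs in a document, which is the same shape of object that `dfm()` produces for the real corpus.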
Look at the most frequently occurring features:
```{r}
topfeatures(ieDfm)
```
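Conceptually, ranking features by frequency amounts to summing each column of the matrix and sorting the totals. The sketch below shows that idea in base R on a small matrix of made-up counts; it is an illustration of the concept, not quanteda's own implementation.

```{r}
# Hypothetical counts: rows are documents, columns are features
counts <- matrix(c(3, 0, 1,
                   2, 6, 0),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"),
                                 c("tax", "budget", "bank")))

# Total frequency of each feature across all documents, largest first
featureTotals <- sort(colSums(counts), decreasing = TRUE)
featureTotals
```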
Make a word cloud:
```{r, fig.width=8, fig.height=8}
plot(ieDfm, min.freq=25, random.order=FALSE)
```
If you got this far, congratulations!