From 8d9dd896b8f8bf26ceed106b54d9ab8e2e99ea5d Mon Sep 17 00:00:00 2001 From: jayware9 Date: Thu, 10 Aug 2023 13:40:24 +0100 Subject: [PATCH 1/3] Session 1: The incomplete text is more up to date, so copy that across to the complete version --- session1/intro_to_r_training.Rmd | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/session1/intro_to_r_training.Rmd b/session1/intro_to_r_training.Rmd index f66dfb2..4d1af8f 100644 --- a/session1/intro_to_r_training.Rmd +++ b/session1/intro_to_r_training.Rmd @@ -638,6 +638,7 @@ iris$Species <- as.factor(iris$Species) ## 3 Data wrangling and 'group by' calculations + ### 3.1 Filter To start off with a simple data wrangling function; if you would like to produce statistics for a subset of rows or observations, a good function to use is filter() from the dplyr package. @@ -780,6 +781,7 @@ sepal_length_average <- iris %>% ``` + ### 3.3 Select @@ -820,7 +822,7 @@ iris_petals <- iris %>% -We can rename variables using the dplyr function rename(). Let's amend our above coding in creating the 'iris_petals' dataset so that Petal.Length is just calles P.Length, and Petal.Width is P.Width. +We can rename variables using the dplyr function rename(). Let's amend our above coding in creating the 'iris_petals' dataset so that Petal.Length is just called P.Length, and Petal.Width is P.Width. @@ -868,7 +870,7 @@ iris_petals <- iris_petals %>% -Another useful function found in the dplyr package is if_else, which works in a similar way to if statements in Excel. This uses a logical statement to determine the output. The below code uses this to identify petals that are less than 2 cm long, the mutate function is used to add a variable in to the offenders dataset which is 1 if the petal is less than 2 cm and 0 if it is 2 cm or more. +Another useful function found in the dplyr package is if_else, which works in a similar way to if statements in Excel. This uses a logical statement to determine the output. 
The below code uses this to identify petals that are less than 2 cm long, the mutate function is used to add a variable in to the iris dataset which is 1 if the petal is less than 2 cm and 0 if it is 2 cm or more. @@ -1011,7 +1013,7 @@ write_csv(iris_petals, path = "iris_petals.csv") -This assumes by default that the columns are separated by a comma symbol. The data will be saved as a CSV in your working directory to a file called `iris_petals.csv`. +This assumes by default that the columns are separated by a comma symbol. The data will be saved as a CSV in your working directory to a file called `iris_petals.csv`. @@ -1037,8 +1039,6 @@ This assumes by default that the columns are separated by a comma symbol. The da There are lots of resources that can help you develop your R knowledge, but below are a few that are particularly helpful: - - + Scottish Government 'Good Coding Practices': https://github.com/DataScienceScotland/good_practices/blob/main/coding.md + DataCamp is a website which hosts multiple online courses that teach coding. Their 'Introduction to R' course is free to complete and provides a broader overview in the basic concepts for coding in R. A link to the course can be found here: https://www.datacamp.com/courses/free-introduction-to-r. From 429f5fe3a348732f299d82e59c72450b4c10b4ff Mon Sep 17 00:00:00 2001 From: jayware9 Date: Thu, 10 Aug 2023 13:46:52 +0100 Subject: [PATCH 2/3] Session 1: dataset still being referred to as offenders data in places, update to iris --- session1/intro_to_r_training.Rmd | 6 +++--- session1/intro_to_r_training_incomplete.Rmd | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/session1/intro_to_r_training.Rmd b/session1/intro_to_r_training.Rmd index 4d1af8f..9e0f987 100644 --- a/session1/intro_to_r_training.Rmd +++ b/session1/intro_to_r_training.Rmd @@ -494,7 +494,7 @@ The format is dataframe name, $, variable name. Note that a vector is returned. -All variables have an associated class. 
The class will determine what calculations are possible with them and how R should treat them. So far, our dataset offenders has variables of three different classes; integer, number, and character. Other useful types are factor, logical and date. +All variables have an associated class. The class will determine what calculations are possible with them and how R should treat them. So far, our dataset iris has variables of three different classes; integer, number, and character. Other useful types are factor, logical and date. @@ -510,7 +510,7 @@ class(iris$Sepal.Length) -It's possible to coerce variables from one class to another. We can change the Sepal.Length variable in the offenders dataset to be an integer variable as follows: +It's possible to coerce variables from one class to another. We can change the Sepal.Length variable in the iris dataset to be an integer variable as follows: @@ -546,7 +546,7 @@ iris Another common class is factors. -Factors are for categorical variables involving different levels. So for example, in the dataset 'iris', there are 3 levels of Species: setosa, versicolor, virginica. We can see this now when looking at the environment tab (after clicking the arrow to the left of offenders) and also the order from using the following command: +Factors are for categorical variables involving different levels. So for example, in the dataset 'iris', there are 3 levels of Species: setosa, versicolor, virginica. We can see this now when looking at the environment tab (after clicking the arrow to the left of iris) and also the order from using the following command: diff --git a/session1/intro_to_r_training_incomplete.Rmd b/session1/intro_to_r_training_incomplete.Rmd index 3fad51f..e13a42a 100644 --- a/session1/intro_to_r_training_incomplete.Rmd +++ b/session1/intro_to_r_training_incomplete.Rmd @@ -494,7 +494,7 @@ The format is dataframe name, $, variable name. Note that a vector is returned. -All variables have an associated class. 
The class will determine what calculations are possible with them and how R should treat them. So far, our dataset offenders has variables of three different classes; integer, number, and character. Other useful types are factor, logical and date. +All variables have an associated class. The class will determine what calculations are possible with them and how R should treat them. So far, our dataset iris has variables of three different classes; integer, number, and character. Other useful types are factor, logical and date. @@ -510,7 +510,7 @@ class(iris$Sepal.Length) -It's possible to coerce variables from one class to another. We can change the Sepal.Length variable in the offenders dataset to be an integer variable as follows: +It's possible to coerce variables from one class to another. We can change the Sepal.Length variable in the iris dataset to be an integer variable as follows: @@ -546,7 +546,7 @@ iris Another common class is factors. -Factors are for categorical variables involving different levels. So for example, in the dataset 'iris', there are 3 levels of Species: setosa, versicolor, virginica. We can see this now when looking at the environment tab (after clicking the arrow to the left of offenders) and also the order from using the following command: +Factors are for categorical variables involving different levels. So for example, in the dataset 'iris', there are 3 levels of Species: setosa, versicolor, virginica. 
We can see this now when looking at the environment tab (after clicking the arrow to the left of iris) and also the order from using the following command: From 4ca1ed398e45f48bb40a1dfd474210d082f50467 Mon Sep 17 00:00:00 2001 From: jayware9 Date: Thu, 10 Aug 2023 13:49:08 +0100 Subject: [PATCH 3/3] Session 2: fix typo in complete version --- session2/intro_to_R_session2.Rmd | 8 ++++++-- session2/intro_to_R_session2_incomplete.Rmd | 2 +- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/session2/intro_to_R_session2.Rmd b/session2/intro_to_R_session2.Rmd index 624ef2c..81c804b 100644 --- a/session2/intro_to_R_session2.Rmd +++ b/session2/intro_to_R_session2.Rmd @@ -216,7 +216,7 @@ time_series_plot3 ``` -Now that we can see the plots, let's fit lines, to the points, and a straigh line for the trend +Now that we can see the plots, let's fit lines, to the points, and a straight line for the trend ```{r TimeSeriesPlot4} @@ -247,7 +247,11 @@ time_series_plot5 ``` -Finally, we could perform the wrangling and plotting in a concise chunk +Finally, we could perform the wrangling and plotting in a concise chunk + +Exercise: + + See if you can accomplish the same thing as the chunks above to produce the time_series_plot5 object in one chunk ```{r AllAnalysis} #Import diff --git a/session2/intro_to_R_session2_incomplete.Rmd b/session2/intro_to_R_session2_incomplete.Rmd index df62162..a706858 100644 --- a/session2/intro_to_R_session2_incomplete.Rmd +++ b/session2/intro_to_R_session2_incomplete.Rmd @@ -252,7 +252,7 @@ time_series_plot5 ``` -Finally, we could perform the wrangling and plotting in a concise chunk +Finally, we could perform the wrangling and plotting in a concise chunk Exercise: