A guide to the basic tools of data analysis: process, visualize, and learn from your data using R.
This repository holds the data sets needed for the book "An Introduction to Data Analysis in R", part of the Springer series Use R!. The book can be purchased at https://www.springer.com/gp/book/9783030489960.
The book is an introductory guide to manipulating data sets in the Big Data paradigm. One of its main goals is to take the analyst from the first contact with the data to the final conclusions and the presentation of the results. We take into account the variety of fields where data analysis is used nowadays, and we pay special attention to the different ways of obtaining data and making it manageable before starting the analysis. The data analysis covers most of the basic visualization options and some advanced ones. Finally, basic statistics and mathematical background are used to learn from the processed data. The text includes a crash course on programming and the necessary data processing tools, placing the analyst at the starting point for extracting information from data. Through statistical analysis and visualization of this information, many conclusions can be drawn about the meaning of a database.
The book has grown out of the authors' experience teaching undergraduate students in business degrees and of our own research in different fields. Given the increasing demand for data analysis skills, our aim is to provide students with a complete guide to the basic tools and the practice needed to excel in their field of work.
This repository contains the files needed for several of the exercises and examples in the book. To download them, click on Clone or download and then on Download ZIP. This downloads the full repository except the file flights.csv, which must be downloaded separately because of its large size. To do so, enter the datasets folder, click on flights.csv and then click on download. This opens the file in a browser window, from which it can be saved (the process may take several minutes).
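Because flights.csv is not part of the ZIP download, it can be handy to check that every data set is in place before starting the exercises. The following is a minimal sketch, assuming R is started in the root of the unzipped repository; the helper name `missing_datasets` is hypothetical and the file names are the ones listed for the datasets folder in this README.

```r
# File names taken from the datasets folder listing in this README.
expected_files <- c("easy.csv", "hard.csv", "flights.json",
                    "flights.xlsx", "flights.csv")

# Hypothetical helper: returns the expected files that are missing from `dir`.
missing_datasets <- function(dir = "datasets", expected = expected_files) {
  expected[!file.exists(file.path(dir, expected))]
}

# Assumption: the working directory is the repository root.
missing <- missing_datasets()
if (length(missing) > 0) {
  message("Missing from datasets/: ", paste(missing, collapse = ", "))
}
```

If flights.csv shows up as missing, download it separately as described above.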
Folder codes contains an R Markdown file with all the code snippets in the book:
- File Codes R book.Rmd.
- File Codes-R-book.html and folder files.
Folder datasets contains files used in exercises and examples (sections 3.2 and 3.3):
- File easy.csv. Used in Exercise 3.2.
- File hard.csv. Used in Exercise 3.2.
- File flights.json. Used in Exercise 3.3.
- File flights.xlsx. Used in Exercise 3.3.
- File flights.csv. Used in Section 3.3.3, Practical examples.
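As a rough illustration of how these files can be loaded, the sketch below picks a reader from the file extension. It is not necessarily the approach taken in the book: the helper name `load_dataset` is hypothetical, and the use of the jsonlite and readxl packages for the JSON and Excel files is an assumption (both must be installed for those branches to work). Default `read.csv` settings may also not suit every file, so extra arguments are passed through to the reader.

```r
# Hypothetical helper: choose a reader based on the file extension.
# Extra arguments (...) are forwarded to the underlying reader.
load_dataset <- function(path, ...) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv  = utils::read.csv(path, ...),      # base R
    json = jsonlite::fromJSON(path, ...),   # assumes jsonlite is installed
    xlsx = readxl::read_excel(path, ...),   # assumes readxl is installed
    stop("Unsupported file type: ", ext)
  )
}

# Assumption: the working directory is the repository root.
easy_path <- file.path("datasets", "easy.csv")
if (file.exists(easy_path)) {
  easy <- load_dataset(easy_path)
  str(easy)
}
```

For a file with a different separator or encoding, the relevant arguments can be supplied directly, for example `load_dataset(path, sep = ";")`.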
Folder webs contains website files for the scraping examples and exercises (section 3.2):
- File RottenTomatoes.htm. Used in Exercise 3.7.
- File GoodReads.mhtml. Used in Exercise 3.8.
- File NBA 19_20.mhtml. Used in Section 3.2.3.
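The saved pages in the webs folder can be parsed locally, without an internet connection. The sketch below is only a starting point, assuming the xml2 package is installed; the helper name `page_title` is hypothetical, and the book may use different packages or selectors. The .mhtml files are web archives rather than plain HTML, so they may need extra handling that is not shown here.

```r
# Hypothetical helper: return the <title> text of a saved HTML page.
page_title <- function(path) {
  page <- xml2::read_html(path)                        # parse the local file
  title_node <- xml2::xml_find_first(page, "//title")  # XPath query for <title>
  trimws(xml2::xml_text(title_node))
}

# Assumption: the working directory is the repository root.
htm_path <- file.path("webs", "RottenTomatoes.htm")
if (file.exists(htm_path)) {
  print(page_title(htm_path))
}
```

The same pattern, with a different XPath expression, can be used to pull other elements out of the page.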