Data Wrangling I

You can’t do data science without data, and data aren’t going to wrangle themselves…

Data wrangling is the process of taking data in whatever form they exist and, through a variety of steps, turning them into a form that suits your current needs. We’ll talk about how to get data in several common formats into R; how to transform, manage, and manipulate data in a cohesive way using dplyr; what it means for data to be “tidy” and how to make them so; and what to do when your data are spread across multiple tables.

The topic is made up of the following components:

Data import with readr et al.
Data manipulation with dplyr
Tidy data and relational datasets
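
As a rough preview of how these three components fit together, here is a minimal sketch that imports two small tables, cleans one up with dplyr, reshapes it into a tidy form, and joins the two. Everything in it is made up for illustration: the datasets, the column names, and the object names (visits, groups, combined) are hypothetical and are not the examples used in lecture. It assumes readr 2.0 or later, where wrapping a string in I() makes read_csv treat it as literal data rather than a file path.

```r
library(readr)
library(dplyr)
library(tidyr)

# Hypothetical data: inline CSV text stands in for files on disk.
visits_csv <- "ID,Visit 1,Visit 2
1,5.1,4.8
2,6.3,NA
3,4.9,5.5"

groups_csv <- "id,group
1,treatment
2,control
3,treatment"

# Import: I() marks the string as literal data (assumes readr >= 2.0).
visits <- read_csv(I(visits_csv), show_col_types = FALSE)
groups <- read_csv(I(groups_csv), show_col_types = FALSE)

# Manipulate: give the columns consistent, syntactic names.
visits <- visits %>%
  rename(id = ID, visit_1 = `Visit 1`, visit_2 = `Visit 2`)

# Tidy: reshape so each row is one subject at one visit.
visits_long <- visits %>%
  pivot_longer(
    visit_1:visit_2,
    names_to = "visit",
    names_prefix = "visit_",
    values_to = "score"
  )

# Relational: attach each subject's group from the second table.
combined <- left_join(visits_long, groups, by = "id")

# A small summary that is easy to write once the data are tidy.
group_means <- combined %>%
  filter(!is.na(score)) %>%
  group_by(group) %>%
  summarize(mean_score = mean(score))
```

Each step here is deliberately small; the point is only that import, manipulation, tidying, and joining tend to show up together in a single pipeline.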

It has been argued that data carpentry is a better term than data wrangling. I only sorta like that, although it’s a useful analogy to consider.

The code that I produced while working through examples in lecture is here.