Working with Two Datasets: Binds, Set Operations, and Joins – Pt 4 Intro to Data Manipulation
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, make data manipulation tasks easier. They help keep your R code clean and clear, and they reduce the cognitive load of common but often complex data science tasks.
dplyr docs: dplyr.tidyverse.org/reference/
http://dplyr.tidyverse.org/reference/setops.html
http://dplyr.tidyverse.org/reference/join.html

Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup
https://youtu.be/jOd65mR1zfw
/01:44 Intro and what’s covered
Ground Rules:
/02:40 What’s a tibble
/04:50 Use View
/05:25 The Pipe operator
/07:20 What do I mean by data wrangling?

Pt. 2: Tidy Data and tidyr
https://youtu.be/1ELALQlO-yM
/00:48 Goal 1: Making your data suitable for R
/01:40 tidyr “Tidy” Data introduced and motivated
/08:10 tidyr::gather
/12:30 tidyr::spread
/15:23 tidyr::unite
/15:23 tidyr::separate

Pt. 3: Data manipulation tools: dplyr
https://youtu.be/Zc_ufg4uW4U
/00:40 setup
/02:00 dplyr::select
/03:40 dplyr::filter
/05:05 dplyr::mutate
/07:05 dplyr::summarise
/08:30 dplyr::arrange
/09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
/11:45 dplyr::group_by

Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins
https://youtu.be/AuBgYDCg1Cg
Combining two datasets together
00:42 dplyr::bind_cols
01:27 dplyr::bind_rows
01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
02:15 Joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join

Cheatsheets: https://www.rstudio.com/resources/cheatsheets/

Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
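
Quick reference for the Pt. 4 topics (binds, set operations, joins). This is a minimal sketch, not the code shown in the video: the tibbles a, b, and scores and their columns are made-up examples, and only standard dplyr functions are used.

```r
library(dplyr)

# Hypothetical example data (not from the video)
a <- tibble(id = c(1, 2, 3), x = c("a", "b", "c"))
b <- tibble(id = c(2, 3, 4), x = c("b", "c", "d"))
scores <- tibble(id = c(2, 3, 4), score = c(90, 85, 70))

# Binds: glue tables together without matching on any key
stacked <- bind_rows(a, b)                          # rows of a, then rows of b
side    <- bind_cols(a, tibble(y = c(10, 20, 30)))  # columns side by side, matched by position

# Set operations: compare whole rows; both tables need the same columns
both   <- intersect(a, b)  # rows that appear in both a and b
either <- union(a, b)      # unique rows from a and b combined
only_a <- setdiff(a, b)    # rows in a that are not in b

# Joins: match rows on a key column (here "id")
left  <- left_join(a, scores, by = "id")   # keep every row of a; unmatched score is NA
inner <- inner_join(a, scores, by = "id")  # keep only ids present in both tables
right <- right_join(a, scores, by = "id")  # keep every row of scores; unmatched x is NA
full  <- full_join(a, scores, by = "id")   # keep all ids from either table
```

The binds do no matching at all, so bind_cols is only safe when the rows are already in the same order. The joins differ only in which unmatched rows they keep.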
cheatsheets
dplyr
rstudio
tibble
tidyr
tidyverse
tidyverse.org
Grammar of Data Manipulation
Data Science
Data Wrangling
Applied Statistics
Statistics
RStudio
Data Manipulation
Joins
Merge
RStats
Two Tables
Two Tibbles
Two Datasets
Two Data.frames
Two Dataframes
Two Tables at the Same Time
Intersect
Union
Setdiff
Left_join
Right_join
Full_join
Anti_join