Welcome to Data Analysis for the Social Sciences in R!
In this class, we operate as a team. That means respecting one another's opinions,
confusions, insights, interests, identities, and ways of seeing the world.
I design the class with the consistent intention of not overwhelming you. I might fail.
If you need help...
1. Google your question starting with "R" (I'm not being cheeky) - believe it or not, googling questions about code is an acquired skill.
I will help you learn how to create useful Google searches to answer your coding questions.
2. In the console, type ? before the name of a function you want to know more about, e.g. "?select", and the help information will pop up.
Type ?? before a term to search the help pages of all installed packages, which also shows what package a function is in.
3. Post your question to Slack, aka ask your classmates for input
4. Email me at [email protected]
5. Come to my virtual office hours Mondays 12-1pm (sign up)
6. Find a free moment during class
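A minimal sketch of option 2 in the list above, assuming a default R session. It uses the base functions mean, median, and sd instead of select, because select comes from a package (dplyr) that has to be installed and loaded first. The find() call is an extra base R helper that the list does not mention, and the name where_is_sd is only for illustration.

```r
# "?mean" is shorthand for help("mean"): it opens the help page for mean().
?mean

# "??median" is shorthand for help.search("median"): it searches the help
# pages of every installed package and lists matches as package::topic.
??median

# find() reports which attached package defines a function.
where_is_sd <- find("sd")
print(where_is_sd)  # "package:stats" in a default session
```

In RStudio the help pages open in the Help pane; when the same lines run as a script, the help text is printed instead.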
Remember that R is sensitive to detail: var-name ≠ var_name ≠ Var-Name, etc.
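A small sketch of what that sensitivity looks like in practice, with made-up object names. One assumption to flag: for object names, R reads a hyphen as a minus sign, so the sketch uses underscores and dots for the names and shows the hyphen case separately.

```r
# Three different objects: case and punctuation both matter.
var_name <- 1
Var_Name <- 2
var.name <- 3
print(c(var_name, Var_Name, var.name))  # 1 2 3

# A hyphen is not part of a name: var-name means "var minus name".
var <- 10
name <- 4
difference <- var-name
print(difference)  # 6
```

If R reports "object not found", a mismatch in capitalization or punctuation is a good first thing to check.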
Tasks are color-coded as follows:
Watch a linked video. Practice skills and concepts. Read a linked text.
Each of these tasks is ungraded (but highly valuable) and should be completed after the class day under which it is listed.
Many of these tasks are meant to prepare you for the following class's topic.
Weekly quizzes are not graded and will help you refresh concepts from class prep/previous classes.
Homework will be graded. You will turn in homework via GitHub.
That means you will clone the homework repository locally, then edit, add, commit, and push your changes to the main branch for me to see.
Your final project will be either a team project (undergrad) or a solo project (grad) on a chosen question and dataset.
Using R Markdown, you'll report your question, a description of the dataset, and the methods and visualizations you used to explore the question.
Finally, you'll publish this report as a website via GitHub Pages.
Important things to do before our first day of class:
Software setup. <- Set aside time for this.
Read: R for Data Science (R4DS) Chapter 2, Introduction to exploratory data analysis (very short)
3/29 - Week 1 Introductions: Tidy data, the importance of Viz, and GitHub
Tues: Meet R, RStudio, tidy data, viz, and course overview
In-class exercise: check everyone's setup, then open and run the emailed viz code
Read:
- R4DS Chapter 4.1 - 2, Workflow basics
- Intro to Git & GitHub with RStudio (short, assigned reading is up to "Branching")
- ReadMe.md in github-starter-course in GitHub class organization (also short)
Thurs: Meet GitHub
In-class exercise: clone your first repo, edit, add, commit, and push changes to the main branch.
Watch: Why viz & A mental model for ggplot2
Practice: R Bootcamp Chapter 1 Sections 1 - 9
4/5 - Week 2 Visualization in ggplot2
Tues: Remeet ggplot2
In-class exercise: meet the geoms (live coding)
Thurs: No class
Individual homework due 4/19:
- hw1 ggplot2 in GitHub class organization: clone, edit following instructions, add changes, commit and push to main branch
4/12 - Week 3 Finding and loading data, collaborating in GitHub
Tues: readr, base R loading, and data types
Read:
- Reading data files into R
- R data types and structures [https://swcarpentry.github.io/r-novice-inflammation/13-supp-data-structures/]
Thurs: Collaborating with GitHub
Read:
Practice: Work with teammates on team hw 1 - establishing your questions
Team homework due 4/21:
- team_hw1 in GitHub class organization. Find 2 candidate datasets and related research questions
4/19 - Week 4 Data wrangling with dplyr
Tues: select, filter, operators, mutate
Thurs: group_by, summarize, and custom functions
Individual homework due 5/3:
- hw2 wrangling in GitHub classroom
4/26 - Week 5 Reporting in R Markdown
Tues: titles, formats, parameters
Read:
Thurs: themes, links
Individual homework due 5/10:
- hw3 in GitHub class organization - R Markdown and wrangling cont'd
5/3 - Week 6 Wrangling revisited
Tues: relational data and join functions
Read:
- Fundamentals of Data Visualization Chapter 29.1 – 29.2
- R4DS Chapter 3.3-3.10 (some of these topics will already be very familiar)
Thurs: custom functions, filter with grepl
Team homework due 5/12:
- team_hw2 in GitHub class organization
5/10 - Week 7 Storytelling: data viz revisited
Tues: color palettes, legends, custom topics, labels, facets, captions
Read:
- Special read for Thursday's class: redlining and neighborhood temperature
- Viridis color palettes
- Cheatsheet of ggplot functions for customizing plots
5/17 - Week 8 Team updates & feedback
Tues: Groups: xgamesmode, poggers, rodeofrogge
Thurs: Groups: dogtorphil, curlycoders, zhian21
5/24 - Week 9 Hypothesis testing in R
Tues: What is a significant pattern?
Highly recommended reading/viewing:
- Are you a Bayesian or Frequentist thinker? (you can be both)
- Insight to probability from a Bayesian perspective (15 min)
- Video from class: Probability, Garden of Forking Data, Bayesian inference (McElreath, full vid 1:12)
- More comparisons of Bayes vs Freq
6/1 - Week 10 Group presentations
Tues: dogtorphil, curlycoders, zhian21
Thurs: xgamesmode, poggers, rodeofrogge
Due 6/7: Link to final R Markdown report published on GitHub Pages
Grading:
Homeworks 50 pts
- 4/19: individual hw1 - ggplot2 - 10 pts
- 5/3: individual hw2 - wrangling - 12 pts
- 5/10: individual hw3 - R Markdown & wrangling revisited - 12 pts
- 4/21: team hw1 - candidate topics and datasets - 6 pts
- 5/12: team hw2 - official topic, dataset, data dictionary, additional variables, and transformations - 10 pts
Final project 30 pts
- 20 pts on written Rmd project
- 10 pts on presentation
- (5 points of each of the above will be graded by your teammates, i.e. the average of their 1-5 pt evaluations).
Class resources
- R bootcamp (interactive)
- R for Data Science (textbook)
- Fundamentals of Data Visualization (textbook)
- Cheatsheets! Shorthand guides to all packages
- Making R Markdown work better for you
- Happy Git and GitHub for the useR
- Sage debugging advice (RStudio conf 2020 talk by Dr. Jenny Bryan)
- R graph gallery - for choosing visualizations and getting starter code
- R Markdown
- Basic statistical modeling in R
- Colors for ggplots