3 Lab II: Introduction to library(tidyverse) & R Markdown
We can use R Markdown to create well-formatted PDF or .html files that clearly display the results of our analyses. R Markdown, through LaTeX, also lets us write mathematical formulas with ease. Go ahead and knit this file now (the Knit button is in the top left corner) and see what it looks like.
3.1 In the setup chunk above, load the tidyverse packages as well as library(readr)
## Example Setup Chunk
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
## Packages
library(readr)
library(tidyverse)
3.2 Load in the resume.RData file and use head(), tail(), glimpse(), dim(), summary(), and View() to examine each variable in the dataset. How many of the resumes have white-sounding names? How many have African-American-sounding names?
## Loading Data
data(resume, package = "qss")
## Learning About the Dataset
head(resume)
## firstname sex race call
## 1 Allison female white 0
## 2 Kristen female white 0
## 3 Lakisha female black 0
## 4 Latonya female black 0
## 5 Carrie female white 0
## 6 Jay male white 0
tail(resume)
## firstname sex race call
## 4865 Lakisha female black 0
## 4866 Tamika female black 0
## 4867 Ebony female black 0
## 4868 Jay male white 0
## 4869 Latonya female black 0
## 4870 Laurie female white 0
glimpse(resume)
## Rows: 4,870
## Columns: 4
## $ firstname <chr> "Allison", "Kristen", "Lakisha", "Latonya", "Carrie", "Jay", "Jill", "Kenya", "Latonya", "Tyrone", "Ai…
## $ sex <chr> "female", "female", "female", "female", "female", "male", "female", "female", "female", "male", "femal…
## $ race <chr> "white", "white", "black", "black", "white", "white", "white", "black", "black", "black", "black", "wh…
## $ call <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
dim(resume)
## [1] 4870 4
summary(resume)
## firstname sex race call
## Length:4870 Length:4870 Length:4870 Min. :0.00000
## Class :character Class :character Class :character 1st Qu.:0.00000
## Mode :character Mode :character Mode :character Median :0.00000
## Mean :0.08049
## 3rd Qu.:0.00000
## Max. :1.00000
View(resume) ## Comment this out when knitting.
## Number of observations by race
resume %>%
  group_by(race) %>%
  count()
## # A tibble: 2 × 2
## # Groups: race [2]
## race n
## <chr> <int>
## 1 black 2435
## 2 white 2435
3.4 We are going to see if there is a racial discrepancy by taking the difference in callback rates between racial groups. Calculate the callback rate for applicants with white-sounding names and for applicants with African-American-sounding names. Use LaTeX commands to write the formula for this calculation and display the result in text. Write the formula between $'s, like \(y = mx + b\), to use LaTeX commands.
## Callback rates by race (white- and African-American-sounding names)
resume %>%
  group_by(race) %>%
  summarise(callback_rates = mean(call))
## # A tibble: 2 × 2
## race callback_rates
## <chr> <dbl>
## 1 black 0.0645
## 2 white 0.0965
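The callback rate for applicants with white-sounding names is 0.0965. Because call is a binary (0/1) variable, its mean is the proportion of resumes that received a callback: \(\overline{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\), where \(x_i\) is 1 if resume \(i\) received a callback and 0 otherwise, and \(n\) is the number of resumes in the group.
The callback rate for applicants with African-American-sounding names is 0.0645.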
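As a worked instance of the mean formula, the counts can be read off the race-by-sex table under 3.6: 184 + 51 = 235 callbacks for white-sounding names, 125 + 32 = 157 callbacks for African-American-sounding names, and 2,435 resumes in each group. The symbols \(\hat{p}_{\text{white}}\) and \(\hat{p}_{\text{black}}\) are notation assumed here for illustration; the lab itself does not name them.

```latex
\[
\hat{p}_{\text{white}} = \frac{235}{2435} \approx 0.0965,
\qquad
\hat{p}_{\text{black}} = \frac{157}{2435} \approx 0.0645
\]
\[
\hat{p}_{\text{white}} - \hat{p}_{\text{black}} = \frac{235 - 157}{2435} = \frac{78}{2435} \approx 0.0320
\]
```

In an R Markdown file, the same display math can be written between double dollar signs instead of the \[ \] delimiters.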
3.5 Now, create a new object named race_diff that stores the difference in callback rates.
## Calculating Callback Proportions
race_call <- resume %>%
  group_by(race, call) %>%
  count() %>%
  pivot_wider(names_from = call,
              values_from = n) %>%
  rename(no_call = `0`,
         call = `1`) %>%
  mutate(total_resumes = no_call + call,
         call_prop = call / total_resumes)
## Difference in call back rates
race_diff <- race_call %>%
  select(race, call_prop) %>%
  pivot_wider(names_from = c(race),
              values_from = call_prop) %>%
  mutate(race_diff = white - black) %>%
  select(race_diff)
## Printing
race_diff
## # A tibble: 1 × 1
## race_diff
## <dbl>
## 1 0.0320
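A shorter route to the same kind of quantity can be sketched on a tiny made-up data frame. The object `toy` and its values are hypothetical and are not the resume data; the names `white_rate`, `black_rate`, and `toy_race_diff` are also introduced only for this illustration. Because `call` is coded 0/1, `mean(call)` within each group is already the callback proportion, so the count-and-pivot steps are not strictly needed.

```r
library(dplyr)

# Hypothetical toy data (NOT the resume dataset): 4 resumes per group
toy <- tibble(
  race = rep(c("black", "white"), each = 4),
  call = c(0, 0, 0, 1,   # black: 1 callback out of 4
           0, 1, 1, 0)   # white: 2 callbacks out of 4
)

# The mean of a 0/1 variable is the proportion of 1s
rates <- toy %>%
  group_by(race) %>%
  summarise(callback_rate = mean(call))

# Pull each rate out as a plain number, then subtract
white_rate <- rates %>% filter(race == "white") %>% pull(callback_rate)
black_rate <- rates %>% filter(race == "black") %>% pull(callback_rate)

toy_race_diff <- white_rate - black_rate
toy_race_diff
```

Unlike the pipeline above, this version stores the difference as a single number rather than a one-row tibble, which may be more convenient for inline reporting; swapping `toy` for `resume` should apply the same steps to the real data.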
3.6 Since Crenshaw (1989), many scholars have been concerned with intersectionality, or how race and gender interact to make the experiences of African-American women unique. We can use the data we have to explore the effect of race- and gender-specific names on employment prospects. Calculate the callback rate for each race and gender category.
## Callbacks by race and gender
resume %>%
  group_by(race, call, sex) %>%
  count() %>%
  pivot_wider(names_from = call,
              values_from = n) %>%
  rename(no_call = `0`,
         call = `1`) %>%
  mutate(total_resumes = no_call + call,
         call_prop = call / total_resumes)
## # A tibble: 4 × 6
## # Groups: race, sex [4]
## race sex no_call call total_resumes call_prop
## <chr> <chr> <int> <int> <int> <dbl>
## 1 black female 1761 125 1886 0.0663
## 2 black male 517 32 549 0.0583
## 3 white female 1676 184 1860 0.0989
## 4 white male 524 51 575 0.0887
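The same table of proportions can plausibly be produced without reshaping, by grouping on both variables and summarising directly. This is a minimal sketch on made-up data: `toy` and `toy_rates` are hypothetical names and values, not the resume dataset. The `.groups = "drop"` argument to `summarise()` returns an ungrouped tibble, which avoids carrying the grouping into later steps.

```r
library(dplyr)

# Hypothetical toy data (NOT the resume dataset): 2 resumes per race/sex cell
toy <- tibble(
  race = rep(c("black", "white"), each = 4),
  sex  = rep(c("female", "female", "male", "male"), times = 2),
  call = c(0, 1, 0, 0,   # black female: 1 of 2, black male: 0 of 2
           1, 1, 0, 1)   # white female: 2 of 2, white male: 1 of 2
)

# One row per race/sex cell: number of resumes and share with a callback
toy_rates <- toy %>%
  group_by(race, sex) %>%
  summarise(total_resumes = n(),
            call_prop = mean(call),
            .groups = "drop")

toy_rates
```

The count-and-pivot version in the lab additionally keeps the raw `no_call` and `call` counts, which is useful when the counts themselves need to be reported.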
3.7 What is the difference in call back rates for each race/gender group?
## Saving the tibble from 3.6
dta <- resume %>%
  group_by(race, call, sex) %>%
  count() %>%
  pivot_wider(names_from = call,
              values_from = n) %>%
  rename(no_call = `0`,
         call = `1`) %>%
  mutate(total_resumes = no_call + call,
         call_prop = call / total_resumes)
## Calculating Differences
call_backs <- dta %>%
  select(race, sex, call_prop) %>%
  pivot_wider(names_from = c(sex, race),
              values_from = call_prop) %>%
  mutate(white_sex_diff = female_white - male_white,
         black_sex_diff = female_black - male_black,
         male_race_diff = male_white - male_black,
         female_race_diff = female_white - female_black) %>%
  select(white_sex_diff, black_sex_diff, male_race_diff, female_race_diff)
## Printing
print(call_backs) ## print() is optional
## # A tibble: 1 × 4
## white_sex_diff black_sex_diff male_race_diff female_race_diff
## <dbl> <dbl> <dbl> <dbl>
## 1 0.0102 0.00799 0.0304 0.0326