3 Lab II: Introduction to `library(tidyverse)` & R Markdown

We can use R Markdown to create well-formatted PDF or HTML files that display the results of our analyses. R Markdown, through LaTeX, also lets us write mathematical formulas with ease. Go ahead and knit this file now (the Knit button is in the top left corner) and see what it looks like.

3.1 In the setup chunk above, load the tidyverse packages as well as library(readr)

## Example Setup Chunk

knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
## Packages
library(readr)
library(tidyverse)

3.2 Load in the resume.RData file and use head(), tail(), glimpse(), dim(), summary(), and View() to examine each variable in the dataset. How many of the resumes have white sounding names? How many have African-American sounding names?

## Loading Data
data(resume, package = "qss")

## Learning About the Dataset

head(resume)

##   firstname    sex  race call
## 1   Allison female white    0
## 2   Kristen female white    0
## 3   Lakisha female black    0
## 4   Latonya female black    0
## 5    Carrie female white    0
## 6       Jay   male white    0

tail(resume)

##      firstname    sex  race call
## 4865   Lakisha female black    0
## 4866    Tamika female black    0
## 4867     Ebony female black    0
## 4868       Jay   male white    0
## 4869   Latonya female black    0
## 4870    Laurie female white    0

glimpse(resume)

## Rows: 4,870
## Columns: 4
## $ firstname <chr> "Allison", "Kristen", "Lakisha", "Latonya", "Carrie", "Jay", "Jill", "Kenya", "Latonya", "Tyrone", "Ai…
## $ sex       <chr> "female", "female", "female", "female", "female", "male", "female", "female", "female", "male", "femal…
## $ race      <chr> "white", "white", "black", "black", "white", "white", "white", "black", "black", "black", "black", "wh…
## $ call      <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …

dim(resume)

## [1] 4870    4

summary(resume)

##   firstname             sex                race                call        
##  Length:4870        Length:4870        Length:4870        Min.   :0.00000  
##  Class :character   Class :character   Class :character   1st Qu.:0.00000  
##  Mode  :character   Mode  :character   Mode  :character   Median :0.00000  
##                                                           Mean   :0.08049  
##                                                           3rd Qu.:0.00000  
##                                                           Max.   :1.00000

View(resume) ## Comment this out when knitting.

## Number of observations by race
resume %>%
  group_by(race) %>%
  count()

## # A tibble: 2 × 2
## # Groups:   race [2]
##   race      n
##   <chr> <int>
## 1 black  2435
## 2 white  2435

3.3 This experiment seeks to determine whether hiring managers discriminate on the basis of racial identity by sending identical resumes with African-American and white sounding names to job postings. The basic logic is that the resumes are identical and only the name changes, so any difference in callbacks can be attributed to racial discrimination. Why do the authors want to randomize? And do you think this is an effective research design?

The concern with examining race and callbacks without randomization is that confounders such as workplace connections, amount of education, and employment history could be correlated with race. It is possible that African-American applicants did not have the same employment and educational opportunities as white Americans, and, therefore, their resumes may look significantly different. This raises the issue of confounding and makes it impossible to tell whether an employer made a decision based on race or based on the substance of the resume. The authors, though, randomized race, reducing this risk of confounding. Employers in this study see nearly identical resumes, with only the race of the applicant differing, as signaled by the name.

The field experiment presented here relies on racial connotations of different names, not explicit racial cues. Therefore, hiring managers are determining an applicant’s race largely based on what scholars of identity politics would call “perceived race” or “street race” (Lopez et al. 2017), which is how others perceive an individual’s race. This fact means that the selection of names is integral to the internal validity of the research design.

3.4 We are going to see if there is a racial discrepancy by taking the difference in callback rates between racial groups. Calculate the callback rate for white sounding name applicants and African-American sounding name applicants. Use LaTeX commands to write the formula for this calculation and display the result in text. Write the formula between dollar signs, like $y = mx + b$, to use LaTeX commands.

## Callback rates by race
resume %>%
  group_by(race) %>%
  summarise(callback_rates = mean(call))

## # A tibble: 2 × 2
##   race  callback_rates
##   <chr>          <dbl>
## 1 black         0.0645
## 2 white         0.0965

The callback rate for white sounding name applicants is approximately .097 (0.0965 in the output above). Because call is a binary variable, the callback rate is simply its mean, $\overline{x} = \frac{1}{n}\sum^{n}_{i=1}x_i$.

The callback rate for African-American sounding name applicants is .064.
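
Written per group, the same mean makes the two rates explicit. The notation below ($n_g$, $\text{call}_i$, and the subscript $g$) is introduced here only for illustration. The counts combine the group sizes from 3.2 with the callback counts tabulated in 3.6 (184 + 51 = 235 for white sounding names, 125 + 32 = 157 for African-American sounding names).

$$
\overline{\text{call}}_{g} = \frac{1}{n_g}\sum_{i:\,\text{race}_i = g}\text{call}_i, \qquad
\overline{\text{call}}_{\text{white}} = \frac{235}{2435} \approx 0.0965, \qquad
\overline{\text{call}}_{\text{black}} = \frac{157}{2435} \approx 0.0645
$$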

3.5 Now, create a new object that stores the difference in callback rates named race_diff.

## Calculating Callback Proportions
race_call <- resume %>%
  group_by(race, call) %>%
  count() %>%
  pivot_wider(names_from = call,
              values_from = n) %>%
  rename(no_call = `0`,
         call = `1`) %>%
  mutate(total_resumes = no_call + call,
         call_prop = call / total_resumes)

## Difference in call back rates
race_diff <- race_call %>% 
  select(race, call_prop) %>%
  pivot_wider(names_from = c(race),
              values_from = call_prop) %>%
  mutate(race_diff = white - black) %>%
  select(race_diff)

## Printing
race_diff

## # A tibble: 1 × 1
##   race_diff
##       <dbl>
## 1    0.0320
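
Because call is coded 0/1, the same difference can also be reached without counting and pivoting. The following is a minimal sketch of that shorter route, not a replacement for the answer above. It assumes the qss package is installed so that resume loads as in 3.2, and the names race_rates and race_diff_check are hypothetical, chosen so that the race_diff tibble is not overwritten.

```r
library(dplyr)

## Load the data exactly as in 3.2 (assumes the qss package is installed)
data(resume, package = "qss")

## The mean of a 0/1 variable is the share of 1s, i.e. the callback rate
race_rates <- resume %>%
  group_by(race) %>%
  summarise(callback_rate = mean(call))

## Hypothetical object name: a single number rather than a 1 x 1 tibble
race_diff_check <- with(
  race_rates,
  callback_rate[race == "white"] - callback_rate[race == "black"]
)

race_diff_check
```

The result should agree with the race_diff value printed above, since both divide the same callback counts by the same group sizes.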

3.6 Since Crenshaw (1989), many scholars have been concerned with intersectionality, or how race and gender interact to make the experiences of African-American women unique. We can use the data we have to explore the effect of race and gender specific sounding names on employment prospects. Calculate the callback rate for each race and gender category.

## Callbacks by race and gender
resume %>%
  group_by(race, call, sex) %>%
  count() %>%
  pivot_wider(names_from = call,
              values_from = n) %>%
  rename(no_call = `0`,
         call = `1`) %>%
  mutate(total_resumes = no_call + call,
         call_prop = call / total_resumes)

## # A tibble: 4 × 6
## # Groups:   race, sex [4]
##   race  sex    no_call  call total_resumes call_prop
##   <chr> <chr>    <int> <int>         <int>     <dbl>
## 1 black female    1761   125          1886    0.0663
## 2 black male       517    32           549    0.0583
## 3 white female    1676   184          1860    0.0989
## 4 white male       524    51           575    0.0887

3.7 What is the difference in call back rates for each race/gender group?

## Saving tibble from 3.6
dta <- resume %>%
  group_by(race, call, sex) %>%
  count() %>%
  pivot_wider(names_from = call,
              values_from = n) %>%
  rename(no_call = `0`,
         call = `1`) %>%
  mutate(total_resumes = no_call + call,
         call_prop = call / total_resumes)

## Calculating Differences
call_backs <- dta %>% 
  select(race, sex, call_prop) %>%
  pivot_wider(names_from = c(sex, race),
              values_from = call_prop) %>%
  mutate(white_sex_diff = female_white - male_white,
         black_sex_diff = female_black - male_black,
         male_race_diff = male_white - male_black,
         female_race_diff = female_white - female_black) %>%
  select(white_sex_diff, black_sex_diff, male_race_diff, female_race_diff)

## Printing
print(call_backs) ## print() is optional

## # A tibble: 1 × 4
##   white_sex_diff black_sex_diff male_race_diff female_race_diff
##            <dbl>          <dbl>          <dbl>            <dbl>
## 1         0.0102        0.00799         0.0304           0.0326
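
As a sanity check, the race gaps can be recomputed from the four cell counts printed in the 3.6 table alone. The sketch below is one way to do that under stated assumptions: the counts are typed in by hand from that table, and the names cells and race_gap_by_sex are hypothetical. It keeps sex in rows and pivots only race into columns, which gives one gap per row.

```r
library(dplyr)
library(tidyr)
library(tibble)

## Cell counts copied by hand from the 3.6 output table
cells <- tribble(
  ~race,   ~sex,     ~no_call, ~call,
  "black", "female",     1761,   125,
  "black", "male",        517,    32,
  "white", "female",     1676,   184,
  "white", "male",        524,    51
)

## Callback proportion per cell, then white minus black within each sex
race_gap_by_sex <- cells %>%
  mutate(call_prop = call / (no_call + call)) %>%
  select(race, sex, call_prop) %>%
  pivot_wider(names_from = race,
              values_from = call_prop) %>%
  mutate(race_gap = white - black)

race_gap_by_sex
```

The race_gap column should line up with female_race_diff and male_race_diff in the output above, up to rounding.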
