5 2/13 Lab IV | Project Runway
Our goal is to visualize the difference between the population percent (popPct) and the survey percent (svyPct) for various age groups. We’ll use the data in the table below (a full viz would, of course, include more subgroups). Sketch a few designs for presenting this information, and be ready to share both the concept and the actual viz with the entire class. You can work individually or in small groups. Do not include code with your visualizations. Instead, create an appendix at the end of the document that displays each code chunk. Also make sure that no warnings or messages are displayed.
Use the simulated data to make at least two plots: one in Base R and one in library(ggplot2). Then you can use a dataset of your choice for the last two visualizations or keep working with the fake data.
age | popPct | svyPct |
---|---|---|
18 to 29 | 29 | 19 |
36 to 50 | 21 | 21 |
51 to 64 | 30 | 32 |
65+ | 20 | 28 |
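To make the gap between the two columns concrete before plotting, here is a minimal sketch that computes survey percent minus population percent for each age group. It assumes the table is stored in a data frame named `df`; the column name `diffPct` is made up for illustration and is not part of the assignment.

```r
# Simulated data copied from the table above
df <- data.frame(
  age    = c("18 to 29", "36 to 50", "51 to 64", "65+"),
  popPct = c(29, 21, 30, 20),
  svyPct = c(19, 21, 32, 28)
)

# diffPct (hypothetical name): survey percent minus population percent,
# in percentage points; negative means under-represented in the survey
df$diffPct <- df$svyPct - df$popPct

print(df)
```

With these numbers, the 18 to 29 group is under-represented by 10 points and the 65+ group is over-represented by 8 points, which is the pattern a good viz should make obvious.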
5.5 Code Appendix
5.5.1 Setup Code
# Load packages used in this session of R
library(knitr)
library(tidyverse)
library(ggplot2)
opts_chunk$set(echo = TRUE)
options(digits = 2)
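The lab instructions ask for no code, warnings, or messages in the body of the report. One possible way to get there in R Markdown is to change the chunk defaults in the first chunk. This is a sketch under the assumption that the report is knit with knitr, not the setup used in this appendix:

```r
# Assumed to sit in the first (setup) chunk of an R Markdown document
library(knitr)

# Hide code, warnings, and messages in every chunk by default
opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
```

The appendix chunks would then need `echo = TRUE` (and typically `eval = FALSE`) set individually, so that their code is displayed without being run a second time. A commonly used shortcut is a single final chunk with `ref.label = knitr::all_labels()`, which collects the code of all earlier labeled chunks.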
5.5.2 Preparation Code
df <- data.frame("age" = c("18 to 29", "36 to 50", "51 to 64", "65+"),
                 "popPct" = c(29, 21, 30, 20),
                 "svyPct" = c(19, 21, 32, 28))
kable(df, caption = "Table: Population and survey percentages by age group")
5.5.3 Base R Plot Code
# Each vector holds c(survey %, population %) for one age group
Age18to29 <- c(19, 29)
Age36to50 <- c(21, 21)
Age51to64 <- c(32, 30)
Over65 <- c(28, 20)
age_groups <- cbind(Age18to29, Age36to50, Age51to64, Over65)
barplot(age_groups, beside=TRUE, xlab="Age Group", names.arg=
        c("18 - 29", "36 - 50", "51 - 64", "65+"), ylab="Percent",
        main = "Percent Surveyed and Percent in Population by Age Group",
        ylim = c(0,35), las=1)
legend("bottomleft", c("Surveyed %", "Population %"),
       fill=c("black", "light gray"), horiz=FALSE, cex=0.73, bg="white")
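Since the goal is to show the difference between the two percentages, another Base R design worth considering is a connected dot plot, where each age group gets one row and a line segment spans the gap. The sketch below is only one possible take, not part of the original solution; the axis limits and the helper variable `y` are choices made for this example.

```r
# Simulated data from the lab table
df <- data.frame(
  age    = c("18 to 29", "36 to 50", "51 to 64", "65+"),
  popPct = c(29, 21, 30, 20),
  svyPct = c(19, 21, 32, 28)
)

# One row per age group, youngest at the top (y is a hypothetical helper)
y <- rev(seq_len(nrow(df)))

# Empty plotting region; extra room at the top is left for the legend
plot(NA, xlim = c(15, 35), ylim = c(0.5, nrow(df) + 1),
     yaxt = "n", xlab = "Percent", ylab = "",
     main = "Population vs. Survey Percent by Age Group")
axis(2, at = y, labels = df$age, las = 1)

# Segment spanning the gap, then one point per percentage
segments(df$popPct, y, df$svyPct, y)
points(df$popPct, y, pch = 19)  # filled circle = population
points(df$svyPct, y, pch = 1)   # open circle = survey
legend("top", c("Population %", "Survey %"),
       pch = c(19, 1), horiz = TRUE, bty = "n")
```

The length of each segment is the gap in percentage points, so the largest mismatches (18 to 29 and 65+) stand out without the reader having to compare bar heights.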
5.5.4 library(ggplot2) First Plot Code
df %>%
  mutate(Population = popPct, Survey = svyPct) %>%
  dplyr::select(-popPct, -svyPct) %>%
  pivot_longer(-age, names_to="Group", values_to="Percent") %>%
  ggplot(aes(x=age, y=Percent, fill=Group)) +
  geom_bar(stat="identity", position="dodge") +
  scale_fill_grey() +
  theme_minimal() +
  labs(x = "Age Group", y = "Percent",
       title = "Population and Survey Sample Proportions by Age Group")
5.5.5 library(ggplot2) Second Plot Code
df %>%
  mutate(Population = popPct, Survey = svyPct) %>%
  dplyr::select(-popPct, -svyPct) %>%
  pivot_longer(-age, names_to="Group", values_to="Percent") %>%
  ggplot(aes(x=age, y=Percent, fill=Group)) +
  geom_bar(stat="identity", position="dodge") +
  coord_flip() +
  scale_fill_grey() +
  theme_minimal() +
  labs(x = "Age Group", y = "Percent",
       title = "Population and Survey Sample Proportions by Age Group")
5.5.6 Alternative Plot Code
library(apyramid)
df %>%
  mutate(Population = popPct, Survey = svyPct) %>%
  dplyr::select(-popPct, -svyPct) %>%
  pivot_longer(-age, names_to="Group", values_to="Percent") %>%
  mutate(age = as.factor(age)) %>%
  age_pyramid(data = ., age_group = "age", split_by = "Group",
              count = "Percent", show_midpoint = FALSE) +
  scale_fill_grey() +
  theme_minimal() +
  labs(x="Age Group", y="Percent", fill=NULL,
       title = "Percent Surveyed and Percent in Population by Age Group")