Apply and Practice Activity

The Eroding of the American Dream

Introduction

For this activity you are going to recreate the first graph presented in this article by FiveThirtyEight. It is also pasted here:

Please read the FiveThirtyEight article, and then take a moment to study the graph and understand the story it is telling. It is based on a Science paper by Dr. Raj Chetty and colleagues. Dr. Chetty is a Professor of Economics at Harvard University and directs Opportunity Insights. He and his team are leaders in using big data to understand and mitigate economic inequality. They have a great deal of data on their site that the public can use.

Please follow the steps below to complete this activity.

Step by step directions

Step 1

Navigate to the apply_and_practice_programs folder in the programs folder of the foundations project. Open up the file called american_dream.qmd.

To ensure you are working in a fresh session, close any other open tabs (save them if needed). Click the down arrow beside the Run button toward the top of your screen then click Restart R and Clear Output.

Once the .qmd file is open, add your name to the author section of the YAML metadata.

Step 2

Load the packages that are needed for this activity. Find the code chunk labeled: Load packages.

Notice that in the code chunk in your .qmd file, there is a line that starts with #|. This is a code block direction. Here it suppresses the messages that R produces when the packages are loaded, so that they do not appear when we render our notebook to a final report. This is a nice feature if you do not want unnecessary messages to clutter up your final report. I tend to build my analysis notebook without these directions, and then add them when I feel confident everything is working as it should and I’m ready to render my full report.

There are many useful code block options — you can learn about them here.

Do note that the code block directions must start at the very top of the code chunk, and they must be listed one directly below the other with no blank lines in between. Like nearly everything in R, these must be written precisely, and spacing matters.

Click run on the Load packages code chunk. Now, the packages are ready for you to use.
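As a rough sketch, the contents of a Load packages chunk with a code block direction might look like the following. The two packages shown are an assumption based on the functions used later in this activity (read_rds(), here(), and ggplot()); the chunk in your own american_dream.qmd file is the authoritative version.

```r
#| message: false

# Assumed packages (inferred from the functions used in this activity,
# not copied from the actual american_dream.qmd file).
library(tidyverse)  # includes ggplot2 and readr, which provides read_rds()
library(here)       # provides here() for building project-relative paths
```

Note that the #| line sits at the very top of the chunk contents, before any R code.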

Step 3

Next, we need to import the data file called mobility538.Rds, which is in the data folder of the foundations project.

The mobility538.Rds data frame has three variables:

| Variable | Description |
|---|---|
| cohort | Birth year |
| age30_absmob | The proportion of people who earned more than their parents at age 30 (after tax and adjusting for inflation). |
| age40_absmob | The proportion of people who earned more than their parents at age 40 (after tax and adjusting for inflation). |

mobility538 <- read_rds(here("data", "mobility538.Rds"))
mobility538

Find the code chunk labeled Import data. Click play to run this code chunk. Now, a data frame called mobility538 will appear in the upper right part of your screen, under the Environment tab. Click on it and take a look at the data frame. Click the × beside the data frame name to close it.
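If the import fails, it can help to look at the path that here() builds. The sketch below only constructs and inspects the path; it assumes the here package is installed, and the variable name data_path is introduced for illustration.

```r
library(here)

# Build the path the same way the Import data chunk does.
# data_path is a name chosen for this example.
data_path <- here("data", "mobility538.Rds")

# Print the full path, then check whether a file exists at that location.
data_path
file.exists(data_path)
```

If file.exists() returns FALSE, the project root or the file location is probably not what you expected.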

Step 4

Find the code chunk labeled: Take a glimpse of the data.

Use the glimpse() function to obtain some additional information about the data frame. Click run, and take a look at the output.

mobility538 |> glimpse()
Rows: 45
Columns: 3
$ cohort       <dbl> 1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 194…
$ age30_absmob <dbl> 0.906, 0.884, 0.895, 0.889, 0.900, 0.869, 0.861, 0.841, 0…
$ age40_absmob <dbl> 0.858, 0.789, 0.764, 0.727, 0.740, 0.722, 0.730, 0.748, 0…

Step 5

Now we’re ready to build the graph.

You have the tools to build the graph for this activity. Additionally, this ggplot2 cheatsheet is a great resource for learning about the options available.

Version 1

In the code chunk labeled Version 1, fill in the missing code (denoted as XXX) to create a line plot that maps cohort to the x-axis and age30_absmob (i.e., the age 30 mobility statistic) to the y-axis. We’ll mimic the colors from the FiveThirtyEight graph. Note that mobility for age 30 is colored red; we can get a very similar shade with the following hex code: #FC4F30. You can type in any hex code and see the color here. Alternatively, you could use one of R’s built-in color names (e.g., color = "red").

Now, take a look at the FiveThirtyEight graph, and duplicate the title, subtitle, and both the x- and y-axis labels.

Notice that the subtitle wraps onto a second line. You can achieve this with the newline character \n. Just put it where you want the text to break onto a new line.

For example, notice the \n before “same age”.

subtitle = "Average probability of earning more than your parents at the \nsame age, by year of birth",
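To see what \n does outside of a plot, the short sketch below prints the same subtitle text in the console. The variable names subtitle_text and subtitle_lines are introduced here for illustration only.

```r
subtitle_text <- "Average probability of earning more than your parents at the \nsame age, by year of birth"

# writeLines() prints the string as it will be displayed,
# so the text after \n starts on a new line.
writeLines(subtitle_text)

# Splitting on the newline shows the two display lines separately.
subtitle_lines <- strsplit(subtitle_text, "\n", fixed = TRUE)[[1]]
length(subtitle_lines)
```

Moving the \n to a different position in the string moves the line break accordingly.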

Press play to run the code chunk, and take a look at the graph.

Please go back to the code chunk you just finished, and change lwd = 1.5 to lwd = 3. Notice what happens. Select a line width that you like best.

Version 2

Copy and paste your completed code from Version 1 into the code chunk for Version 2.

Now, add a second geom_line() call that maps age40_absmob, rather than age30_absmob, to the y-axis. Color this line with the following hex code: #30A2DA.

This will add an additional line to your graph to represent age 40 mobility.

Press play to run the code chunk. You should now have 2 lines: one red (for age 30) and one blue (for age 40). Take a look at the graph. What are the similarities and differences compared to the FiveThirtyEight graph in the article?

Notice that running this code produces a warning that some rows were removed. This is expected: some cohorts are missing the age 40 mobility data because they aren’t yet old enough. If you’d like to keep this warning out of your report, you can add the following code block options:

#| warning: false
#| message: false

Be sure to list these right at the top of the code chunk (directly after the line with the three backticks and the curly braces with r).
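Before silencing a warning, it is worth confirming that you understand its cause. The sketch below counts missing values in a toy data frame; the values are made up for illustration and are not the real mobility538 data, and the names toy and n_missing are hypothetical.

```r
# Toy data frame with made-up values (NOT the real mobility538 data).
toy <- data.frame(
  cohort = c(1, 2, 3, 4),
  age40_absmob = c(0.8, 0.7, NA, NA)
)

# Count the rows where the age 40 value is missing.
n_missing <- sum(is.na(toy$age40_absmob))
n_missing
```

On the real data, the analogous call would be sum(is.na(mobility538$age40_absmob)), and the result can be compared with the number of rows reported in the warning.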

Version 3

The Version 2 graph looks similar to the graph in the FiveThirtyEight article, but we can add a few enhancements.

First, let’s label the lines so a reader knows what each line represents. Copy and paste the Version 2 code into the Version 3 code chunk.

Then, use the annotate() function to label the lines. Copy and paste the following code after the second geom_line() call.

annotate(geom = "text", x = 1952, y = .9, label = "at age 30", color = "#FC4F30", fontface = "bold") +
annotate(geom = "text", x = 1950, y = .55, label = "at age 40", color = "#30A2DA", fontface = "bold") +

Run the code.

Notice where the line labels are placed on the graph. Can you tell what elements of the annotate() code dictated where the labels land on the graph?

Version 4

As a final enhancement, please change the y-axis so that it’s similar to the FiveThirtyEight axis. Copy and paste the code from Version 3 into the code chunk for Version 4.

Now, we need to do two things. First, we need to make the y-axis range from 0 to 1 (the default is to present a range based on the observed data). Second, we need to translate the units from a proportion to a percentage.

The scale_y_continuous() function accomplishes these two tasks.

Please add the following line of code after the second annotate line:

scale_y_continuous(limits = c(0, 1), labels = scales::percent_format()) +

Press play to run the code.
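To see what the percent formatting does on its own, you can apply it to a few proportions in the console. This is a small sketch that assumes the scales package is installed (it is installed alongside ggplot2); the name to_percent is introduced here for illustration.

```r
library(scales)

# percent_format() returns a function that converts proportions to labels.
to_percent <- percent_format()

# Apply it to the kind of values that appear on the y-axis.
to_percent(c(0, 0.25, 0.5, 0.75, 1))
```

Inside scale_y_continuous(), ggplot2 calls this function on the axis breaks, which is why a proportion such as 0.5 is displayed as 50%.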

After completion of this final graph, study the graph that you’ve made, and write a few sentences to interpret the graph in your own words.

Step 6

Save the graph as a standalone file.

We can save the results of the graph using ggsave(). Below the ggplot command lines in the last code chunk, add the following:

ggsave("my_graph.png")

Then, click play on the code chunk. This will create a png file of your graph in the same folder as your american_dream.qmd file. Find it in the files tab of your RStudio session. You can click on it and it will open up.

There are many great ggsave() options — click here to explore.
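As one example of those options, the sketch below sets the size and resolution of the saved image. It builds a small plot from the first eight cohorts shown in the glimpse() output in Step 4 so that it can run on its own; the file name my_graph_example.png, the object names, and the width, height, and dpi values are illustrative choices, not requirements of the activity.

```r
library(ggplot2)

# First eight cohorts, copied from the glimpse() output in Step 4.
first_cohorts <- data.frame(
  cohort = 1940:1947,
  age30_absmob = c(0.906, 0.884, 0.895, 0.889, 0.900, 0.869, 0.861, 0.841)
)

# Store the plot in an object so it can be passed to ggsave() explicitly.
p <- ggplot(first_cohorts, aes(x = cohort, y = age30_absmob)) +
  geom_line(color = "#FC4F30", lwd = 1.5)

# width and height are in inches here; dpi controls the resolution.
ggsave("my_graph_example.png", plot = p,
       width = 7, height = 5, units = "in", dpi = 300)
```

When the plot argument is omitted, ggsave() saves the most recently displayed plot, which is what the single-argument call in this step relies on.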

If you’re stuck, here is the full code:

mobility538 |>
  ggplot(mapping = aes(x = cohort)) +
  # One line per age group, using the FiveThirtyEight colors
  geom_line(mapping = aes(y = age30_absmob), color = "#fc4f30", lwd = 1.5) +
  geom_line(mapping = aes(y = age40_absmob), color = "#30a2da", lwd = 1.5) +
  # Text labels placed at data coordinates (x = birth year, y = proportion)
  annotate(geom = "text", x = 1952, y = .9, label = "at age 30", color = "#fc4f30", fontface = "bold") +
  annotate(geom = "text", x = 1950, y = .55, label = "at age 40", color = "#30a2da", fontface = "bold") +
  # Show the full 0 to 1 range and format the axis as percentages
  scale_y_continuous(limits = c(0, 1), labels = scales::percent_format()) +
  labs(title = "The eroding of the American dream",
       subtitle = "Average probability of earning more than your parents at the \nsame age, by year of birth",
       x = "Year of birth",
       y = "Average probability")

# Save the most recent plot as a PNG file
ggsave("my_graph.png")

Once you run ggsave(), the image file will appear in the same folder as the .qmd file. Navigate to the files tab (bottom right section of your screen), and find the my_graph.png file that you created. It will be in the same folder as the american_dream.qmd file that you edited to create the graph (i.e., programs -> apply_and_practice_programs). Double click on the my_graph.png file to inspect it and make sure that everything looks the way you want it to look. For example, if a title is running off the graph, you can go back and break the offending title across lines with the \n newline character discussed in Step 5. Modify and recreate the saved graph if needed.

Step 7

Finalize and submit.

Now that you’ve completed all tasks, to help ensure reproducibility, click the down arrow beside the Run button toward the top of your screen then click Restart R and Clear Output. Scroll through your notebook and see that all of the output is now gone. Now, click the down arrow beside the Run button again, then click Restart R and Run All Chunks. Scroll through the file and make sure that everything ran as you would expect. You will find a red bar on the side of a code chunk if an error has occurred. Taking this step ensures that all code chunks are running from top to bottom, in the intended sequence, and producing output that will be reproduced the next time you work on this project.

Now that all code chunks are working as you’d like, click Render. This will create an .html output of your report. Scroll through to make sure everything is correct. The .html output file will be saved alongside the corresponding .qmd notebook file.

Follow the directions on Canvas for the Apply and Practice Assignment entitled “American Dream Apply and Practice Activity” to get credit for completing this assignment.