A webR tutorial

The Comeback Kids — Part III

A return to the exam data

With the concept of regression to the mean in your back pocket, let’s take a look at how the exam data from the first Comeback Kids webR exercise in this series were simulated. Press Run Code to create the data frame.

The code tells the story of the data generating process. Each student is assumed to have a true ability drawn from a normal distribution with a mean of 50 and a standard deviation of 10. This true ability reflects the underlying skill or knowledge of the student, which remains constant across both the midterm and final exams.

  1. Midterm Exam Scores:
    Each student’s score on the midterm exam is the sum of two components: their true ability and a random component, which has a mean of 0 and a standard deviation of 10. This random component reflects the inherent unpredictability in performance, meaning that the midterm exam is not a perfect measuring instrument—it captures the student’s ability with some noise or error.

  2. Final Exam Scores:
    Similarly, each student’s score on the final exam is the sum of their true ability and an independent random component with a mean of 0 and a standard deviation of 10. This independent noise reflects that, just like the midterm, the final exam is also an imperfect measure of the student’s true ability.

Since the random components in the midterm and final exams are independent, it is possible for a student to perform better or worse than their true ability on either exam, but the variability (noise) is random and uncorrelated across the two exams.

This setup illustrates the principle that even with stable true ability, observed scores fluctuate due to random noise, making each exam an imperfect measure of the underlying ability.
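As a point of reference, here is a minimal sketch of that data generating process in R. The number of students (here 1000), the random seed, and the data frame name exams are assumptions made for illustration; the actual exercise code may differ in those details.

    set.seed(2024)                                   # assumed seed, for reproducibility
    n <- 1000                                        # assumed number of students

    true_ability <- rnorm(n, mean = 50, sd = 10)     # stable underlying skill
    midterm <- true_ability + rnorm(n, 0, 10)        # ability + independent noise
    final   <- true_ability + rnorm(n, 0, 10)        # ability + independent noise

    exams <- data.frame(midterm = midterm, final = final)
    head(exams)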

Importantly, there is no specific provision in the data generating process for the poorest-performing students to get a treatment effect (i.e., benefit) from the extra tutoring. What we observed in terms of their “improvement” may simply be an artifact of regression to the mean.

What does this mean?

  • No specific provision: This implies that the improvement wasn’t due to any targeted intervention for the poorest students.

  • Regression to the mean: When students perform poorly (e.g., by chance or random fluctuations), their subsequent performance often appears to improve simply because extreme scores tend to move closer to the average on subsequent measurements, without any real change in ability or circumstances.

Fit the model

To explore a bit further, let’s fit the regression model to these data — regressing final exam score on midterm score.
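A sketch of that fit, assuming the simulated data frame is called exams with columns midterm and final as in the snippet above:

    fit <- lm(final ~ midterm, data = exams)   # regress final score on midterm score
    coef(fit)                                   # the slope on midterm is the quantity of interest
    # Under this data generating process the theoretical slope is
    # Var(ability) / (Var(ability) + Var(noise)) = 100 / 200 = 0.5,
    # so an estimate near 0.5 is what we would expect.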

The estimated slope between midterm and final exam scores is 0.463, which is less than 1. This illustrates the concept of regression to the mean. In simple terms, students who scored high on the midterm tend to score closer to the average on the final—they don’t perform quite as exceptionally the second time. Similarly, students who did poorly on the midterm usually improve on the final, though they may still be below average.

Regression to the mean

It’s tempting to come up with explanations for this pattern. One might think that high-scoring students became overconfident and studied less for the final, leading to lower scores. Alternatively, perhaps low-scoring students were motivated to work harder or benefited from extra tutoring, boosting their final exam performance. However, in this case, the data were simulated from a model without any of these factors. Both the midterm and final scores were based solely on each student’s true ability plus some random variation.

The phenomenon of regression to the mean occurs because of natural fluctuations between the two exams. A student who does exceptionally well on the midterm may have had a combination of skill and a bit of luck. On the final, they might not have the same favorable conditions, so their score moves closer to the average. The key takeaway is that without recognizing regression to the mean, we might mistakenly attribute these changes to specific causes, when they’re actually just statistical artifacts — sometimes referred to as the “regression fallacy.”

We can also take a look at the graph to further understand regression to the mean for this example:
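One way to draw such a graph in base R (a sketch only; the exams data frame and the fit object from the earlier snippets are assumed):

    plot(exams$midterm, exams$final,
         xlab = "Midterm score", ylab = "Final score",
         main = "Regression to the mean in the simulated exam scores")
    abline(fit, col = "blue", lwd = 2)     # fitted regression line (slope < 1)
    abline(a = 0, b = 1, lty = 2)          # reference line with slope 1 for comparison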

Important

Take a few moments to describe the phenomenon of regression to the mean in your own words. Can you think of examples in your field where a scientist might be inadvertently fooled by regression to the mean?

Credits

This example comes from the wonderful book Regression and Other Stories by Drs. Gelman, Hill, and Vehtari.