Fitting Non-Linear Relationships with lm()
In lab you’ve been exploring data from a paper which examines hidden biases in criminal sentencing.
Blair and colleagues hypothesized that sentence length should exhibit a non-linear, ramping-up pattern as the seriousness of the offense increases.
Translation: The effect of offense severity should accelerate as severity increases (i.e., a non-linear relationship).
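Key Variables: years (sentence length in years) and primlev (severity level of the primary offense; 1 is the lowest score)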
Problem: The distribution of years is highly right-skewed (i.e., positively skewed).
The relationship between primlev and years appears positive, but the linear best-fit line does not capture the pattern well.
Solution: Transform the outcome using the natural logarithm!
Create a new variable lnyears that is the natural log of years:
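A minimal sketch of this step in base R. The data frame name `sentences` and its values are hypothetical stand-ins, since the lab data are not shown in this section; only the variable names years and lnyears come from the text.

```r
# Hypothetical toy data; the real lab data frame and its name are not shown here.
sentences <- data.frame(years = c(0.5, 1, 2, 4, 10, 25))

# log() in R is the natural logarithm by default.
sentences$lnyears <- log(sentences$years)

# exp() undoes the transformation and recovers the original years.
sentences$years_back <- exp(sentences$lnyears)

print(sentences)
```

Note that log(0) is -Inf in R, so this transformation only makes sense when every sentence is longer than zero years.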
Still skewed, but much better!
This is much better! But Blair and colleagues hypothesized a quadratic curve.
The quadratic curve appears to fit better.
Let’s compare the linear fitted model to a quadratic fitted model:
Key observations:
The quadratic term (I(primlev^2)) is reliably different from zero.
R² increases from about 0.45 in the linear model to about 0.50 in the quadratic model.
Model: \(\hat{lnyears} = {b_0} + ({b_1} \times {primlev_i}) + ({b_2}\times {primlev_i^2})\)
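The comparison of the two models can be sketched with lm() as follows. The data here are SYNTHETIC, simulated from a quadratic curve so that the block runs on its own; the severity range of 1 to 10, the sample size, and the noise level are assumptions, and the output will not reproduce the numbers from the lab data.

```r
set.seed(1)  # reproducible synthetic data

# SYNTHETIC data, not the Blair et al. data: severity levels 1-10 (assumed range),
# with ln(years) generated from a quadratic curve plus random noise.
n <- 500
primlev <- sample(1:10, n, replace = TRUE)
lnyears <- 0.765 - 0.289 * primlev + 0.048 * primlev^2 + rnorm(n, sd = 0.8)
sim <- data.frame(primlev, lnyears)

# Linear model: one slope for severity.
fit_lin <- lm(lnyears ~ primlev, data = sim)

# Quadratic model: I() makes ^ mean arithmetic squaring inside the formula.
fit_quad <- lm(lnyears ~ primlev + I(primlev^2), data = sim)

# Coefficient table for the quadratic model (b0, b1, b2).
print(summary(fit_quad)$coefficients)

# R-squared for each model, then an F-test of the added quadratic term.
print(c(linear = summary(fit_lin)$r.squared,
        quadratic = summary(fit_quad)$r.squared))
print(anova(fit_lin, fit_quad))
```

Because the linear model is nested in the quadratic one, R² can never decrease when the squared term is added; the F-test from anova() asks whether the increase is larger than chance would produce.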
Important:
The predictor, primlev, doesn’t have a meaningful 0 (1 is the lowest score). We’ll ignore this here to keep things simple — but when you replicate the five models from the Blair paper later, we’ll center this and all other predictors.
Individual coefficients in polynomial models are best interpreted using marginal effects!
\[ \hat{lnyears} = {b_0} + ({b_1} \times {primlev_i}) + ({b_2}\times {primlev_i^2}) \]
\[ \hat{lnyears} = 0.765 + (-0.289 \times {primlev_i}) + (0.048\times {primlev_i^2}) \]
What is the predicted score when severity equals 6?
\[ \hat{lnyears} = 0.765 + (-0.289 \times 6) + (0.048\times {6^2}) \approx 0.75 \]
Note: This result uses the unrounded coefficients (-0.28919004 and 0.04784813); plugging in the rounded values shown here gives about 0.76.
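The hand computation can be checked in R. This is a small sketch that types in the coefficients reported in this section (the intercept as 0.765, the two slopes at the fuller precision used in the marginal effects calculation); the function name `predict_lnyears` is made up for illustration.

```r
# Coefficients as reported in the text (intercept is the rounded value).
b0 <- 0.765
b1 <- -0.28919004
b2 <- 0.04784813

# Hypothetical helper: predicted ln(years) at a given severity level.
predict_lnyears <- function(primlev) {
  b0 + b1 * primlev + b2 * primlev^2
}

pred6 <- predict_lnyears(6)
print(round(pred6, 2))
```

With a fitted model object you would normally call predict() rather than typing coefficients by hand; the manual version just makes the arithmetic visible.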
Let’s request predictions of ln(years) with 95% CIs at each level of severity:
These predictions are on the log scale — ln(years).
Notice the accelerating curve and widening confidence intervals at the extremes.
To interpret predictions in years — not ln(years) — we exponentiate:
Key point: We exponentiate the predictions AND both confidence bounds.
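One way to sketch the predict-then-exponentiate workflow is with base R's predict(). The data are SYNTHETIC (simulated so the block is self-contained), the severity range of 1 to 10 is an assumption, and the lab may use a different prediction function, so treat this as an illustration of the idea rather than the course's exact code.

```r
set.seed(1)

# SYNTHETIC stand-in for the lab data (assumed severity range 1-10).
n <- 500
primlev <- sample(1:10, n, replace = TRUE)
lnyears <- 0.765 - 0.289 * primlev + 0.048 * primlev^2 + rnorm(n, sd = 0.8)
sim <- data.frame(primlev, lnyears)
fit_quad <- lm(lnyears ~ primlev + I(primlev^2), data = sim)

# Predictions with 95% confidence intervals at each severity level.
grid <- data.frame(primlev = 1:10)
pred_ln <- predict(fit_quad, newdata = grid,
                   interval = "confidence", level = 0.95)
# pred_ln is a matrix with columns fit, lwr, upr, all on the ln(years) scale.

# Exponentiate the estimate AND both bounds to get back to years.
pred_years <- cbind(grid, exp(pred_ln))
print(pred_years)
```

Exponentiating the bounds is valid because exp() is strictly increasing: the lower bound stays below the estimate and the upper bound stays above it, although the interval is no longer symmetric around the estimate.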
Now we see predictions in meaningful units! The exponential transformation makes the curve even steeper.
Problem: In polynomial models, the effect of X on Y varies depending on the value of X.
Question: “What is the effect of a one-unit increase in offense severity?”
Answer: “It depends on the current level of severity!”
Solution: Calculate marginal effects (slopes) at specific values of the predictor.
We’ll calculate slopes at three levels of offense severity: 3, 6, and 9
The slope at any value of primlev is the first derivative of the fitted equation with respect to primlev:
\[ {b_1} + (2\times{b_2}\times{primlev_i}) \]
\[ {-0.28919004} + (2\times{0.04784813}\times{primlev_i}) \]
Hand computation when severity equals 6:
\[ {-0.28919004} + (2\times{0.04784813}\times{6}) \approx 0.285 \]
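The same hand computation at all three severity levels can be sketched in R. The coefficients are the ones printed above; the helper name `slope_at` is hypothetical, and this manual version only gives point estimates, not the standard errors or confidence intervals that a dedicated marginal effects function would report.

```r
# Coefficients from the fitted quadratic model, as printed in the text.
b1 <- -0.28919004
b2 <- 0.04784813

# Hypothetical helper: derivative of b0 + b1*x + b2*x^2 with respect to x.
slope_at <- function(primlev) {
  b1 + 2 * b2 * primlev
}

slopes <- data.frame(primlev = c(3, 6, 9))
slopes$estimate <- slope_at(slopes$primlev)
print(round(slopes, 3))
```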
Interpretation: These are slopes on the ln(years) scale. Notice how the slope increases as primlev increases!
The estimate column shows the instantaneous slope at each value of primlev.
At primlev = 3, slope ≈ -0.002: a 1-unit increase in severity doesn’t substantially change ln(years)
At primlev = 6, slope ≈ 0.285: a 1-unit increase in severity increases ln(years) by 0.285 units
At primlev = 9, slope ≈ 0.572: a 1-unit increase in severity increases ln(years) by 0.572 units
Pattern: The effect of offense severity accelerates as severity increases (supporting the hypothesis!).
If we want to interpret effects on the original metric (years), we can convert the slopes on the ln(years) scale to percent changes using the formula: 100 × (exp(β) - 1), where β is the slope at a given severity level. This gives us the percent change in sentence length (in years) for a one-unit increase in severity. This formula is implemented in the function below.
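A minimal version of such a function might look like this. The name `pct_change` is an assumption (the course's own function is not shown in this section), and the slopes are recomputed by hand from the coefficients printed earlier.

```r
# Hypothetical helper: convert a slope on the ln(years) scale to a percent change.
pct_change <- function(slope) {
  100 * (exp(slope) - 1)
}

# Slopes at severity 3, 6, and 9, from the coefficients reported in the text.
b1 <- -0.28919004
b2 <- 0.04784813
primlev <- c(3, 6, 9)
slope <- b1 + 2 * b2 * primlev

result <- data.frame(primlev,
                     slope = round(slope, 3),
                     pct = round(pct_change(slope), 1))
print(result)
```

Because the conversion is strictly increasing, the same function can be applied to the lower and upper confidence bounds of each slope to get an interval on the percent scale.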
When the primary offense is at severity level 6, a 1-unit increase in severity equates to a ~33.0% increase in sentence length in years.
Now we have percent changes at each severity level!
At primlev = 3: A 1-unit increase in severity → ~0.2% decrease in sentence length in years (though the CI includes 0)
At primlev = 6: A 1-unit increase in severity → ~33.0% increase in sentence length in years
At primlev = 9: A 1-unit increase in severity → ~77.2% increase in sentence length in years
Key finding: The marginal effect of offense severity becomes much larger as we move to more severe offenses! This supports Blair et al.’s hypothesis of an accelerating relationship.
Percent change tells us the relative effect: “Y increases by 20%”
Slope magnitude tells us the absolute effect: “Y increases by 2 years”
Formula: slope magnitude = predicted value × (percent change / 100)
Example: At severity = 6, if:
Predicted sentence = 2.1 years
Percent change = 33%
Then slope magnitude = 2.1 × 0.33 ≈ 0.7 years
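The example can be reproduced in R from the coefficients reported earlier. This is a sketch of the calculation described above, not the course's own code; the variable names and the tangent-line helper `tangent_years` are made up for illustration.

```r
# Coefficients as reported in the text (intercept is the rounded value).
b0 <- 0.765
b1 <- -0.28919004
b2 <- 0.04784813
x <- 6  # severity level of interest

# Predicted sentence in years: exponentiate the ln(years) prediction.
pred_years <- exp(b0 + b1 * x + b2 * x^2)

# Percent change for a one-unit increase, from the slope on the ln scale.
pct <- 100 * (exp(b1 + 2 * b2 * x) - 1)

# Slope magnitude in years = predicted value * (percent change / 100).
slope_years <- pred_years * pct / 100

print(round(c(pred_years = pred_years, pct = pct, slope_years = slope_years), 2))

# Hypothetical helper: the tangent line at x on the years scale.
tangent_years <- function(new_x) pred_years + slope_years * (new_x - x)
print(round(tangent_years(c(5, 6, 7)), 2))
```

The tangent line passes through the predicted value at severity 6 and rises by the slope magnitude for each additional unit of severity.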
We need slope magnitude to draw tangent lines on the years scale!
When primary offense severity = 6, a one-unit increase in severity (e.g., going from a 6 to a 7) is expected to increase sentence length by about 0.7 years.
The steepening tangent lines show the accelerating effect in absolute terms (years).