PSY 652: Research Methods in Psychology I

Polynomial Regression

Kimberly L. Henry: kim.henry@colostate.edu

A primer on polynomials

The basic regression model assumes that the relationship between x and y is linear. However, in some cases the effect of a given predictor may differ by levels of that very predictor.

That is, the “effect of x” differs as x increases.

A curvilinear relationship

Consider the relationship between age and health care expenditures.

Polynomial regression

Polynomial regression can model the relationship between x and y if it’s not merely linear by incorporating higher order terms of x.

This accounts for the shifting influence of x on y at various x values.

\(y\) regressed on \(x\) and \(x^2\) accommodates a single curve bend (quadratic model).
\(y\) regressed on \(x\), \(x^2\), and \(x^3\) accommodates two bends (cubic model).
Each subsequent higher order term allows for an additional bend.
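
Written out in general form, a polynomial model adds one power of x per term. The notation follows the b coefficients used later in these notes; the symbol k (the order of the polynomial) is introduced here only for illustration.

\[ \hat{y_i} = {b_0} + ({b_1}\times{x_i}) + ({b_2}\times{x^2_i}) + \dots + ({b_k}\times{x^k_i}) \]

With k = 1 this is the straight-line model, k = 2 gives the quadratic model, and k = 3 gives the cubic model. A polynomial of order k can have at most k - 1 bends.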

A simple example

40 participants are randomly assigned to varying levels of minutes spent practicing for a visual discrimination test (0, 2, 4, 6, 8, 10, 12, or 14 minutes). Subsequently, a test on visual discrimination is conducted, and each participant’s score on the test is recorded.

The experiment revolves around two variables:

practice: the experimentally assigned duration of practice
score: the score on the test.

Press Run Code to import the data and take a look at the scores:

Is there a linear relationship?

Press Run Code to examine the linear relationship between the two variables.

Is there a quadratic relationship?

Press Run Code to consider if a curvilinear relationship fits the data better.

Polynomials to the rescue

What is the approach?

Test a series of models, each progressively incorporating an additional polynomial term (e.g., \(x\), \(x^2\), \(x^3\), etc.).
We strive to adopt the most parsimonious model, so we begin with the simplest model and continue testing higher-order models until we find that the latest added term no longer noticeably enhances the model’s fit to the data.
At this point, we opt for the previous model, where the highest-order term significantly influenced the model’s ability to capture the curvature in the relationship.
Crucially, all lower-order terms are retained in the model if the highest-order term is deemed necessary, regardless of whether the estimates of the lower terms have values close to zero.
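
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the course's own code chunk. The data are synthetic: they are generated from the quadratic equation reported later in these notes plus fixed offsets, so they are NOT the study data. The helper names fit_polynomial and r_squared are hypothetical.

```python
def fit_polynomial(x, y, degree):
    """Least-squares fit of y on x, x^2, ..., x^degree via the normal equations."""
    k = degree + 1
    # Design matrix with columns 1, x, x^2, ..., x^degree
    X = [[xi ** p for p in range(k)] for xi in x]
    # Augmented normal equations: [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution gives the coefficients b_0, b_1, ..., b_degree
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b


def r_squared(x, y, b):
    """Proportion of variability in y accounted for by the fitted polynomial."""
    y_hat = [sum(bj * xi ** j for j, bj in enumerate(b)) for xi in x]
    mean_y = sum(y) / len(y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


# Synthetic data (assumption, for illustration only): 8 practice levels x 5 people
levels = [0, 2, 4, 6, 8, 10, 12, 14]
offsets = [-2.0, -1.0, 0.0, 1.0, 2.0]
x = [lv for lv in levels for _ in offsets]
y = [1.703 + 3.517 * lv - 0.142 * lv ** 2 + off for lv in levels for off in offsets]

# Fit the linear, quadratic, and cubic models in turn and compare their fit
fits = {d: fit_polynomial(x, y, d) for d in (1, 2, 3)}
r2 = {d: r_squared(x, y, b) for d, b in fits.items()}
for d in (1, 2, 3):
    print(f"degree {d}: R^2 = {r2[d]:.3f}")
```

In this synthetic example the fit improves from the linear to the quadratic model and then stops improving when the cubic term is added, which is the pattern that would lead us to keep the quadratic model.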

Create the polynomial terms

First, we need to create the polynomial terms. We’ll consider up to two bends (i.e., up to a cubic model).

Press Run Code on the code chunk below to produce a squared term and cubic term for minutes spent practicing.
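
As a rough sketch of what creating these terms involves (written in Python for illustration, not the course's own code chunk), the squared and cubed terms are simply powers of practice. The practice levels come from the study description; the column names practice2 and practice3 match the names used in the model output discussed below.

```python
# Practice levels from the study design (minutes)
practice_levels = [0, 2, 4, 6, 8, 10, 12, 14]

# One row per level; practice2 and practice3 are the squared and cubed terms
terms = [
    {"practice": p, "practice2": p ** 2, "practice3": p ** 3}
    for p in practice_levels
]

for row in terms:
    print(row)
```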

Fit the linear model

Press Run Code on the code chunk to fit a linear model.

The intercept is the predicted score for someone who receives no practice.
The slope is the predicted change in the score for a one unit increase in minutes spent practicing.
It is meaningfully different from zero, indicating a positive linear trend — each additional minute spent practicing is associated with a 1.5 unit increase in performance.

Linear model fit

Press Run Code on the code chunk to assess the \(R^2\) and sigma for the linear model.

The \(R^2\) is .86 – indicating that about 86% of the variability in the performance score is predicted by this linear trend.
Sigma, which represents the standard deviation of the residuals, is 2.95. The standard deviation of score is 7.67 – indicating a substantial reduction in variability once the linear trend is accounted for.
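
The two bullets above are connected: the \(R^2\) can be approximately recovered from sigma and the standard deviation of score. The sketch below is a quick check, assuming sigma was computed with n - 2 residual degrees of freedom (the usual convention for a model with one predictor). The variable names are chosen here for illustration.

```python
n = 40            # participants, from the study description
sigma = 2.95      # residual standard deviation of the linear model
sd_score = 7.67   # standard deviation of score
n_predictors = 1  # the linear model has one predictor (practice)

# Residual sum of squares: sigma^2 times the residual degrees of freedom
# (assumption: sigma uses n - n_predictors - 1 degrees of freedom)
sse = sigma ** 2 * (n - n_predictors - 1)

# Total sum of squares: variance of score times (n - 1)
sst = sd_score ** 2 * (n - 1)

r2_linear = 1 - sse / sst
print(round(r2_linear, 2))  # 0.86
```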

Fit the quadratic model

Press Run Code on the code chunk below to examine whether a quadratic model better fits the data. That is: Would a curvilinear model with a single bend better fit the data?

Notice that the quadratic term (practice2) is meaningfully different from zero (-.14 with a 95% CI that doesn’t include 0) and the \(R^2\) increases quite a bit (from .86 in the linear model to .97 in the quadratic model).
In addition, sigma is further reduced.

Fit the cubic model

Press Run Code on the code chunk below to examine whether a cubic model better fits the data. That is: Do we need a second bend to describe the relationship?

The cubic term (practice3) is very small (-.003).
The \(R^2\) and sigma barely budged as compared to the quadratic model.

The quadratic model is our winner!

Interpretation of quadratic model

\[ \hat{y_i} = {b_0} + ({b_1}\times{x_i}) + ({b_2}\times{x^2_i}) \]

\[ \hat{y_i} = 1.703 + (3.517\times{x_i}) + (-.142\times{x^2_i}) \]

Where \({x_i}\) refers to practice and \({x^2_i}\) refers to practice2.

The intercept

The estimate for the intercept is the predicted test score for people who practice 0 minutes (i.e., practice = 0). That is, if someone doesn’t practice, we predict they will score about 1.7 on the test.
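
To make this concrete, here is a small Python sketch that turns the fitted equation into a prediction function. The coefficients are the rounded estimates reported above; the function name predict_score is hypothetical.

```python
# Rounded estimates from the fitted quadratic model
B0, B1, B2 = 1.703, 3.517, -0.142


def predict_score(practice):
    """Predicted test score for a given number of minutes of practice."""
    return B0 + B1 * practice + B2 * practice ** 2


# With no practice, both practice terms drop out and only the intercept remains
print(predict_score(0))  # 1.703
```

The same function can be called at any of the practice levels in the study (0 to 14 minutes) to trace out the fitted curve.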

The shape of the curve

The graph of a quadratic regression model is a parabola.
The parabola can be mound shaped (i.e., an inverted U, also called concave) or bowl shaped (i.e., a U, also called convex). Our example is mound shaped.
The sign of the squared term (i.e., \({b_2}\) in the equation or the estimate for practice2, which is -.14 in our example) indicates whether the shape is a mound or bowl.
- If the squared term is positive, then the parabola is bowl-shaped (U-shaped).
- If it’s negative, the parabola is mound-shaped (inverted U-shaped).

Changing slopes

\[ \hat{y_i} = {b_0} + ({b_1}\times{x_i}) + ({b_2}\times{x^2_i}) \]

There is not one slope that relates practice to score – rather there are many slopes depending on the level of x. The slope of a line drawn tangent to the parabola at a certain x is estimated by:

\[ {b_1} + (2\times{b_2}\times{x}) \]

\[ 3.517 + (2\times-.142\times{x}) \]
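
For readers who want to see where this expression comes from, it is the first derivative of the quadratic equation with respect to x, obtained with the power rule:

\[ \frac{d\hat{y}}{dx} = \frac{d}{dx}\left[{b_0} + ({b_1}\times{x}) + ({b_2}\times{x^2})\right] = {b_1} + (2\times{b_2}\times{x}) \]

The intercept drops out because it does not change with x, which is why only \({b_1}\) and \({b_2}\) appear in the slope.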

Slope when practice = 0

\[ {b_1} + (2\times{b_2}\times{x}) \]

\[ 3.517 + (2\times-.142\times{x}) \]

\[ 3.517 + (2\times-.142\times0) = 3.52 \]

Changing slopes

When practice = 0 the slope is: \(3.517 + (2\times-.142\times0) = 3.52\)
When practice = 8 the slope is: \(3.517 + (2\times-.142\times8) = 1.25\)
When practice = 14 the slope is: \(3.517 + (2\times-.142\times14) = -.46\)
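
The same three slopes can be reproduced with a short Python sketch. It uses the rounded estimates from the fitted equation; the function name slope_at is hypothetical.

```python
# Rounded estimates for the linear and squared terms
B1, B2 = 3.517, -0.142


def slope_at(practice):
    """Slope of the tangent line to the fitted parabola at a given practice time."""
    return B1 + 2 * B2 * practice


# Slopes at the start, middle, and end of the practice range
for minutes in (0, 8, 14):
    print(minutes, round(slope_at(minutes), 3))
```

The slope shrinks as practice increases and is negative by 14 minutes, which matches the mound shape of the curve.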

A visualization of changing slopes

The vertex

There is a value of x along the curve where the slope of the line drawn tangent to the curve is 0.

In other words, this is the point at which y-hat takes a maximum value if the parabola is a mound or a minimum value if the parabola is a bowl.

This point is called the vertex and the x-coordinate of the vertex can be estimated with the following formula:

\[ {-b_1} \div (2\times{b_2}) \]
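
This formula follows from setting the tangent slope from the Changing slopes section equal to zero and solving for x:

\[ {b_1} + (2\times{b_2}\times{x}) = 0 \quad\Rightarrow\quad x = {-b_1} \div (2\times{b_2}) \]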

Our vertex

\[ {-b_1} \div (2\times{b_2}) \]

For our example, the vertex is: \(-3.517 \div (2\times{-.142}) \approx 12.4\)

In our case, the practice time at which the predicted score is maximized is about 12.4 minutes. This is the point where the effect of practicing goes from positive to negative.
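
A short Python sketch of the vertex calculation follows. It assumes the rounded estimates from the fitted equation, so results may differ slightly from values computed with unrounded estimates. The variable names are chosen here for illustration.

```python
# Rounded estimates from the fitted quadratic model
B0, B1, B2 = 1.703, 3.517, -0.142

# The sign of the squared term determines the shape of the parabola
shape = "mound" if B2 < 0 else "bowl"

# x-coordinate of the vertex: where the tangent slope equals zero
vertex_x = -B1 / (2 * B2)

# Predicted score at the vertex (the maximum, because the parabola is a mound)
vertex_y = B0 + B1 * vertex_x + B2 * vertex_x ** 2

print(shape, round(vertex_x, 1), round(vertex_y, 1))  # mound 12.4 23.5
```

Because 12.4 minutes falls inside the range of practice times studied (0 to 14 minutes), the turning point is within the observed data and not an extrapolation.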

A visualization of the vertex

Using marginaleffects to solve for the slopes

Press Run Code on the code chunk below to see how the marginaleffects package can calculate the changing slopes for you.
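
The marginaleffects code chunk itself is not reproduced here. As a rough illustration of the general idea of estimating a slope directly from model predictions, the Python sketch below nudges practice slightly up and down and compares the predicted scores (a central finite difference). This is an assumption-laden sketch of the concept, not the package's implementation; the function names are hypothetical.

```python
# Rounded estimates from the fitted quadratic model
B0, B1, B2 = 1.703, 3.517, -0.142


def predict_score(practice):
    """Predicted test score from the fitted quadratic equation."""
    return B0 + B1 * practice + B2 * practice ** 2


def numerical_slope(f, x, h=1e-4):
    """Central finite-difference approximation to the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Compare the formula-based slope with the numerical approximation
for minutes in (0, 8, 14):
    analytic = B1 + 2 * B2 * minutes
    approx = numerical_slope(predict_score, minutes)
    print(minutes, round(analytic, 3), round(approx, 3))
```

The two columns agree, which shows that the slope formula and a prediction-based calculation describe the same quantity.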