A webR tutorial

Factorial Experiments

The scenario

In this activity, we will work with data from this study conducted by Drs. Jake Hofman, Daniel Goldstein & Jessica Hullman. In Module 16, you studied data from their Experiment 1. Here, we’ll use data from Experiment 2.

Participants imagined competing in a boulder-sliding game against an equally skilled opponent named Blorg. They could rent a “special boulder” — a possible upgrade. Participants viewed one of four uncertainty visualizations about the special boulder and then reported their perceived probability of winning with it (our outcome, superiority_special, which ranges from 0 to 1).

Additionally, participants were randomly assigned to see either a small effect (probability of winning with the special boulder = 0.57, where 0.5 indicates even odds) or a large effect (probability of winning with the special boulder = 0.76).

In this activity, we’ll consider the joint effect of graph type and effect size.

Import the data
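
To follow along, load the data into R. Here is a minimal sketch, assuming the data live in a CSV file; the file name and data frame name (boulder) are placeholders, not the study’s actual source:

```r
# Read the experiment data (hypothetical file name -- substitute the
# actual source from the activity)
boulder <- read.csv("experiment2.csv")

# Quick checks: variable types and the cells of the factorial design
str(boulder)
table(boulder$graph_type, boulder$effect_size)
```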

Variables

  • graph_type – CI, PI, CI rescaled, or HOPS (randomly assigned)
  • effect_size – “Small” (true win prob ≈ 0.57) or “Large” (true win prob ≈ 0.76)
  • superiority_special – participant’s perceived probability the special boulder would help them win (ranges from 0–1)

Research questions

  1. Do perceived probabilities that the boulder will help the participant win increase when the effect size is large?
  2. Do perceptions differ by visualization type?
  3. Does the difference between visualization types depend on effect size (an interaction)?

Hypotheses

  • H1: Larger effect sizes (vs. smaller ones) will generally lead to higher perceived probabilities of winning, as the expected advantage appears greater.

  • H2: Different visualization types will influence perceived probabilities, as each conveys uncertainty in a unique way. Specifically, the CI graph is expected to lead to greater misconceptions of the true superiority of the special boulder compared to the PI, CI rescaled, and HOPS graphs.

  • H3: There may be an interaction between visualization type and effect size; a certain effect size might amplify or dampen the perceived impact of graph type on the perceived superiority of the special boulder.

Fit the two-way factorial model

In this model, we’re testing whether the relationship between graph type and participants’ perceived superiority of the special boulder depends on the effect size shown in the visualization. In other words, we’re asking whether one variable (effect size) modifies the effect of the other variable (graph type) on the outcome — this is what we call effect modification or an interaction. By including the graph_type*effect_size term, we allow the slopes (or differences) associated with one factor to vary across the levels of the other.
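
A sketch of the corresponding R call, assuming the data frame is named boulder as above:

```r
# Two-way factorial model; graph_type * effect_size expands to
# graph_type + effect_size + graph_type:effect_size
fit <- lm(superiority_special ~ graph_type * effect_size, data = boulder)
summary(fit)
```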

Interpretation notes:

  • The intercept is the mean for the CI graph + small effect size combination (the baseline group).
  • The main-effect coefficients show differences from that baseline when the other factor is held at its reference level.
  • The interaction terms show how those differences change when the effect size changes (worked through in the sketch below).
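
To see how these pieces combine, here is a small worked example that reconstructs one predicted cell mean from the coefficients (values taken from the model output interpreted below):

```r
# Reconstruct the predicted mean for the HOPS + large effect size cell
b0     <- 0.859   # (Intercept): CI graph, small effect size
b_hops <- -0.205  # graph_typeHOPS
b_lrg  <- 0.028   # effect_sizeLarge effect size
b_int  <- 0.064   # graph_typeHOPS:effect_sizeLarge effect size

b0 + b_hops + b_lrg + b_int  # predicted mean = 0.746
```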

Summary insight:
When uncertainty is high (small effect size), visualization type strongly shapes perception — participants are more skeptical when viewing PI or HOPS graphs. When uncertainty is low (large effect size), the visual format matters less — the clear evidence of an effect overrides how uncertainty is displayed. This pattern reflects effect modification: the influence of graph type depends on the underlying effect size.

Detailed interpretations
  • (Intercept): The intercept of 0.859 represents the average perceived probability that the special boulder would help participants win (i.e., perceived superiority) for those who viewed the CI graph under the small effect size condition — the reference levels for both factors. This serves as the baseline group.

  • graph_typePI: Among participants who saw the small effect size, those who viewed the PI graph had an average perceived winning probability that was 0.199 lower than those who viewed the CI graph. Thus, with small effects, the PI visualization reduced participants’ confidence in the boulder’s advantage.

  • graph_typeCI rescaled: Under the small effect size, those who viewed the CI rescaled graph rated their perceived winning probability 0.036 lower than the CI group. This small but significant difference suggests that rescaling slightly reduced perceived certainty compared to the unscaled CI.

  • graph_typeHOPS: Under the small effect size, those who viewed the HOPS graph rated their perceived probability of winning 0.205 lower than those who viewed the CI graph — nearly identical to the PI effect. Both PI and HOPS visualizations therefore made participants less confident in the boulder’s advantage when uncertainty was high.

  • effect_sizeLarge effect size: Among participants who viewed the CI graph, those who saw a large effect size reported an average perceived probability of winning that was 0.028 higher than those who saw a small effect size. Viewing a stronger underlying effect modestly boosted confidence in the boulder’s advantage.

  • graph_typePI × effect_sizeLarge effect size: The positive coefficient (0.045) indicates that the negative effect of the PI graph (vs. CI) was weaker when the effect size was large. The CI–PI difference decreases from –0.199 (small effect size) to –0.154 (large effect size). Thus, the influence of graph type on perception diminishes as the effect size becomes more obvious — a moderation pattern consistent with reduced visual impact when the signal is strong.

  • graph_typeCI rescaled × effect_sizeLarge effect size: The positive coefficient (0.032) means that the small negative effect of the CI rescaled graph nearly disappears under the large effect size. The difference between CI rescaled and CI changes from –0.036 (small) to –0.004 (large), a non-significant difference, suggesting that at high effect sizes, the rescaling choice has little perceptual impact.

  • graph_typeHOPS × effect_sizeLarge effect size: The positive coefficient (0.064) shows that the negative effect of the HOPS graph (vs. CI) is also reduced when the effect size is large. The CI–HOPS difference shrinks from –0.205 to –0.141. In other words, when the effect size is large, participants’ perceptions across graph types become more similar, though HOPS still yields slightly lower perceived probabilities than CI.

  • Model fit: The model’s R² = 0.30, meaning that roughly 30% of the variability in participants’ perceived superiority ratings is explained by the visualization type, effect size, and their interaction.

After fitting our factorial model, we’ll use the marginaleffects package to interpret and visualize the effects. Instead of focusing only on raw coefficients, which can be difficult to interpret when interactions are present, marginaleffects calculates estimated differences, predicted means, and average effects in a way that directly reflects the model’s structure. This makes it easier to see how the two factors (graph type and effect size) combine to influence the outcome.

Simple effects

We’ll start by examining simple effects, which show the effect of one factor at each level of the other factor. For example, we’ll look at how perceived superiority changes when moving from a small to a large effect size within each graph type, and how different graph types compare within each effect size condition. This approach helps us unpack the interaction (effect modification) we included in the model — that is, how the relationship between graph type and perceived superiority depends on the effect size shown. By isolating these simple effects, we can see exactly where the interaction occurs and how strong it is across the different visualization conditions.

Effect of Effect Size within each Graph Type
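
One way to compute these simple effects with the marginaleffects package (a sketch; fit is the model fitted above):

```r
library(marginaleffects)

# Effect of moving from a small to a large effect size, estimated
# separately within each graph type
avg_comparisons(fit, variables = "effect_size", by = "graph_type")
```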

These results show how participants’ perceived probability of winning with the special boulder changed when the true effect size increased from small to large, within each graph type. For the CI graph, perceptions rose modestly by 0.03 points, suggesting that when participants saw a large effect size, they were only slightly more confident that the special boulder would help them win.

However, for the PI, CI rescaled, and HOPS graphs, the increase in perceived superiority was notably larger — approximately 0.07, 0.06, and 0.09, respectively — and statistically significant (p < .001). This pattern indicates that participants’ perceptions were more sensitive to changes in the displayed effect size when viewing these uncertainty visualizations.

In other words, the type of graph moderated the effect of effect size: the visualizations that better represented uncertainty (like PI and HOPS) led participants to adjust their judgments more strongly when the true effect size increased, whereas the CI graph produced relatively stable perceptions regardless of the underlying effect size.

Effect of Graph Type within each Effect Size
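
These comparisons can be obtained the same way, swapping the roles of the two factors (a sketch):

```r
# Differences between each graph type and the reference (CI),
# estimated separately within each effect size condition
avg_comparisons(fit, variables = "graph_type", by = "effect_size")
```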

These results compare each graph type to the reference graph type (CI) within the small and large effect size conditions.

When the effect size was small, participants who viewed the PI or HOPS graphs reported substantially lower perceived winning probabilities (−0.20 and −0.21, respectively) than those who saw the CI graph, both highly significant (p < .001). The CI rescaled graph also produced a small but significant decrease (−0.04, p = .004). This pattern suggests that when the effect size was modest, visualizations emphasizing uncertainty (PI, HOPS) led participants to be less confident in the special boulder’s advantage compared to the CI graph, which tends to underrepresent uncertainty.

When the effect size was large, the gap between graph types narrowed. The PI and HOPS graphs still produced lower perceived winning probabilities than the CI graph (−0.15 and −0.14, respectively), but these differences were smaller than under the small effect size condition. The CI rescaled graph no longer differed significantly from the CI graph.

Overall, this pattern shows a dampening interaction: when the effect size is large (i.e., when the advantage is clearly visible), graph type matters less. When the effect size is small and uncertainty is greater, the choice of visualization strongly influences how confident participants feel about the boulder’s superiority — a hallmark of effect modification.

Marginal (averaged) effects

Marginal, or averaged, effects summarize how one variable influences the outcome on average across the levels of the other variable in the model. Instead of examining separate simple effects within each condition, marginal effects provide a single, overall estimate of how much the outcome changes when one variable changes — while statistically averaging over the distribution of the other variable(s).

In this example, we’ll use marginal (averaged) effects to answer broader questions like:

  • On average, how much higher are perceived winning probabilities when the effect size is large rather than small — regardless of which graph type participants saw?

  • On average, how does each visualization type differ from the CI graph — regardless of the effect size displayed?

This approach complements the simple effects analysis: while simple effects help us see where and how strongly the interaction occurs, marginal effects help us understand the overall or main trends across all conditions.

Averaged effect of Effect Size
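
A sketch of the call; omitting the by argument averages the comparison over the other factor:

```r
# Average effect of effect size, marginalized over graph types
avg_comparisons(fit, variables = "effect_size")
```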

This effect means that, on average, viewing a large effect size (compared to a small effect size) increases the perceived superiority rating by approximately 0.063, averaged over all graph types.

  • Why Compute Marginal Effects?

    Marginal effects provide a summary measure of the effect of an independent variable, averaging over the levels of other variables. This is particularly useful when interactions are present in the model, as it allows us to understand the general effect of a variable across different conditions.

  • Assessing Hypotheses:

    By computing the marginal effect of effect_size, we directly test H1. If the estimate is positive and significant, it indicates that larger effect sizes generally lead to higher perceived probabilities of winning, regardless of the graph type.

Averaged effect of Graph Type
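
And likewise for graph type (a sketch):

```r
# Average difference between each graph type and the CI reference,
# marginalized over effect sizes
avg_comparisons(fit, variables = "graph_type")
```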

These results represent the average difference in perceived winning probability between each visualization type and the reference group (the CI graph), averaged across both effect sizes.

  • CI rescaled − CI = −0.020
    On average, participants who viewed the CI rescaled graph rated the special boulder’s probability of helping them win about 0.02 points lower than those who viewed the CI graph. Although this difference is small, it is statistically significant, suggesting that even subtle rescaling of the CI reduces perceived certainty.

  • HOPS − CI = −0.173
    Across both effect sizes, participants who viewed the HOPS graph perceived the special boulder’s advantage to be about 0.17 points lower than those who saw the CI graph. This large, significant difference shows that when uncertainty is communicated dynamically (as in HOPS), people become much more cautious about how much the boulder will help them win.

  • PI − CI = −0.176
    Similarly, participants who viewed the PI graph rated the special boulder’s superiority roughly 0.18 points lower than those who saw the CI graph. Like HOPS, the PI graph makes uncertainty more salient, leading to lower confidence in the boulder’s benefit.

Visualizing predicted means
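
One way to plot the model-based predicted means, using the plotting helper from marginaleffects (a sketch):

```r
# Predicted mean perceived superiority for each combination of
# graph type and effect size
plot_predictions(fit, condition = c("graph_type", "effect_size"))
```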

Important

To deepen your understanding of the fitted model, try mapping the estimates we’ve computed in this activity onto the graph.

Discussion prompts

  • Does a larger true effect size (0.76 vs 0.57) lead to higher perceived probabilities overall?

  • Do different visualization types systematically increase or reduce perceived superiority?

  • Is there evidence that the influence of graph type shrinks when the effect size is large (interaction)?

  • Compare the simple and marginal effects—how do they tell complementary parts of the same story?