DATA & STATISTICS

Mediation and moderation

Mediation: variable M explains HOW X affects Y (causal mechanism). Moderation: variable W modifies WHEN or FOR WHOM the effect of X on Y occurs (interaction). Distinction formalized by Baron and Kenny (1986); modern approach via Hayes (2018).

Extended definition

Mediation and moderation are two distinct analytical concepts answering different causal questions. Mediation investigates how an independent variable X affects a dependent variable Y: there is an intermediate variable M through which the effect is transmitted. The classical decomposition of the total effect c is:

c = c' + ab

where a is the effect of X on M, b the effect of M on Y controlling for X, ab the indirect effect, and c' the remaining direct effect. Moderation investigates when or for whom the effect of X on Y occurs: a variable W alters the magnitude or direction of that effect (interaction X × W). Baron and Kenny (1986, JPSP) formalized the distinction and proposed the sequential mediation test that dominated the social sciences for decades; the modern approach uses bootstrap percentile confidence intervals for the indirect effect (Preacher & Hayes; Hayes, 2018, Guilford Press), which are more robust and powerful and do not require the product ab to be normally distributed. Models that combine the two, moderated mediation and mediated moderation, fall under conditional process analysis.
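A minimal sketch of the decomposition and of a percentile bootstrap interval for ab, using only the Python standard library. The data are simulated, and the path values (a = 0.5, b = 0.4, c' = 0.2), sample size, and function names are illustrative assumptions, not part of any published procedure. The percentile cut-offs use simple order statistics rather than interpolated quantiles.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def mediation_paths(x, m, y):
    """OLS estimates (a, b, c_prime, c) for the simple model X -> M -> Y."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                              # slope of X in M ~ X
    c = sxy / sxx                              # slope of X in Y ~ X (total effect)
    c_prime = (smm * sxy - sxm * smy) / det    # slope of X in Y ~ X + M
    b = (sxx * smy - sxm * sxy) / det          # slope of M in Y ~ X + M
    return a, b, c_prime, c

def bootstrap_indirect(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect ab."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        a, b, _, _ = mediation_paths([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx])
        draws.append(a * b)
    draws.sort()
    lo = draws[int(alpha / 2 * n_boot)]
    hi = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Simulated data with assumed true paths a = 0.5, b = 0.4, c' = 0.2
rng = random.Random(42)
n = 300
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

a, b, c_prime, c = mediation_paths(x, m, y)
ci_low, ci_high = bootstrap_indirect(x, m, y)
print(f"a={a:.3f} b={b:.3f} c'={c_prime:.3f} c={c:.3f} c'+ab={c_prime + a * b:.3f}")
print(f"95% percentile bootstrap CI for ab: [{ci_low:.3f}, {ci_high:.3f}]")
```

With ordinary least squares fitted on the same cases, c and c' + ab agree up to floating-point error. The indirect effect is usually judged by whether the bootstrap interval excludes zero.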

When it applies

Mediation applies when theory posits a specific causal mechanism: stress → elevated cortisol → impaired cognitive performance; educational intervention → increased self-efficacy → improved performance. Moderation applies when theory posits that the effect varies by subgroup or context: a drug's effect depends on genotype; an incentive's effect varies by intrinsic motivation level; a campaign's effect varies by age. Both are standard in the social sciences (psychology, sociology, management), public health, and marketing. In ML, analogous concepts appear in interpretability (feature interactions) and in causal inference (CATE, the Conditional Average Treatment Effect).
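In regression terms, moderation is commonly modeled as Y = b0 + b1·X + b2·W + b3·X·W, so that the effect of X at a given value of W is b1 + b3·W. The sketch below fits that model to simulated data with a small hand-written least-squares solver. The coefficients (0.3, 0.2, 0.5), the sample size, and the helper names are assumptions chosen for illustration.

```python
import random

def ols(rows, y):
    """Solve the normal equations (X'X) beta = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    A = []
    for i in range(k):
        xtx = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        xty = sum(r[i] * yi for r, yi in zip(rows, y))
        A.append(xtx + [xty])                  # augmented row [X'X | X'y]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]    # partial pivoting
        for row in range(k):
            if row != col:
                factor = A[row][col] / A[col][col]
                A[row] = [v - factor * p for v, p in zip(A[row], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data: the effect of X on Y grows with the moderator W
rng = random.Random(7)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
w = [rng.gauss(0, 1) for _ in range(n)]
y = [0.3 * xi + 0.2 * wi + 0.5 * xi * wi + rng.gauss(0, 1)
     for xi, wi in zip(x, w)]

# Design matrix columns: intercept, X, W, X*W
design = [[1.0, xi, wi, xi * wi] for xi, wi in zip(x, w)]
b0, b1, b2, b3 = ols(design, y)

def conditional_effect(w_value):
    """Estimated effect of X on Y when the moderator equals w_value."""
    return b1 + b3 * w_value

for w_value in (-1.0, 0.0, 1.0):
    print(f"effect of X at W={w_value:+.0f}: {conditional_effect(w_value):.3f}")
```

Reporting the conditional effect at a few moderator values (often called simple slopes) is usually easier to interpret than the interaction coefficient b3 alone.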

When it does not apply

It does not apply to purely correlational studies without a design enabling causal inference: mediation assumes the causal direction X → M → Y; without experimental manipulation or adequate confounder control, statistical “mediation” may reflect spurious associations. It does not apply to small samples (small n): both mediation (especially with bootstrap) and moderation require substantial samples to detect plausible effects with adequate power. It does not replace experimental manipulation of the mediator when the question demands causal rigor: observational mediation analysis yields a hypothesis, not a conclusion. It does not apply to moderators that are highly collinear with X: estimation of the interaction term becomes unstable.
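The confounding problem can be illustrated with a small simulation. In the sketch below, X is randomized and does affect M, but M has no effect on Y; an unmeasured variable U drives both M and Y. All coefficients and names are hypothetical. The naive regression of Y on X and M still yields a clearly non-zero b, and only adjusting for U (done here by residualizing each variable on U, the Frisch-Waugh step) brings it back toward zero.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def residualize(v, u):
    """Residuals of v after a simple regression on u."""
    slope = cov(v, u) / cov(u, u)
    mv, mu = mean(v), mean(u)
    return [vi - mv - slope * (ui - mu) for vi, ui in zip(v, u)]

def slope_of_m(x, m, y):
    """Coefficient b on M in the regression Y ~ X + M."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    return (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)

rng = random.Random(1)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]        # randomized X
u = [rng.gauss(0, 1) for _ in range(n)]        # confounder of M and Y
m = [0.5 * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
y = [ui + rng.gauss(0, 1) for ui in u]         # neither X nor M affects Y

a = cov(x, m) / cov(x, x)
b_naive = slope_of_m(x, m, y)                  # ignores U
b_adjusted = slope_of_m(residualize(x, u),     # same as Y ~ X + M + U
                        residualize(m, u),
                        residualize(y, u))

print(f"a={a:.3f}  naive b={b_naive:.3f}  naive ab={a * b_naive:.3f}")
print(f"b after adjusting for U: {b_adjusted:.3f}")
```

In real data U is typically unobserved, so the adjustment shown here is unavailable; that is the reason randomizing X alone does not license a causal reading of the M → Y path.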

Applications by field

Psychology: conditional process analysis is a staple of contemporary research, and Hayes’s PROCESS macro is widely used. Public health: intervention mechanisms (mediation) and the populations that benefit (moderation) in trials. Marketing: moderation by demographics, attitude, and cultural context in consumer behavior models. Education: intervention × student-characteristic interactions (moderation) and learning processes (mediation).

Common pitfalls

The first pitfall is interpreting statistical mediation in observational designs as causal evidence: an X → M → Y association can be explained by common confounders or reverse causation. The second is using the Baron-Kenny (1986) sequential test without complementing it with a bootstrap: the sampling distribution of the product ab is rarely normal, so the Sobel test, which assumes normality, is less powerful than the percentile bootstrap. The third is failing to center variables in moderation tests: Aiken & West (1991) recommend centering X and W to reduce collinearity between the interaction term and its components and to ease interpretation. The fourth is carelessly interpreting the absence of a direct effect c' after mediation as “complete mediation”: a non-significant c' may reflect low power, not genuine absence. The fifth is confusing serial mediation (M1 → M2 → Y) with parallel mediation (simultaneous M1 and M2): these are different models with distinct interpretations, so specify the model before analysis.
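The effect of centering on collinearity can be checked directly. The sketch below, on simulated predictors with assumed non-zero means (5 and 3), compares the correlation between each predictor and the product term before and after mean-centering. The variable names and distributions are illustrative only.

```python
import random

def corr(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    suu = sum((ui - mu) ** 2 for ui in u)
    svv = sum((vi - mv) ** 2 for vi in v)
    return suv / (suu * svv) ** 0.5

rng = random.Random(3)
n = 2000
x = [rng.gauss(5, 1) for _ in range(n)]        # predictor with non-zero mean
w = [rng.gauss(3, 1) for _ in range(n)]        # moderator with non-zero mean

# Raw product term
xw = [xi * wi for xi, wi in zip(x, w)]

# Mean-centered predictors and their product
mx, mw = sum(x) / n, sum(w) / n
xc = [xi - mx for xi in x]
wc = [wi - mw for wi in w]
xcwc = [xi * wi for xi, wi in zip(xc, wc)]

print(f"raw:      corr(X, XW)={corr(x, xw):.3f}  corr(W, XW)={corr(w, xw):.3f}")
print(f"centered: corr(X, XW)={corr(xc, xcwc):.3f}  corr(W, XW)={corr(wc, xcwc):.3f}")
```

Centering is a reparameterization: it leaves the interaction coefficient itself unchanged and alters only the lower-order coefficients, which then describe the effect of each predictor at the mean of the other.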
