Equal vs Proportional Marginal Means: When They Matter, and When They Don’t

Biostats
Author

Imanol Zubizarreta

Published

May 4, 2025

Introduction

In clinical trials, least-squares means (LSMs), also known as estimated marginal means, are often used to summarize adjusted treatment effects after fitting a model that includes prognostic covariates.

These model-based means average predicted outcomes across subgroups defined by covariates (e.g. disease severity), but a key question arises:

How should these marginal means be averaged across covariate levels?

In R, the emmeans() function is a widely used tool for computing LSMs. It offers two common weighting options for averaging over covariates:

  • "equal" weighting: averages subgroup means equally (e.g., assumes 50% Severe, 50% Non-Severe),
  • "proportional" weighting: uses the observed covariate distribution in the sample (e.g., 20% Severe, 80% Non-Severe).
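The two averaging rules can be sketched with plain arithmetic in base R. The subgroup means below are hypothetical values chosen for illustration; only the 50/50 and 20/80 splits come from the bullets above.

```r
# Hypothetical subgroup means (illustrative values, not from any trial)
subgroup_means <- c(Severe = -15, NonSevere = -5)

# Observed share of each subgroup in the sample (20% Severe, 80% Non-Severe)
observed_share <- c(Severe = 0.2, NonSevere = 0.8)

# "equal": every subgroup gets the same weight (here 0.5 each)
equal_weights <- rep(1 / length(subgroup_means), length(subgroup_means))
marginal_equal <- sum(equal_weights * subgroup_means)

# "proportional": each subgroup is weighted by its observed share
marginal_prop <- sum(observed_share * subgroup_means)

print(c(equal = marginal_equal, proportional = marginal_prop))
```

With these made-up inputs the equal-weighted mean is pulled toward the rarer Severe subgroup, while the proportional one stays closer to the Non-Severe majority.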

In this post, we walk through two scenarios — single-arm trials and randomized controlled trials (RCTs) — and show when the choice of weighting matters, and when it doesn’t.


Scenario 1: Single-Arm Trial (No Control Group)

Imagine a single-arm trial where we wish to estimate the mean outcome, adjusting for severity (Severe vs Non-Severe). In a real analysis you would also adjust for the baseline value and other prognostic covariates. If baseline status is prognostic of the outcome, that adjustment would likely narrow the confidence intervals by reducing the model’s residual noise, just as the severity adjustment does. To keep things simple, this example adjusts only for severity. 😉

Suppose:

  • Severe cases are rare (5 of 25 patients, i.e. 20% Severe),
  • Severity is strongly prognostic,
  • There is no treatment-by-severity interaction (only severity impacts the outcome).

We fit a linear model adjusting for severity.

\[ Y = \beta_0 + \beta_1 \cdot \text{TRT} + \beta_2 \cdot \text{SEVERITY} + \epsilon \]

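Because every patient in a single-arm trial is treated (TRT = 1), \(\beta_0 + \beta_1\) acts as a single intercept: the mean for Non-Severe patients. We compute the marginal mean across severity levels using both equal and proportional weighting.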


Equal weighting (50% Severe, 50% Non-Severe):

\[ \text{Marginal Mean}_{\text{equal}} = \beta_0 + \beta_1 + 0.5 \times \beta_2 \]

where 0.5 reflects the assumption of 50% Severe cases.


Proportional weighting (5/25 Severe):

Given that 5 out of 25 patients are Severe (20% Severe):

\[ \text{Marginal Mean}_{\text{proportional}} = \beta_0 + \beta_1 + 0.2 \times \beta_2 \]

where 0.2 is the Severe proportion observed in the study, which ideally approximates the true distribution in the population.
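Subtracting the two expressions shows how far apart the two marginal means can be. This is a short worked step that uses only the symbols and weights already defined above:

\[ \text{Marginal Mean}_{\text{equal}} - \text{Marginal Mean}_{\text{proportional}} = (0.5 - 0.2) \times \beta_2 = 0.3 \times \beta_2 \]

The gap therefore grows with the strength of the severity effect \(\beta_2\) and with the distance between the assumed 50% and the observed 20% Severe share. It vanishes only if severity is not prognostic or the sample happens to be split 50/50.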


The observed mean change from baseline in the simulated data set is -7.33 with a 95% CI of -10.13 to -4.53.

The table below shows the overall marginal means under the two weighting approaches. The 50/50 weighting gives a worse marginal mean (-10.42 vs -7.33) and a larger standard error (1.35 vs 1.08) than weighting by the observed 20% Severe distribution. Assuming that 20% Severe reflects the true prevalence in the population, proportional weighting is the appropriate choice for marginalizing the means. In fact, in this setting the proportional LSM equals the observed raw mean, but with a narrower confidence interval because the model adjusts for severity.

                  weighting   estimate  lower.CL  upper.CL       SE
1         Observed Raw Mean  -7.329712 -10.12538 -4.534042 1.354558
2        Equal Distribution -10.420000 -13.23000 -7.620000 1.350000
3 Proportional Distribution  -7.330000  -9.57000 -5.090000 1.080000
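The equivalence between the proportional LSM and the raw mean can be checked with base R alone. The sketch below simulates a hypothetical 25-patient single-arm data set (5 Severe, 20 Non-Severe). The effect sizes, noise level, and seed are assumptions, so the printed numbers will not match the table above; what should carry over is that the proportional value coincides with the raw mean.

```r
# Hypothetical single-arm data: 5 Severe and 20 Non-Severe patients.
# Effect sizes, noise level, and seed are illustrative assumptions.
set.seed(123)
severity <- factor(
  rep(c("Severe", "NonSevere"), times = c(5, 20)),
  levels = c("NonSevere", "Severe")
)
y <- ifelse(severity == "Severe", -15, -5) + rnorm(length(severity), sd = 5)

# Linear model adjusting for severity only
fit <- lm(y ~ severity)

# Model-based prediction for each severity level (in factor-level order)
grid <- data.frame(severity = factor(levels(severity), levels = levels(severity)))
pred <- unname(predict(fit, newdata = grid))

# Weights: equal (50/50) vs proportional (observed shares, 0.8 and 0.2)
w_equal <- rep(0.5, 2)
w_prop <- as.numeric(prop.table(table(severity)))

lsm_equal <- sum(w_equal * pred)
lsm_prop <- sum(w_prop * pred)
raw_mean <- mean(y)

print(c(raw = raw_mean, equal = lsm_equal, proportional = lsm_prop))
```

With severity as the only covariate, the fitted values are the subgroup means, so weighting them by the observed shares returns the overall sample mean.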

Key Points

  • In a single-arm trial, the focus is purely on marginalizing over severity to report a mean outcome.
  • Using "equal" weighting forces a 50/50 severity distribution, which may substantially overestimate the true prevalence of Severe cases (if the study sample mirrors the real population!). The marginal mean can then look markedly worse than the unadjusted observed mean, which may come as a shock 😅.
  • Using "proportional" weighting reflects the actual severity distribution observed in the trial, which may be more realistic, if the study sample is a good representation of the real population.

Conclusion

In single-arm trials, proportional weighting is preferred unless you explicitly want to pretend the subgroups are evenly represented.


Scenario 2: “Badly stratified” Randomized-Controlled Trial

Now imagine a two-arm RCT comparing Treatment vs Placebo in which randomization was not stratified by severity, or was stratified poorly.

Suppose:

  • Treatment group has 90% Severe patients,
  • Placebo group has 10% Severe patients,
  • Severity is strongly prognostic.

We fit models adjusting for severity:

Case 1: No Interaction

Without an interaction, the model is:

\[ Y = \beta_0 + \beta_1 \cdot \text{TRT} + \beta_2 \cdot \text{SEVERITY} + \epsilon \]

Predicted least-squares means (LSMs):

  • Treatment group:

\[ \text{LSM}_{\text{Treatment}} = \beta_0 + \beta_1 + w_{\text{S}} \cdot \beta_2 \]

  • Placebo group:

\[ \text{LSM}_{\text{Placebo}} = \beta_0 + w_{\text{S}} \cdot \beta_2 \]

Thus, the treatment contrast is:

\[ \text{Treatment Difference} = (\beta_0 + \beta_1 + w_{\text{S}} \cdot \beta_2) - (\beta_0 + w_{\text{S}} \cdot \beta_2) = \beta_1 \]

The treatment contrast is therefore exactly \(\beta_1\), independent of \(w_{\text{S}}\).
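The cancellation of the severity term can be verified numerically. In this base R sketch the coefficients and the proportional weight of 0.42 are hypothetical values picked for illustration; any other choice should give the same contrast.

```r
# Hypothetical coefficients for the no-interaction model (illustrative only)
beta0 <- -5   # intercept: Placebo, Non-Severe
beta1 <- -4   # treatment effect
beta2 <- -10  # severity effect

# LSM for an arm (trt = 1 Treatment, trt = 0 Placebo) given the Severe weight w_s
lsm <- function(trt, w_s) beta0 + beta1 * trt + w_s * beta2

# Two candidate weights for Severe: equal (0.5) and a hypothetical proportional one
weights <- c(equal = 0.5, proportional = 0.42)

# Arm-level LSMs move with the weight...
lsm_treatment <- sapply(weights, function(w) lsm(1, w))
lsm_placebo <- sapply(weights, function(w) lsm(0, w))

# ...but their difference is always beta1
contrasts <- lsm_treatment - lsm_placebo
print(rbind(lsm_treatment, lsm_placebo, contrasts))
```

The arm-level rows change with the weight, while the contrast row stays at the value of `beta1`.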


Note:
In real life you would also adjust for the baseline value and other prognostic covariates.

  • The treatment effect is constant across severity levels.
  • As a result, "equal" and "proportional" weights give exactly the same treatment contrast.

Note:
Without an interaction, the weighting scheme shifts both arm-level LSMs by the same amount, so the treatment contrast is identical under "equal" and "proportional" weights even when the covariate is strongly prognostic.
This holds in well-balanced randomized trials and equally in non-randomized or poorly stratified designs where severity is imbalanced between arms.


Case 2: Interaction Present

Model:

\[ Y = \beta_0 + \beta_1 \cdot \text{TRT} + \beta_2 \cdot \text{SEVERITY} + \beta_3 \cdot (\text{TRT} \times \text{SEVERITY}) + \epsilon \]

The treatment contrast becomes:

\[ \text{Treatment Difference} = \beta_1 + w_{\text{S}} \times \beta_3 \]

where \(w_{\text{S}}\) is the weight assigned to Severe cases, which depends on the weighting scheme:

  • Equal weighting uses \(w_{\text{S}} = 0.5\) (by assumption),
  • Proportional weighting uses \(w_{\text{S}}\) = observed Severe proportion in the trial.

Thus, the treatment contrast will differ based on the severity distribution and the weighting scheme used.
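A small base R sketch makes the dependence on the weight concrete. The coefficients and arm sizes are hypothetical. The 90% and 10% Severe shares come from the scenario above, and the proportional weight is assumed here to be the pooled Severe proportion across both arms.

```r
# Hypothetical coefficients for the interaction model (illustrative only)
beta1 <- -4  # treatment effect in Non-Severe patients
beta3 <- -6  # additional treatment effect in Severe patients

# Hypothetical arm sizes; Severe shares per arm follow the scenario (90% and 10%)
n_trt <- 40
n_pbo <- 60

# Assumed proportional weight: pooled Severe proportion across both arms
w_prop <- (0.9 * n_trt + 0.1 * n_pbo) / (n_trt + n_pbo)

# Treatment contrast as a function of the Severe weight
contrast <- function(w_s) beta1 + w_s * beta3

result <- c(equal = contrast(0.5), proportional = contrast(w_prop))
print(w_prop)
print(result)
```

Because `beta3` is non-zero, the two weights produce different contrasts. Setting `beta3 <- 0` recovers the no-interaction case, where both equal `beta1`.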

Conclusion

  • If there’s an interaction, the choice between equal and proportional LSMs changes the estimated treatment contrast.

  • Proportional weighting reflects the severity distribution actually observed in the trial, and is therefore often preferred when the target is the trial population.


Final Takeaways

| Scenario | Equal vs Proportional Matter? | Recommendation |
| --- | --- | --- |
| Single-Arm Trial | Yes | Prefer proportional if sample is a good representation of the population |
| RCT, no interaction | No | Either is fine |
| RCT, with interaction | Yes | Prefer proportional to match population |

That said, if you use emmeans() in your analysis, be sure to check which weighting you are getting: the weights argument defaults to "equal". 👀
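As a minimal sketch of how the weighting can be made explicit, the example below fits an interaction model to simulated data and requests both weightings. The data set, effect sizes, and seed are hypothetical, and the code assumes the emmeans package is installed.

```r
library(emmeans)

# Hypothetical two-arm data; all values are illustrative assumptions
set.seed(42)
dat <- data.frame(
  trt = factor(rep(c("Placebo", "Treatment"), each = 50)),
  severity = factor(sample(c("NonSevere", "Severe"), 100,
                           replace = TRUE, prob = c(0.8, 0.2)))
)
dat$y <- -5 - 4 * (dat$trt == "Treatment") -
  10 * (dat$severity == "Severe") + rnorm(100, sd = 5)

# Model with a treatment-by-severity interaction
fit <- lm(y ~ trt * severity, data = dat)

# Default behaviour: severity levels are averaged with equal weights
emm_equal <- emmeans(fit, ~ trt)

# Explicit request for weights based on the observed severity frequencies
emm_prop <- emmeans(fit, ~ trt, weights = "proportional")

# Treatment minus Placebo under each weighting
print(pairs(emm_equal, reverse = TRUE))
print(pairs(emm_prop, reverse = TRUE))
```

Spelling out `weights` in every call, even when the default is what you want, keeps the choice visible to reviewers.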