“Residual Confounding in Health Plan Performance Assessments: Evidence From Randomization in Medicaid”, Jacob Wallace, J. Michael McWilliams, Anthony Lollo, Janet Eaton, Chima D. Ndumele, 2022-03-01:

Background: Risk adjustment is used widely in payment systems and performance assessments, but the extent to which it distinguishes plan or provider effects from confounding due to patient differences is typically unknown.

Objective: To assess the degree to which risk-adjusted measures of health plan performance adequately adjust for the variation across plans that arises because of differences in patient characteristics (residual confounding).

Design: Comparison between plan performance estimates based on enrollees who made plan choices (observational population) and estimates based on enrollees assigned to plans (randomized population).

Setting: Natural experiment in which more than two-thirds of a state’s Medicaid population in 1 region was randomly assigned to 1 of 5 plans.

Participants: 137,933 enrollees in 2013–2014 in Louisiana, of whom 31.1% selected a plan and 68.9% were randomly assigned to 1 of the same 5 plans.

Measurements: Annual total spending (that is, payments to providers), primary care use, dental care use, and avoidable emergency department visits, all scored as plan-specific deviations from the “average” plan performance within each population.

Results: Enrollee characteristics were appreciably imbalanced across plans in the observational population, as expected, but were not in the randomized population. Annual total spending varied across plans more in the observational population (SD, $147 per enrollee in 2013 dollars; $199.31 inflation-adjusted) than in the randomized population (SD, $70 per enrollee in 2013 dollars; $94.91 inflation-adjusted) after accounting for baseline differences in the observational and randomized populations and for differences across plans.

On average, a plan’s spending score (its deviation from the “average” performance) in the observational population differed from its score in the randomized population by $67 per enrollee in absolute value (95% CI, $38 to $123) in 2013 dollars ($90.84; 95% CI, $51.52 to $166.77, inflation-adjusted), or 4.2% of mean spending per enrollee (p = 0.009, rejecting the null hypothesis that this difference would be expected from sampling error).

The difference was reduced modestly by risk adjustment to $62 per enrollee in 2013 dollars ($84.06 inflation-adjusted; p = 0.012). Residual confounding was similarly substantial for most other performance measures. Further adjustment for social factors did not materially change estimates…Despite a high patient-level R² of 29% for health care spending, indicating that the enrollee variables included in our risk-adjustment approach captured more than a quarter of the variation in the outcome, risk adjustment did not meaningfully reduce confounding at the plan level for spending in our study.

Figure 1: Differences in plan total health care spending scores derived from the observational and randomized populations. Each bar corresponds to 1 of the 5 plans. The blue area of the bar corresponds to a plan’s randomized spending score (relative to the “average” plan mean) based on the randomly assigned population. The orange bar corresponds to a plan’s spending score based on the observational population before risk adjustment. The grey unhatched portion indicates the difference between the 2 scores, or the extent of residual confounding in the observational scores. For these 5 Medicaid plans, higher-cost enrollees selected plans that control spending to a lesser extent. We calculated a plan score for each plan equal to the plan’s deviation from the population-specific plan mean. We compared plan scores between the 2 populations, instead of raw plan means, because population means differed somewhat. Thus, we compared how a plan performed relative to other plans in 1 population with its relative performance in the other population.
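
The plan-score comparison described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the plan labels and spending figures are hypothetical, the function names are invented for illustration, and the "population-specific plan mean" is assumed here to be the unweighted average of the 5 plan means (the study may have weighted or regression-adjusted it differently).

```python
from statistics import mean


def plan_scores(plan_means):
    """Score each plan as its deviation from the average of plan means.

    Assumption: the "average" plan is the unweighted mean across plans
    within one population (observational or randomized).
    """
    avg = mean(plan_means.values())
    return {plan: m - avg for plan, m in plan_means.items()}


def mean_abs_score_difference(obs_means, rand_means):
    """Average absolute gap between a plan's observational and randomized scores."""
    obs = plan_scores(obs_means)
    rand = plan_scores(rand_means)
    return mean(abs(obs[p] - rand[p]) for p in obs)


# Hypothetical per-enrollee annual spending by plan (not data from the study).
observational = {"A": 1800.0, "B": 1650.0, "C": 1600.0, "D": 1500.0, "E": 1450.0}
randomized = {"A": 1600.0, "B": 1580.0, "C": 1550.0, "D": 1500.0, "E": 1470.0}

print(plan_scores(observational))
print(plan_scores(randomized))
print(mean_abs_score_difference(observational, randomized))
```

Because scores are deviations from each population's own plan average, a uniform difference in spending levels between the two populations cancels out, and only differences in relative plan rankings and spreads contribute to the gap.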

Limitation: Potential heterogeneity in plan effects between the 2 populations.

Conclusion: Residual confounding in risk-adjusted performance assessments can be substantial and should caution policymakers against assuming that risk adjustment isolates real differences in plan performance.

Primary Funding Source: Arnold Ventures.