Introduction
Benefit richness is an important measure to many stakeholders in commercial health insurance, including but not limited to:
- insurers, when establishing premium rates,
- plan purchasers, particularly self-funded employers, to evaluate differences between plans, and
- choice assisters, such as brokers, so they can provide the best recommendations to purchasers.
Insurance issuers must analyze benefit richness carefully in order to price their plans appropriately. Issuers that misestimate the richness of their plan designs may experience unprofitability on certain plans (i.e., when plans are priced too low) and poor sales or limited enrollment on others (i.e., when plans are priced too high). Other stakeholders such as insurance brokers should also evaluate benefit richness to make well-informed recommendations for their clients on plan selection to maximize plan value. Even plan purchasers, such as self-funded employers and individual policyholders, consider benefit richness when deciding what plan options are right for their employees (and themselves).
However, the pricing and valuation of plan designs can be tricky. Actuaries must evaluate how various attributes affect the richness of a plan, including differences in:
- covered services,
- cost sharing (such as deductibles, coinsurances, out-of-pocket maximums, and copays),
- the cost and utilization of services by area and demographic mix, and
- member behavior in response to cost sharing.
Milliman’s Benefit Plan Evaluation model (called Milliman CORAL) is specifically designed to assist carriers with making important decisions about plan design pricing and valuation. Based on the research described in this report, CORAL is a stronger predictor of benefit richness and a closer match to actual experience data than the Federal Actuarial Value Calculator (Federal AV Calculator), which some organizations use for pricing (despite federal regulators’ warnings that the Federal AV Calculator is not a pricing tool). The Federal AV Calculator is familiar to all ACA issuers. It is used to analyze relative differences in benefit richness due to cost-sharing changes, and each metal tier actuarial value is measured using base period membership in that metal tier.
We believe it is important for organizations to use the right tools for their use cases, whether that is pricing or compliance. Thus, we conducted this analysis to demonstrate how much more effective a tool like Milliman CORAL can be for pricing, compared to alternatives. As described in this report, across several statistical measures (a paired Student’s t-test, mean absolute error, and root mean squared error), Milliman CORAL’s modeled values were closer to actual experience data than those of the Federal AV Calculator.
Background
Actuarial value (AV) measures the percentage of healthcare costs expected to be covered by the health plan as a proportion of total allowed costs. For example, a plan with a 70% AV is intended to pay for, on average, 70% of an individual’s healthcare needs under the plan. The remainder is cost sharing to be paid by the member (in this example, 30% of costs). The term actuarial value typically refers to a measurement of expected costs. When measuring actual claims experience, this same measurement is typically referred to as the paid/allowed ratio, although actuarial value is occasionally used in this context as well.
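To make the definition concrete, the sketch below computes an AV for a deliberately simplified plan with a single deductible, a member coinsurance rate, and an out-of-pocket maximum. It is illustrative only and is not how the Federal AV Calculator or CORAL works: the function names, the plan parameters, and the four-member claim distribution are all hypothetical assumptions.

```python
def member_cost_share(allowed, deductible, coinsurance, oop_max):
    """Member-paid amount for one member's annual allowed cost.

    Assumes a single deductible, then member coinsurance, capped at the
    out-of-pocket maximum (a simplification; real plans have copays,
    service-specific rules, and so on).
    """
    below_deductible = min(allowed, deductible)
    above_deductible = max(allowed - deductible, 0.0)
    return min(below_deductible + coinsurance * above_deductible, oop_max)


def actuarial_value(allowed_costs, deductible, coinsurance, oop_max):
    """Plan-paid share of total allowed cost across a population."""
    total_allowed = sum(allowed_costs)
    total_member = sum(
        member_cost_share(a, deductible, coinsurance, oop_max)
        for a in allowed_costs
    )
    return (total_allowed - total_member) / total_allowed


# Hypothetical annual allowed costs for four members.
allowed_costs = [0.0, 500.0, 2_000.0, 20_000.0]
av = actuarial_value(
    allowed_costs, deductible=1_000.0, coinsurance=0.20, oop_max=3_000.0
)
print(f"{av:.1%}")  # 79.1%
```

In this toy population the members pay 0, 500, 1,200, and 3,000 (capped), so the plan pays 17,800 of 22,500 in allowed costs.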
This analysis focuses on the individual Affordable Care Act (ACA) market. Plans in the individual market are required to meet standardized benefit richness tiers, as measured by the Federal AV Calculator. The Federal AV Calculator estimates actuarial value for each metal tier using projected costs for a standardized population based on a previous year’s individual and small group enrollment. We evaluated theoretical actuarial values produced by the Federal AV Calculator and by Milliman’s CORAL model relative to actual experience data from the public use files described in additional detail below.
Models assessed
We utilized the following benefit richness tools in our study:
- Federal AV Calculator: The Center for Consumer Information and Insurance Oversight (CCIIO) publishes the AV Calculator, an Excel-based tool that uses a continuance table for each metallic level based upon a nationwide standard population to derive metallic level-specific AVs for the commercial market. As stated in the methodology of the tool, “the AV Calculator represents an empirical estimate of the AV calculated in a manner that provides a close approximation to the actual average spending by a wide range of consumers in a standard population.”1 The primary purposes of the tool are to ensure compliance with ACA standards and categorize plans into discrete metallic tiers (e.g., bronze, silver, gold, and platinum). We acknowledge the Federal AV Calculator does not describe itself as a pricing tool; however, this analysis still has merit because (a) some carriers utilize the Federal AV Calculator as a basis for pricing AVs, and (b) the Federal AV Calculator is free, widely available, familiar to all carriers operating in the ACA marketplace, and purports to measure actuarial value based on an ACA population.
- Milliman’s CORAL Benefit Plan Evaluation (CORAL): Milliman’s CORAL model runs on a web-based interface and uses Milliman’s Health Cost Guidelines™ (HCGs) to develop plan relativities composed of not only actuarial value but also induced utilization (i.e., how a member’s behavior may change in response to the cost-sharing provisions applied to a given service category). The model allows for a more granular calibration of general assumptions (e.g., geographic area, demographic mix, hospital contract discounts, degree of healthcare management), medical benefits, and pharmacy benefits. Additionally, issuers have the flexibility to enter plan benefits using 33 distinct medical service categories and up to six tiers of prescription drug benefits. CORAL uses these additional inputs to calibrate claims probability distributions more precisely for each plan. Further, CORAL models the interactions between benefits, allowing for a more nuanced calculation.
Study procedure
Data
We leveraged the 2022 Health Insurance Exchange Benefits and Cost Sharing and Plan Attributes Public Use Files2 published by the Centers for Medicare and Medicaid Services (CMS) for this analysis. Specifically, we used these files to capture the detailed benefit information for individual ACA plans offered during plan year 2022. We then ran these plans through CORAL in order to produce CORAL AVs.
We additionally leveraged the 2022 and 2024 Rate Review Public Use Files for Single Risk Pool Plans,3 also published by CMS. We used the 2022 file to capture federal AVs (indicated as AV_METAL in the file), which issuers are required to report in the Unified Rate Review Template (URRT). We used the 2024 file to capture actual 2022 paid/allowed ratios for individual ACA plans offered during plan year 2022 (calculated as EXP_INC_CLM minus EXP_REINS, with the result divided by EXP_TAC, using the field names in the file).
We then limited our sample to only include plans with at least 3,000 member months in 2022 to ensure sufficient credibility of the resulting actual paid/allowed ratios. This resulted in a sample of approximately 2,500 plans nationwide covering all metallic tiers.
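The paid/allowed calculation and the credibility filter described above can be sketched as follows. This is a minimal illustration, not the authors’ actual data pipeline: the field names EXP_INC_CLM, EXP_REINS, and EXP_TAC come from the text, while PLAN_ID, MEMBER_MONTHS, the function names, and the sample rows are hypothetical assumptions.

```python
MIN_MEMBER_MONTHS = 3_000  # credibility threshold stated in the text


def paid_allowed_ratio(exp_inc_clm, exp_reins, exp_tac):
    """Return (EXP_INC_CLM - EXP_REINS) / EXP_TAC for one plan."""
    if exp_tac <= 0:
        raise ValueError("EXP_TAC (total allowed claims) must be positive")
    return (exp_inc_clm - exp_reins) / exp_tac


def credible_ratios(rows, min_member_months=MIN_MEMBER_MONTHS):
    """Map plan ID to paid/allowed ratio, keeping only credible plans.

    PLAN_ID and MEMBER_MONTHS are hypothetical field names.
    """
    result = {}
    for row in rows:
        if row["MEMBER_MONTHS"] >= min_member_months:
            result[row["PLAN_ID"]] = paid_allowed_ratio(
                row["EXP_INC_CLM"], row["EXP_REINS"], row["EXP_TAC"]
            )
    return result


# Made-up rows: plan B falls below the member-month threshold.
sample_rows = [
    {"PLAN_ID": "A", "MEMBER_MONTHS": 12_000,
     "EXP_INC_CLM": 800_000.0, "EXP_REINS": 100_000.0, "EXP_TAC": 1_000_000.0},
    {"PLAN_ID": "B", "MEMBER_MONTHS": 1_200,
     "EXP_INC_CLM": 90_000.0, "EXP_REINS": 0.0, "EXP_TAC": 150_000.0},
]
ratios = credible_ratios(sample_rows)
print(ratios)
```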
Modeling notes
The two models in our analysis, the Federal AV Calculator and CORAL, each produce a benefit slope that describes the relative generosity for plans as measured by plan AVs. Actuarial models used to project AVs for use in pricing, such as CORAL, are typically calibrated to adjust the sloping to better fit the distinct demographic, utilization, and charge characteristics of the pricing issuer. The Federal AV Calculator does not allow for calibration beyond that used for the standardized populations, whereas CORAL allows for this calibration. For this analysis, we calibrated the CORAL model’s underlying nationwide data for each plan modeled to reflect the state in which the plan was filed. This calibration, which can be viewed as a reasonable basis for a new manually rated plan, can thus be more in line with the final paid/allowed ratios than the Federal AV Calculator.
Note that actual paid/allowed ratios can vary significantly based on the actual age distribution of members relative to pricing on a particular benefit plan. This case study does not adjust for this correlation, nor for various other items (e.g., health status and area distribution) that may affect a specific plan’s actual health care utilization and costs. Rather, this analysis compares AVs from CORAL (calibrated with statewide demographics) and AVs from the Federal AV Calculator to actual paid/allowed ratios as reported.
Statistical testing
We sought to run a paired t-test to evaluate whether CORAL is a better predictor of benefit richness than the Federal AV Calculator, and if any difference was statistically significant. A paired t-test requires the differences between paired values to be approximately normally distributed. Therefore, we first tested the normality of the distribution of differences as follows:
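1. First, we calculated the mean squared error (MSE) for each of the roughly 2,500 plans to evaluate the accuracy of the Federal AV Calculator and CORAL relative to actual paid/allowed ratios. Because each plan has one modeled AV per tool and one actual ratio, the per-plan MSE reduces to a squared error. Where i refers to a given plan, the formula for MSE is as follows:
$$\text{MSE}_i = \left(\text{Modeled AV}_i - \text{Actual paid/allowed ratio}_i\right)^2$$
We then calculated the difference in MSE (i.e., Federal AV Calculator minus CORAL) for each plan.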
2. We plotted the results using a histogram (shown in Figure 1), which appears visually to be approximately normally distributed (i.e., follows a bell curve shape). The skewness of the distribution is -0.33 and the kurtosis is 4.06, which suggests a slightly negative skew and a sharper peak with heavier tails compared to a normal distribution. These values indicate only modest departures from normality, so we treat the distribution as approximately normal.
Figure 1: Histogram of differences in Federal AV Calculator and Milliman CORAL actuarial value squared error (n = 2,480)
3. To evaluate the normality of the distribution using another technique, we constructed a Q-Q plot of the differences in mean squared error against the quantiles of a standard normal distribution (shown in Figure 2). The resulting fit has an R-squared value of 93.4%, suggesting that the distribution of differences in MSE is approximately normal.
Figure 2: Q-Q plot of differences in Federal AV Calculator and Milliman CORAL actuarial value squared error vs. standard normal distribution (n = 2,480)
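The normality checks in the steps above can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the authors’ method: it uses moment-based skewness, Pearson (non-excess) kurtosis for which a normal distribution scores 3, and plotting positions of (i + 0.5) / n for the Q-Q fit. The function names and toy differences are hypothetical.

```python
import math
from statistics import NormalDist


def _central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness (0 for a symmetric sample)."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Pearson kurtosis (assumed convention: normal distribution = 3)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2


def qq_r_squared(values):
    """R-squared of sorted values against standard normal quantiles."""
    n = len(values)
    ordered = sorted(values)
    # Assumed plotting positions: (i + 0.5) / n for i = 0..n-1.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_x = sum(theoretical) / n
    mean_y = sum(ordered) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(theoretical, ordered))
    var_x = sum((x - mean_x) ** 2 for x in theoretical)
    var_y = sum((y - mean_y) ** 2 for y in ordered)
    r = cov / math.sqrt(var_x * var_y)
    return r * r


# Toy per-plan differences in squared error (made-up values).
toy_differences = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(skewness(toy_differences), kurtosis(toy_differences))
print(qq_r_squared(toy_differences))
```

An R-squared close to 1 means the sorted differences fall nearly on a straight line against the normal quantiles.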
Results
After completing this initial testing, we performed the paired t-tests. The results are shown in Figure 3. Using a commonly used significance level of 0.05, CORAL demonstrated statistically significant improvement over the Federal AV Calculator across all metallic tiers combined and individually by metallic tier. We believe these results, viewed in total and by metallic tier, make a compelling argument for using designated pricing models (such as CORAL) to develop pricing actuarial values.
Figure 3: Results of paired t-tests
| Metric | Total | Bronze | Silver | Gold | 
|---|---|---|---|---|
| n | 2,480 | 1,017 | 997 | 419 | 
| Mean | 0.0039 | -0.0017 | 0.0121 | -0.0016 | 
| Std. Dev. | 0.0172 | 0.0162 | 0.0177 | 0.0085 | 
| t-statistic | 11.1839 | -3.2771 | 21.6718 | -3.8539 | 
| p-value | 0.0000 | 0.0011 | 0.0000 | 0.0001 | 
| Results | Significant | Significant | Significant | Significant | 

Statistically significant improvement (p-value < 0.05) for 98.1% of plans
Note: Our sample only contained 34 catastrophic and 13 platinum plans. Therefore, we exclude catastrophic and platinum from our test due to credibility considerations.
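A paired t-test on the per-plan differences in squared error can be sketched as follows. This is a minimal sketch, not the authors’ code: the squared-error values are made up, the function names are hypothetical, and the p-value uses a normal approximation to the t distribution (reasonable only for large samples, such as the roughly 2,500 plans here) because the Python standard library does not include a t distribution.

```python
import math
import statistics


def paired_t_statistic(differences):
    """t = mean(d) / (sample_sd(d) / sqrt(n)) for paired differences d."""
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n))


def two_sided_p_value_normal_approx(t_statistic):
    """Two-sided p-value using a standard normal approximation."""
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t_statistic)))


# Made-up per-plan squared errors for each tool.
fed_sq_err = [0.05, 0.04, 0.06]
coral_sq_err = [0.04, 0.02, 0.03]

# Difference is Federal AV Calculator minus CORAL, as in the text, so a
# positive mean means CORAL had the smaller squared error on average.
differences = [f - c for f, c in zip(fed_sq_err, coral_sq_err)]

t_stat = paired_t_statistic(differences)
p_value = two_sided_p_value_normal_approx(t_stat)
print(t_stat, p_value)
```

With only three toy plans the normal approximation is not appropriate; the data are kept small purely so the arithmetic is easy to follow.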
To further validate this result, we additionally evaluated the mean absolute error (MAE) and root mean squared error (RMSE).
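MAE is a measure of the average of the difference between the modeled AV and actual paid/allowed ratio results without regard for the direction of the difference. Where i refers to a given plan and n is the number of plans, the formula is as follows:
$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\text{Modeled AV}_i - \text{Actual paid/allowed ratio}_i\right|$$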
RMSE, on the other hand, measures the square root of the average of the squared differences between the modeled AV and actual paid/allowed ratio results. Where i refers to a given plan and n is the number of plans, the formula is as follows:
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\text{Modeled AV}_i - \text{Actual paid/allowed ratio}_i\right)^2}$$
RMSE is similar to MAE, but because each difference is squared, it penalizes large errors quadratically rather than linearly. This puts more weight on outliers. The results of these tests are summarized in Figure 4.
Figure 4: Results of additional testing
| Metric | AV Calculator | CORAL | 
|---|---|---|
| MAE | 0.1207 | 0.1054 | 
| RMSE | 0.1499 | 0.1365 | 
The MAE and RMSE indicate an improvement in the accuracy of the prediction using CORAL compared to using the Federal AV Calculator, as the lower MAE and RMSE values suggest the CORAL model produced predicted values closer to the actual paid/allowed ratio results.
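Both error metrics are straightforward to compute from paired lists of modeled AVs and actual paid/allowed ratios. The sketch below is illustrative only: the function names and the two-plan data are hypothetical and do not reproduce the figures in Figure 4.

```python
import math


def mean_absolute_error(modeled, actual):
    """Average absolute difference between modeled and actual values."""
    return sum(abs(m - a) for m, a in zip(modeled, actual)) / len(modeled)


def root_mean_squared_error(modeled, actual):
    """Square root of the average squared difference."""
    mean_sq = sum((m - a) ** 2 for m, a in zip(modeled, actual)) / len(modeled)
    return math.sqrt(mean_sq)


# Made-up modeled AVs and actual paid/allowed ratios for two plans.
modeled_av = [0.70, 0.80]
actual_ratio = [0.60, 0.90]

mae = mean_absolute_error(modeled_av, actual_ratio)
rmse = root_mean_squared_error(modeled_av, actual_ratio)
print(mae, rmse)
```

When every error has the same magnitude, as in this toy data, MAE and RMSE coincide; RMSE exceeds MAE as soon as the errors vary in size.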
The results of the paired t-tests and corresponding residual testing demonstrated that CORAL is a stronger predictor of benefit richness, with the predicted values being closer to the actual paid/allowed ratio results, based on a sample of approximately 2,500 individual ACA plans.
Takeaways
Accurately measuring benefit richness may be crucial to establishing a well-positioned, appropriately priced plan portfolio. The Federal AV Calculator, while required as a compliance tool, is not required to be the sole source for developing AVs to be used directly in pricing premium rates. Our study demonstrates CORAL is a statistically significantly stronger predictor of actual benefit richness than the Federal AV Calculator and could be a valuable weapon in the arsenal for any commercial health insurance stakeholder interested in a better gauge of plan generosity, from health insurers to plan purchasers and benefit advisors.
CORAL is leveraged across all commercial insurance markets, not only the individual ACA market. To learn more about how your organization can leverage CORAL, visit the Milliman CORAL landing page.
Caveats and Limitations
The information in this article is intended to provide readers with a case study on the comparative performance of CORAL and the Federal AV Calculator for determining benefit richness for commercial ACA pricing purposes. All estimates in this article are purely illustrative and are not intended to represent any information proprietary to any organization. Analyses on other plan experience may produce different results. This information may not be appropriate and should not be used for any other purposes.
Milliman has developed certain models to estimate the values included in this article. The intent of the models was to estimate benefit richness for commercial health plans. We have reviewed the models, including their inputs, calculations, and outputs, for consistency, reasonableness, and appropriateness to the intended purpose and in compliance with generally accepted actuarial practice and relevant actuarial standards of practice (ASOPs). The models, including all input, calculations, and output, may not be appropriate for any other purpose.
Guidelines issued by the American Academy of Actuaries require actuaries to include their professional qualifications in all actuarial communications. Barbara R. Collier and Evan R. Pollock are members of the American Academy of Actuaries and meet the qualification standards for performing the analyses in this report.
All opinions expressed in this article are strictly the opinions of the authors. Milliman is an independent firm and provides unbiased research and analysis on behalf of many clients. Milliman does not take any specific position on matters of public policy.
1 CMS (April 2, 2024). RE: Final 2025 Actuarial Value Calculator Methodology. Retrieved January 27, 2025, from https://www.cms.gov/files/document/final-2025-av-calculator-methodology.pdf.
2 CMS. Health Insurance Exchange Public Use Files (Exchange PUFs). Retrieved January 27, 2025, from https://www.cms.gov/marketplace/resources/data/public-use-files.
3 CMS. Rate Review Data. Retrieved January 27, 2025, from https://www.cms.gov/marketplace/resources/data/rate-review-data.