Version: 2.1

Scope Exploration

Finding: 92% of provider-payer-codes have either single rates, a single plan, or plans with "PPO", "ALL PRODUCTS", "ALL COMMERCIAL", "ALL" or something similar.

Results

Contingency Table 1

The contingency table below shows the frequency counts, expressed as a percentage, of "Is Single Rate OR Has Low Rate Variability" and "Is Single Plan OR has PPO / 'ALL COMMERCIAL' plan".

	Is Single Plan OR Has PPO / "ALL COMMERCIAL" plan
Is Single Rate OR Has Low Rate Variability	False	True	All
False	0.08439	0.1541	0.23849
True	0.11394	0.64757	0.76151
All	0.19833	0.80167	1

If a plan name contains "PPO", "ALL PRODUCTS", "ALL COMMERCIAL" or something similar, we know which rate to use, so we include rates that pass this filter in the "True" column. There are 80% of rates that have either a single plan or a PPO / "ALL COMMERCIAL" plan.

So only 8% of plans have multiple rates AND multiple non-PPO/ALL-COMMERCIAL plans.

We see that 76% of codes only have a single rate. This is fairly close to Avery's result for Cigna OAP (61% in Avery's analysis).

Contingency Table 2: Remove Low Rate Var

Removing the "low rate variability" filter increases the False/False percentage from 8% to 13%.

	Is Single Plan OR Has PPO / "ALL COMMERCIAL" plan
Is Single Rate	False	True	All
False	0.13432	0.22142	0.35574
True	0.05721	0.58705	0.64426
All	0.19153	0.80847	1

Methodology

This query generates a random sample of 100k provider-payer-code combinations (across all plans, all hospitals, and all HCPCS/MS-DRG codes) with indicators for whether the provider-payer-code combination has:

multiple rates
multiple plans
PPO or "ALL" plans
low rate variability

Definitions:

"Multiple rates" is True if within a provider-payer-code, any combination of revenue_code, billing_code_modifiers, contract_methodology, billing_class, setting has multiple rates.
"Multiple plans" is True if a provider-plan-code has more than one unique plan_name.
PPO or "ALL" plans is True if plan_name (with '-','_', and ' ' removed) contains one of following regex matches: ALLCOMMERCIAL, ALLPLAN, ALLPRODUCT, ALLPAY, ALLCHOICEPLANS, ^ALL$, ^CIGNAALL$, ^AETNAALL$
low rate variability is defined as*:

$\frac{\max(\text{rate}) - \min(\text{rate})}{\text{avg}(\text{rate})} < 20\%$

*Note: min/max can capture extreme values but that's ok because the Ns here are often quite small

Click to see SQL

WITH

-- get all hospital dollar and percentage rates
-- filter to Commercial HCPCS and MSDRGs
dollar_percentage_rates_cast_long AS (
    SELECT 
        *, 
        'dollar' as rate_type, 
        negotiated_dollar as rate
    FROM glue.hospital_data.hospital_rates
    WHERE negotiated_dollar > 0
    AND payer_class_name = 'Commercial' 
    AND billing_code_type IN ('HCPCS', 'MS-DRG')
    UNION ALL
    SELECT 
        *, 
        'percentage' as rate_type, 
        negotiated_percentage as rate
    FROM glue.hospital_data.hospital_rates
    WHERE negotiated_percentage > 0
    AND payer_class_name = 'Commercial' 
    AND billing_code_type IN ('HCPCS', 'MS-DRG')
),

-- how often are multiple rates issues caused by multiple plans
-- how many distinct plans are there? what is the rate spread?
evaluate_for_multiple_rates_due_to_plans AS (
    SELECT 
        provider_id,
        payer_id,
        revenue_code,
        billing_code_modifiers,
        contract_methodology,
        billing_class,
        setting,
        billing_code_type,
        billing_code,
        rate_type,
        count(distinct rate) as unique_rates,
        (max(rate) - min(rate)) / avg(rate) as rate_spread,
        array_distinct(array_agg(plan_name)) as unique_plans,
        array_distinct(array_agg(raw_payer_name)) as unique_payers
    FROM dollar_percentage_rates_cast_long
    GROUP BY 1,2,3,4,5,6,7,8,9,10
),

-- how many different rate representations are there (i.e. rev code, modifier, 
-- methodology, class, setting, payer, provider, code combinations)?
-- how many of these have multiple rates due to multiple plans?

provider_payer_code_rollup AS (
    SELECT 
        provider_id,
        payer_id,
        billing_code,
        array_distinct(flatten(ARRAY_AGG(unique_plans))) as unique_plans,
        cardinality(array_distinct(flatten(ARRAY_AGG(unique_plans)))) as n_unique_plans,
        cardinality(array_distinct(flatten(ARRAY_AGG(unique_payers)))) as n_unique_payers,
        count(*) as total_rates,
        sum(case when unique_rates = 1 then 1 else 0 end) as single_rates,
        approx_percentile(rate_spread, 0.5) as median_rate_spread
    FROM evaluate_for_multiple_rates_due_to_plans
    GROUP BY 1,2,3
)

-- create flags for whether payer-provider-code combinations have multiple rates 
-- and/or multiple rates
-- create flag to identify PPO or "ALL" plans
-- create flag for low rate variability
SELECT 
    *,
    single_rates = total_rates as is_single_rate,
    n_unique_plans = 1 as is_single_plan,
    any_match(
        unique_plans, 
        x -> regexp_like(
            REPLACE(REPLACE(x,'_',' '),' ',''), 
            'PPO|ALLCOMMERCIAL|ALLPLAN|ALLPRODUCT|ALLPAY|ALLCHOICEPLANS|^ALL$|^CIGNAALL$|^AETNAALL$'
        )
    ) as has_ppo_plan,
    any_match(
        unique_plans, 
        x -> regexp_like(
            REPLACE(REPLACE(x,'_',' '),' ',''), 
            'ALLCOMMERCIAL|ALLPLAN|ALLPRODUCT|ALLPAY|ALLCHOICEPLANS|^ALL$|^CIGNAALL$|^AETNAALL$'
        )
    ) as has_all_product_plan,
    median_rate_spread < 0.2 as low_rate_variability
FROM provider_payer_code_rollup
ORDER BY random()
LIMIT 100000

Appendix

Python example to get a contingency table

# crosstab 1
crosstab = pd.crosstab(
    df["is_single_rate"] | df["low_rate_variability"],
    df["is_single_plan"] | df["has_all_product_plan"],
    normalize="all",
    margins=True,
)
crosstab.index.name = ""
crosstab.columns.name = ""
print(crosstab.to_markdown())

Simple QA script to check work

sample = df.sample(1)
check = pd.read_sql(f"""
SELECT 
    provider_id,
    payer_id,
    revenue_code,
    billing_code_modifiers,
    contract_methodology,
    billing_class,
    setting,
    billing_code_type,
    billing_code,
    array_agg(distinct negotiated_dollar) as negotiated_dollars,
    array_agg(distinct negotiated_percentage) as negotiated_percentages,
    array_agg(distinct plan_name) as plan_names
FROM glue.hospital_data.hospital_rates
WHERE provider_id = '{sample.provider_id.values[0]}'
AND payer_id = {sample.payer_id.values[0]}
AND billing_code = '{sample.billing_code.values[0]}'
GROUP BY 1,2,3,4,5,6,7,8,9
""", con=trino_conn)
check

Results​

Contingency Table 1​

Contingency Table 2: Remove Low Rate Var​

Methodology​

Appendix​

On this page:

Results

Contingency Table 1

Contingency Table 2: Remove Low Rate Var

Methodology

Appendix