Version: 2.2

Multiple Rates: Hospital Data

Background:

When mapping raw data to "rate objects", there may be multiple raw rates for the particular payer-provider-code. This is often due to different modifiers, contract methodologies or revenue codes.

This analysis explores:

frequencies of multiple modifiers, contract methodologies, or revenue codes
how often modifiers, contract methodologies, and revenue codes are NULL

Findings:

A single payer-provider-code is reported with different contract methodologies > 16% of the time.
For MS-DRG payer-provider-codes, there exists at least one rate with a NULL revenue code 97.6% of the time and at least one rate with a NULL modifier 99.9% of the time (as expected).
For HCPCS payer-provider-codes, there exists at least one rate with a NULL revenue code 61.3% of the time and at least one rate with a NULL modifier 98.5% of the time

In general, the NULL values likely represent the "base" case and are desirable, although there may be exceptions.

CLD Implications:

These results help to inform how the rate object should be defined and how much effort we want to put into carefully mapping raw rate values to rate objects. As examples:

Contract Methodology: Unless we have strong provisions coverage, we should not try to determine the "correct" contract methodology, since multiple rates are so common in this field. Further, there are few distinct values of contract methodology (case rate, percentage of total, fee schedule, other, per diem, NULL). So casting these values wide in the CLD (a column for each contract methodology) is feasible.
For MS-DRG, we can filter to rates where modifier and revenue code are NULL. We can investigate why 2.5% of payer-provider-codes don't have a NULL revenue code -- sometimes it's because they are "nursery" codes and sometimes the provider simply reports the same rate copied 20+ times for each revenue code from 100-159.
For HCPCS:
- modifiers: We can filter to rates where modifier is NULL in the general case, and filter to specific modifiers as needed.
- revenue codes: A payer-provider-code has 0 revenue codes 56% of the time and at most one revenue code 90% of the time. Further investigation would be helpful here -- perhaps we can prioritize surgery codes with 0360, 0361, 0369, 0481, 0490, 0499, 0750 and 0790.

Methodology

Randomly sample 100k rates from hospital data. Then group by payer, provider, code and compute:

The distinct counts of 'contract_methodology', 'billing_code_modifiers', 'revenue_code'
The count of payer-provider-code where there is a NULL value for 'contract_methodology', 'billing_code_modifiers', 'revenue_code'

Click to see code

# %%
from IPython import get_ipython

if get_ipython() is not None:
    get_ipython().run_line_magic("load_ext", "autoreload")
    get_ipython().run_line_magic("autoreload", "2")

import pandas as pd
from jinja2 import Template
from tqds.sql import db, sql

pd.options.display.max_columns = None
pd.options.display.max_rows = 500
trino_conn = db.TrinoClient().conn
redshift_conn = db.RedshiftClient().conn
query = sql.Query(TEMPLATE_DIR="../sql")

# %%
def get_sample(_type='MS-DRG'):

    if _type == "MS-DRG":
        _filter = "WHERE billing_code_type = 'MS-DRG'"
    elif _type == "HCPCS":
        _filter = """
        WHERE billing_code_type = 'HCPCS'
        """

    q = """
    WITH 
    df AS (
        SELECT 
            provider_id
            , payer_id
            , billing_code_type
            , billing_code
            {% for column in ['billing_class', 'setting', 'contract_methodology', 'billing_code_modifiers', 'revenue_code'] %}
            , COUNT(DISTINCT NULLIF({{ column }}, '')) AS n_{{ column }}
            {% endfor %}

            {% for column in ['contract_methodology', 'billing_code_modifiers', 'revenue_code'] %}
            , SUM(CASE WHEN NULLIF({{ column }}, '') IS NULL THEN 1 ELSE 0 END)  AS n_null_{{ column }}
            {% endfor %}
        FROM glue.hospital_data.hospital_rates
        {{_filter}}
        AND payer_class_name = 'Commercial'
        AND (
            negotiated_dollar > 0
            OR negotiated_percentage > 0
        )
        GROUP BY 1,2,3,4
    )
    SELECT *
    FROM df 
    ORDER BY random()
    LIMIT 100000
    """

    query = Template(q).render(_filter=_filter)
    df = pd.read_sql(query, con=trino_conn)
    return df


# %%
distinct_cols = [
    "n_contract_methodology",
    "n_billing_code_modifiers",
    "n_revenue_code",
]

null_cols = [
    "n_null_contract_methodology",
    "n_null_billing_code_modifiers",
    "n_null_revenue_code",
]

# %%
df = get_sample(_type="MS-DRG")
for col in distinct_cols:
    tmp = df[col].astype(str).copy()
    tmp[tmp.astype(int) > 3] = "4+"
    table = tmp.value_counts() / len(tmp)
    print(table.to_markdown())

for col in null_cols:
    tmp = df[col].astype(str).copy()
    tmp[tmp.astype(int) > 0] = "1+"
    table = tmp.value_counts() / len(tmp)
    print(table.to_markdown())

# %%
df = get_sample(_type="HCPCS")
for col in distinct_cols:
    tmp = df[col].astype(str).copy()
    tmp[tmp.astype(int) > 3] = "4+"
    table = tmp.value_counts() / len(tmp)
    print(table.to_markdown())

for col in null_cols:
    tmp = df[col].astype(str).copy()
    tmp[tmp.astype(int) > 0] = "1+"
    table = tmp.value_counts() / len(tmp)
    print(table.to_markdown())

MS-DRG

Distinct Counts

16% of the time, there are multiple contract methodologies

n_contract_methodology	count
1	0.78608
2	0.12123
0	0.0529
3	0.03009
4+	0.0097

we'd expect n_billing_code_modifiers=0 to be 100%, so the non-zeros are likely errors in hospital data

n_billing_code_modifiers	count
0	0.99857
1	0.00141
2	1e-05
3	1e-05

n_revenue_code	count
0	0.97635
4+	0.01987
1	0.00359
3	0.00016
2	3e-05

Count of NULLs

0 indicates the count of codes where there are NO rates with a NULL value in that column

(e.g. 94.5% of payer-provider-codes have NO rates with a NULL contract_methodology)

n_null_contract_methodology	count
0	0.94537
1+	0.05463

n_null_billing_code_modifiers	count
1+	0.99869
0	0.00131

n_null_revenue_code	count
1+	0.98859
0	0.01141

HCPCS

Distinct Counts

21% of the time, there are multiple contract methodologies

n_contract_methodology	count
1	0.79152
2	0.1229
0	0.072
3	0.01255
4+	0.00103

n_billing_code_modifiers	count
0	0.95787
1	0.02807
2	0.00918
3	0.00258
4+	0.0023

n_revenue_code	count
0	0.56202
1	0.34308
4+	0.05288
2	0.03422
3	0.0078

Count of NULLs

n_null_contract_methodology	count
0	0.92447
1+	0.07553

n_null_billing_code_modifiers	count
1+	0.98466
0	0.01534

NULLs are not as frequent with HCPCS revenue codes. There are ~39% of provider-payer-codes with NO rates that have a NULL revenue code.

n_null_revenue_code	count
1+	0.61308
0	0.38692

Methodology​

MS-DRG​

Distinct Counts​

Count of NULLs​

HCPCS​

Distinct Counts​

Count of NULLs​

On this page:

Methodology

MS-DRG

Distinct Counts

Count of NULLs

HCPCS

Distinct Counts

Count of NULLs