Skip to main content
Version: 2.1

2. Data Collection

1. Payer MRF

CLD pulls Payer MRF data from Core Rates and Core Rates Physician Groups.

a. Filter and Map Core Rates to Rate Object Space

We join Core Rates to the Rate Object Space. In order to do this, we need to create a few columns:

  • network_id is converted to at least one product_network_label, (the field used to identify networks in payer MRF Data)
  • provider_id is obtained using the CLD provider spine, which maps provider_id to NPIs; and Core Rates are filtered using these NPIS
    • for hospitals, the provider_id in CLD ROS is consistent with the provider_id in Core Rates, since they both use TQ's unified_provider_table to map hospital NPIs to provider_id
  • bill_type consists of ip, op, and pro. This field does not exist in Core Rates, but can be derived using just billing_class and billing_code_type
    • ip: billing_class is institutional and billing_code_type is one of MS-DRG or APR-DRG
    • op: billing_class is institutional and billing_code_type is one of HCPCS or APC
    • pro: billing_class is professional and billing_code_type is one of HCPCS or APC
  • billing_code_type: a couple billing code transformations occur when we pull Core Rates

b. Handle Multiple Rates

If there are multiple rates for a given payer, network, provider, code, bill_type and it is due to different negotiated_type, we keep all the rates. They will share the same rate object ID and each of these rates will be stored in its own column, to be evaluated separately for accuracy.

If there are multiple rates due to modifiers, we try to identify the GLOBAL rate. Typically, the global rate will be stored with a NULL array or as the longest array.

The example below illustrates the hierarchy used to select a single rate. Here, the "NULL" modifier would be preferred over the array containing TC, 26, 52, 53

billing_code_modifierhierarchy
NULL1
TC, 26, 52, 532
TC3
264

If there are NO global modifier rates identifiable, the max(rate) is selected. We believe the max would be most appropriate to identify the broad-network rate.

Physician Groups

When querying from core_rates_physician_groups, we select the rate with the highest provider_npi_count prior to selecting the max() rate.


2. Hospital MRF

a. Filter and Map Hospital Rates to the Rate Object Space (ROS)

We join Core Rates to the Rate Object Space. In order to do this, we need to create a few columns:

  • network_id is mapped to plan representations in the Hospital MRF data using the "plan bridge" (see in section below)
  • provider_id is consistent between CLD and Hospital Rates
  • bill_type consists of ip, op, and pro. This field does not exist in Hospital Rates, but can be derived using just billing_class and billing_code_type
    • ip: billing_class is institutional and billing_code_type is one of MS-DRG or APR-DRG
    • op: billing_class is institutional and billing_code_type is one of HCPCS or APC
    • pro: billing_class is professional and billing_code_type is one of HCPCS or APC
  • billing_code_type: a couple billing code transformations occur when we pull Hospital Rates

b. Plan Bridge

The plan_name field in hospital data is not standardized and cannot be reliably used to identify the rate's network.

To map hospital rates to networks, we build a "plan bridge":

  1. Scope:
    • The objective of the plan bridge is to identify the PPO or HMO network rates. We don't need to be able to distinguish various product names.
  2. Data Cleaning:
    • We filter out "plan_name" values that indicate networks that are clearly not PPO or HMO (such as "indemnity", "international", or government plans like Medicare/Medicaid)
    • The list of exclusions is available on this page. Search for "exclusions" and "payer_exclusions".
  3. Assumptions:
    • If there is exactly one plan_name (e.g. "CIGNA"), we assume that this is a "catch-all" or "global" rate, shared across all of the products between the payer and provider
    • If there is exactly one distinct rate, we assume that this is a "catch-all" or "global" rate, shared across all of the products between the payer and provider

c. Handle Multiple Rates

If there are multiple rates for a given payer, network, provider, code, bill_type and it is due to different contract_methodology, we keep all the rates. They will share the same rate object ID and each of these rates will be stored in its own column, to be evaluated separately for accuracy.

If there are multiple rates due to modifiers or revenue codes, we select the rate with a NULL modifier and/or NULL revenue code. This exploratory analysis shows the prevelance of these NULL rates. If there are NO rates with NULL modifiers or revenue codes, the max(rate) is selected. We believe the max would be most appropriate to identify the broad-network rate.

Modifiers:

We try to identify the GLOBAL rate. Typically, the global rate will be stored with a NULL array or as the longest array.

The example below illustrates the hierarchy used to select a single rate. Here, the "NULL" modifier would be preferred over the array containing TC, 26, 52, 53

billing_code_modifierhierarchy
NULL1
TC, 26, 52, 532
TC3
264

3. Gross Charges

Hospital MRF Gross Chages

The source table is hospital_rates, filtered to commercial payer class

Preprocessing

  • Codes are normalized (e.g., DRG codes are zero-padded, APR-DRG split-parts are standardized).
  • Bill type (Inpatient, Outpatient, Professional) is derived from billing code type and class.

Crosswalk for DRG Codes

  • A crosswalk table maps MS-DRG to APR-DRG and vice versa.
  • If a crosswalked code is not already present in the hospital rates, a synthetic row is added with the average charge for the original code.

Median Aggregations

  • Hospital gross charges are reduced to provider-level rows using the highest posted value.
  • Medians are calculated at:
    • CBSA level
    • State level

CCR Adjustments

When computing CBSA/state-level benchmarks, cost-report-based charges-relative-to-CBSA ratios can adjust gross charges:

  • CBSA-level: cbsa_ip_ratio_relativity
  • State-level: state_ip_ratio_relativity

You can find these values at tq_intermediate.cld_utils.cost_report_charge_relativities_2025_04_03.

From Kost on Slack re: why gross charges may be missing in hospital MRFs and CCR adjustments

On the MRF (we'll never know for sure, but)

  • For OP Surgery it gets tricky because of soft codes. I'd say typically OP surgery wouldn't have a charge rate, and if it does, it's probably based on some sort of claims estimate (which makes me think claim level). That could be a totally wrong assumption, though. There are probably ways to test this. If you took a MRF gross charge & compared it to the market benchmarks charges on a claim level vs line level, what does it come closer to?
  • For OP non surgery i'd guess it's line level if it's on the MRF. Labs, imaging, etc. I doubt they are reporting claim level cases since these are generally hard coded.
  • For IP MS-DRGs, it's claim level. In fact, hospitals don't typically have set charges for inpatient cases, so they likely won't have rates. If they do, it's almost certainly a claim based average.

On CBSA/State averages:

If we're using CBSA averages, we may want to think about incorporating medicare cost report factors. If the market is on average at $1000 for a charge, but the hospital is generally 20% above the market, we shouldn't just use $1000. I put this together last week (attached) that shows where hospitals are relative to their CBSA / State on charges.

link to thread


Komodo Claims-Based Gross Charges

  • Provider-level: tq_intermediate.claims_benchmarks.claims_benchmarks_gross_charges_npi
  • CBSA-level: tq_intermediate.claims_benchmarks.claims_benchmarks_gross_charges_cbsa
  • State-level: tq_intermediate.claims_benchmarks.claims_benchmarks_gross_charges_state

Each source provides:

  • Median billed units
  • Median gross charges at both line-item and claim level
  • Count of encounters

Aggregated Columns

For each level (provider, CBSA, state), the pipeline outputs:

  • Komodo gross charge (final selected value based on standardization/adjustment logic)
  • Count of encounters

CCR Adjustments

Same as above


Dosage Standardization for Drugs (Codes with ASP Payment Limits)

Dosage standardization ensures comparability of posted or claims-derived charges for drugs with varying package sizes or doses.

ASP Payment Limits

  • ASP payment limit: A benchmark unit price from CMS.
  • ASP lower/upper bounds: Defined as payment_limit × config multipliers (e.g., 200%–2200%).

Standardization Formula

For hospital-posted MRF rates: Standardized Charge=gross_chargeparsed_quantity×asp.quantity\text{Standardized Charge} = \frac{\text{gross\_charge}}{\text{parsed\_quantity}} \times \text{asp.quantity}

For Komodo claims-based rates: Standardized Line-Item Charge=gross_charge_line_itemmedian_billed_units×asp.quantity\text{Standardized Line-Item Charge} = \frac{\text{gross\_charge\_line\_item}}{\text{median\_billed\_units}} \times \text{asp.quantity}

Acceptance Criteria

  • The standardized value must fall between ASP lower and upper bounds.
  • If outside these bounds:
    • Hospital rates are set to NULL.
    • Komodo line-item rates are set to NULL.
    • For claims with claim-level charges, these are used only for inpatient or surgical codes.

Output Metadata

  • JSON objects record:
    • Raw gross charge
    • Parsed or median billed quantity
    • ASP quantity and unit label