Transformations
Raw rates arrive in different units and can't be compared directly. Transformations convert them to a common unit — dollars per service — while preserving all original values.
Overview
Why Transform?
The comparability problem:
- Hospital A: "125% of gross charges" for code 99213 — meaningless without knowing the gross charge
- Hospital B: "$2,500 per day" for MS-DRG 470 — can't compare to a case rate without length of stay
- Hospital C: "$500 per 40mg vial" for drug J0135 — different dosage unit than what the payer's rate covers
- Payer: "$200 fee schedule rate" — already in dollars, no transformation needed
Transformations normalize the first three to dollars. The payer's $200 passes through unchanged.
Raw rate types that require transformation
| Rate type | Example | What's needed | Transformation |
|---|---|---|---|
| % of billed charges | Hospital reports 130% for CPT 27447 | Gross charge for that code at that provider | 0.01 × pct × gross_charge — computed for all 6 GC sources (36 output columns) |
| Per diem | $2,500/day for MS-DRG 470 | Geometric mean LOS (GLOS) from CMS | GLOS is attached as a reference column; per diem is kept as-is for accuracy scoring to compare |
| Drug dosage | $500 per 40mg vial of J0135 | Standard ASP dosage quantity for the HCPCS code | (rate / source_quantity) × asp_quantity — three methods in order of preference |
| Anesthesia | Payer reports rate per base unit or per case | Anesthesia base units for the code; payer-specific convention | Payer-specific formula — some divide by base_units, some use rate as-is, one multiplies by 15 |
Additive, not destructive
Transformations add new columns to the combined raw table — they do not overwrite or drop originals. The output table tmp_int_transformations_{sub_version} contains all columns from tmp_int_combined_raw plus transformation-specific columns:
- 36 pct-to-dollar columns (6 rate types × 6 gross charge sources)
- GLOS / ALOS reference columns for per diem codes
- Drug dosage standardized columns (3 methods × dollar + percentage variants, per contract_methodology)
- Anesthesia conversion columns per negotiated_type, plus a conversion_method string
Most rates don't need any transformation — dollar-denominated rates pass through unchanged, with all transformation columns NULL for that ROID. NULL transformation column ≠ missing rate; it usually means the raw rate is already usable as-is.
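The additive behaviour can be illustrated with a minimal sketch. The column names `hospital_pct`, `gross_charge`, and `pct_to_dol` and the helper `add_pct_to_dollar` are hypothetical stand-ins, not the pipeline's actual schema; the point is only that originals are copied through and the new column is NULL when there is nothing to transform.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def add_pct_to_dollar(row: Row) -> Row:
    """Return a copy of `row` with one extra column; original keys are untouched."""
    pct = row.get("hospital_pct")            # hypothetical column name
    gross_charge = row.get("gross_charge")   # hypothetical column name
    out = dict(row)  # copy, so the raw row is never mutated
    if pct is None or gross_charge is None:
        # NULL here means "nothing to transform", not "rate is missing"
        out["pct_to_dol"] = None
    else:
        out["pct_to_dol"] = 0.01 * pct * gross_charge
    return out


# A dollar-denominated row passes through with a NULL transformation column.
dollar_row = {"roid": 1, "hospital_dollar": 200.0, "hospital_pct": None, "gross_charge": None}
# A percentage row gains a dollar column while keeping the original 130 and 18000.
pct_row = {"roid": 2, "hospital_dollar": None, "hospital_pct": 130, "gross_charge": 18000}
```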
Example: percentage rate before and after
In tmp_int_combined_raw:
payer_id=42 | provider_id=P123 | code=27447 | bill_type=Outpatient
hospital_pct_of_total_billed_charges_pct=130 | hospital_fee_schedule_dollar=NULL
mrf_gross_charge_provider=18000 | mrf_gross_charge_cbsa_median=17500
In tmp_int_transformations (added columns):
hospital_perc_of_total_billed_charges_gc_hosp_perc_to_dol = 0.01 × 130 × 18000 = $23,400
hospital_perc_of_total_billed_charges_gc_hosp_cbsa_perc_to_dol = 0.01 × 130 × 17500 = $22,750
glos = NULL (not a DRG) | drug_dosage_std = NULL (not a drug)
The original 130% and $18,000 gross charge are preserved. Two new dollar columns are added — one per gross charge source. Accuracy scoring will prefer the provider-level transform over the CBSA median.
Per-Diem to Case Rate
Make daily rates comparable to case rates using the CMS geometric mean length of stay (GLOS).
Walk-through: per-diem transformation
- Input: Hospital reports per-diem rate of $2,500/day for MS-DRG 470 (Knee Replacement)
- GLOS lookup: CMS MS-DRG weights table says DRG 470 has GLOS = 2.4 days
- Transform: Per-diem rates aren't multiplied — they're carried through as-is, but the GLOS is used in accuracy scoring
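- Comparison: When this rate is evaluated against a case rate, accuracy scoring compares the per diem with case_rate / GLOS. For a case rate of $6,000, that is $2,500 vs $6,000 / 2.4 = $2,500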
Per-diem rates are kept in their native form but evaluated against case_rate / GLOS during accuracy scoring. This avoids compounding errors from multiplying rate × GLOS.
APR-DRG GLOS: For APR-DRG codes, the GLOS is the average of mapped MS-DRG GLOS values from the crosswalk table.
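A minimal sketch of the comparison described above, assuming the per diem is left untouched and the case rate is scaled down by GLOS. The function names and the $6,000 case rate are illustrative assumptions, not pipeline identifiers.

```python
from statistics import mean
from typing import Optional, Sequence


def case_rate_as_per_diem(case_rate: float, glos: float) -> float:
    """Express a case rate per day so it can be compared with a per-diem rate."""
    if glos <= 0:
        raise ValueError("GLOS must be positive")
    return case_rate / glos


def apr_drg_glos(mapped_ms_drg_glos: Sequence[float]) -> Optional[float]:
    """GLOS for an APR-DRG: the average GLOS of the MS-DRGs it maps to."""
    if not mapped_ms_drg_glos:
        return None  # no crosswalk match, so no reference GLOS
    return mean(mapped_ms_drg_glos)


per_diem = 2500.0                                        # kept in its native form
comparable = case_rate_as_per_diem(6000.0, glos=2.4)     # hypothetical case rate
relative_gap = abs(per_diem - comparable) / comparable   # near zero for this pair
```

Dividing the case rate rather than multiplying the per diem keeps the reported rate untouched, which matches the stated goal of not compounding errors into the hospital's own figure.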
Percentage to Dollar
Convert percentage rates to dollar amounts using gross charges. The largest transformation: 6 GC sources × 6 rate types = 36 output columns.
Walk-through: pct-to-dollar
- Input: Hospital reports "125% of gross charges" for code 99213
- Gross charge lookup: Provider-level MRF gross charge = $160
- Formula: 0.01 × 125 × $160 = $200
- This transform is computed for ALL 6 gross charge sources:
payer_gc_hosp_perc_to_dol = 0.01 × 125 × $160 = $200.00 (provider MRF)
payer_gc_hosp_cbsa_perc_to_dol = 0.01 × 125 × $155 = $193.75 (CBSA median)
payer_gc_hosp_state_perc_to_dol = 0.01 × 125 × $148 = $185.00 (state median)
payer_gc_komodo_perc_to_dol = 0.01 × 125 × $162 = $202.50 (Komodo provider)
payer_gc_komodo_cbsa_perc_to_dol = 0.01 × 125 × $150 = $187.50 (Komodo CBSA)
payer_gc_komodo_state_perc_to_dol = 0.01 × 125 × $145 = $181.25 (Komodo state)
Accuracy scoring impact: Each of the 36 pct-to-dollar columns gets its own accuracy score. Provider-level GC transforms (gc_hosp) score higher than geographic medians (gc_hosp_cbsa, gc_hosp_state) because they are more specific. Since accuracy drives rate selection, selection automatically prefers the best available gross charge source.
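The per-source computation can be sketched as a loop over the six gross charge sources. The source suffixes are taken from the column names in the walk-through; the helper name, the dict-based input, and rounding to cents are assumptions for illustration.

```python
from typing import Dict, Optional

# Suffixes as they appear in the walk-through's column names.
GC_SOURCES = [
    "gc_hosp", "gc_hosp_cbsa", "gc_hosp_state",
    "gc_komodo", "gc_komodo_cbsa", "gc_komodo_state",
]


def pct_to_dollar_columns(
    prefix: str,
    pct: Optional[float],
    gross_charges: Dict[str, Optional[float]],
) -> Dict[str, Optional[float]]:
    """Build one `<prefix>_<source>_perc_to_dol` column per gross charge source."""
    out: Dict[str, Optional[float]] = {}
    for source in GC_SOURCES:
        gc = gross_charges.get(source)
        column = f"{prefix}_{source}_perc_to_dol"
        if pct is None or gc is None:
            out[column] = None  # no percentage rate or no gross charge for this source
        else:
            out[column] = round(0.01 * pct * gc, 2)  # rounding to cents is an assumption
    return out


columns = pct_to_dollar_columns(
    "payer", 125, {"gc_hosp": 160, "gc_hosp_cbsa": 155, "gc_hosp_state": 148}
)
```

Repeating this for each of the six rate types would yield the 36 output columns; sources with no gross charge simply stay NULL.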
Drug Dosage Standardization
Normalize drug rates to standard dosage units per HCPCS drug code using ASP (Average Sales Price) reference.
Walk-through: drug dosage
- Problem: Hospital reports "$500 per vial" for J0135 (Adalimumab), but the vial contains 40mg while ASP prices per 20mg
- Method 1 (Parsed Quantity): Parse "40 MG" from drug description → ($500 / 40) × 20 = $250
- Method 2 (NDC-based): NDC crosswalk says this vial has strength 40mg → ($500 / 40) × 20 = $250
- Method 3 (Drug Unit): drug_unit_of_measurement = 40 → ($500 / 40) × 20 = $250
All 3 methods converge on $250 per standard ASP dosage unit. Best method selected: prefer Method 1 > 2 > 3.
ASP validation: Drug rates outside 200-2200% of ASP payment limit are flagged as outliers in accuracy scoring. This catches unit-of-measure errors (e.g., pricing per mg instead of per vial).
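As a rough sketch of the method preference and the ASP check, assuming each method only differs in where the source quantity comes from. The function names, method labels, and parameter names are hypothetical; the 200-2200% bounds are the ones stated above.

```python
from typing import Optional, Tuple


def standardize_drug_rate(
    rate: float,
    asp_quantity: float,
    parsed_quantity: Optional[float] = None,     # Method 1: parsed from description
    ndc_quantity: Optional[float] = None,        # Method 2: NDC crosswalk strength
    drug_unit_quantity: Optional[float] = None,  # Method 3: drug_unit_of_measurement
) -> Tuple[Optional[float], Optional[str]]:
    """Apply (rate / source_quantity) * asp_quantity using the first available method."""
    candidates = [
        ("parsed_quantity", parsed_quantity),
        ("ndc", ndc_quantity),
        ("drug_unit", drug_unit_quantity),
    ]
    for method, source_quantity in candidates:
        if source_quantity:  # skips None and 0
            return (rate / source_quantity) * asp_quantity, method
    return None, None  # no usable quantity, so the standardized columns stay NULL


def is_asp_outlier(
    std_rate: float,
    asp_payment_limit: float,
    low_pct: float = 200.0,
    high_pct: float = 2200.0,
) -> bool:
    """Flag a standardized rate that falls outside the stated band around ASP."""
    ratio_pct = 100.0 * std_rate / asp_payment_limit
    return ratio_pct < low_pct or ratio_pct > high_pct


std_rate, method = standardize_drug_rate(500.0, asp_quantity=20, parsed_quantity=40)
```

A rate priced per mg instead of per vial would land far below the lower bound after standardization, which is how the check surfaces unit-of-measure errors.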
DRG Crosswalk
Convert APR-DRG to MS-DRG (federal standard) for national comparability.
Walk-through: APR-DRG → MS-DRG
- Input: NY hospital reports rates for APR-DRG 302 at SOI levels 1-4, ranging from $8K to $20K
- Crosswalk: APR-DRG 302 maps to MS-DRG 469/470 (Major Joint Replacement)
- Average: sum of the four SOI rates / 4 = $11,250
- Original preserved:
original_billing_codes = ['302-1','302-2','302-3','302-4']
Crosswalked rate: MS-DRG 470 = $11,250 (average across SOI levels).
Known bias: Averaging across severity levels inflates the crosswalked rate by 3-15% compared to native MS-DRG rates. This is accepted because APR-DRG data would otherwise be unusable for national comparison.
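The averaging step might look like the sketch below. The function name, output keys other than original_billing_codes, and the four SOI rates are illustrative assumptions; the rates are made-up values, not the ones from the walk-through.

```python
from statistics import mean
from typing import Any, Dict


def crosswalk_apr_to_ms_drg(soi_rates: Dict[str, float], ms_drg: str) -> Dict[str, Any]:
    """Collapse per-SOI APR-DRG rates into one MS-DRG rate by simple averaging."""
    return {
        "billing_code": ms_drg,
        "rate": mean(soi_rates.values()),
        # keep the source codes so the crosswalked row stays traceable
        "original_billing_codes": sorted(soi_rates),
    }


# Hypothetical rates for APR-DRG 302 at SOI levels 1 through 4.
soi_rates = {"302-1": 8000, "302-2": 10000, "302-3": 12000, "302-4": 14000}
row = crosswalk_apr_to_ms_drg(soi_rates, ms_drg="470")
```

An unweighted mean gives the rarer high-severity levels the same weight as the common low-severity ones, which is one way to see where the upward bias comes from.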
Anesthesia Conversion
Anesthesia payer rates can't be compared directly — different payers report them in different units. The pipeline applies payer-specific formulas to convert them to dollars per case.
The Problem
Anesthesia services (HCPCS codes like 00840, 01402) are priced using "base units" — a code-specific weight published by CMS. Some payers report:
- A rate per base unit — e.g., 640
- A rate per total units (base + time) — e.g., 80
- A flat case rate — e.g., $640 total (no unit math needed)
Without knowing which convention a payer uses, the raw rate is uninterpretable. The transformation produces a dollar-per-case estimate using the payer's known convention and the code's anesthesia base units from ANESTHESIA_BASE_UNITS.
Payer-Specific Formulas
| Payer IDs | Formula | Convention |
|---|---|---|
| 47, 391, 643, 567 | rate (as-is) | Payer reports a flat case rate — pass through unchanged |
| 76 (code 99100 only) | rate (as-is) | Special case for qualifying circumstances code |
| 44 | rate × 15 | Payer reports rate per minute; 1 base unit = 15 minutes |
| 7, 42, 160, 389, 392, 229, 53, 54, 403, 388, 461, 111, 61, 56, 51, 390 | rate / (base_units + 1) | Payer reports rate per total units (base + 1 time unit) |
| 299 | rate / (base_units + 15) | Payer bundles 15 extra time units into the reported rate |
| 43, 101, 168, 169, 628, 638, 49, 76 (all codes other than 99100) | rate / base_units | Payer reports rate per base unit; divide to normalize to per-case |
Only applies to payer MRF rates. Anesthesia conversion is computed for payer_negotiated_rate, payer_derived_rate, and payer_fee_schedule_rate only. Hospital MRF rates are not converted. If the ROID has no anesthesia base units entry, all conversion columns are NULL.
Output Columns
For each of the three payer negotiated types, two columns are added to tmp_int_transformations:
- payer_{type}_rate_anesthesia_cf: the converted dollar amount
- payer_{type}_anesthesia_conversion_method: a human-readable string of the formula applied (e.g., '{rate} / ({anesthesia_base_units} + 1)')
Plus anesthesia_base_units — the base unit count for the code, for traceability.
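The payer-specific table can be read as a top-to-bottom dispatch. The sketch below assumes that ordering, returns NULLs for payers not listed (an assumption, since the text does not say what happens to them), and uses method strings modelled on the one example given; only the "+ 1" string comes from the text.

```python
from typing import Optional, Tuple

AS_IS = {47, 391, 643, 567}
PER_TOTAL_UNITS = {7, 42, 160, 389, 392, 229, 53, 54, 403, 388, 461, 111, 61, 56, 51, 390}
PER_BASE_UNIT = {43, 101, 168, 169, 628, 638, 49, 76}

Result = Tuple[Optional[float], Optional[str]]


def convert_anesthesia(
    payer_id: int, code: str, rate: Optional[float], base_units: Optional[float]
) -> Result:
    """Return (converted amount, conversion method string) for one payer rate."""
    if rate is None or base_units is None:
        return None, None  # no base units entry: all conversion columns NULL
    if payer_id in AS_IS or (payer_id == 76 and code == "99100"):
        return rate, "{rate}"
    if payer_id == 44:
        return rate * 15, "{rate} * 15"
    if payer_id in PER_TOTAL_UNITS:
        return rate / (base_units + 1), "{rate} / ({anesthesia_base_units} + 1)"
    if payer_id == 299:
        return rate / (base_units + 15), "{rate} / ({anesthesia_base_units} + 15)"
    if payer_id in PER_BASE_UNIT:
        if base_units == 0:
            return None, None  # avoid dividing by zero; handling is an assumption
        return rate / base_units, "{rate} / {anesthesia_base_units}"
    return None, None  # payer not in the table; behaviour here is an assumption


amount, method = convert_anesthesia(42, "00840", 640.0, 7)
```

Checking the payer 76 special case before the general payer 76 rule is what lets both table rows coexist.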