Clear Rates Methodology
The Clear Rates pipeline turns raw payer and hospital machine-readable files into a single, scored, canonical rate per rate object. This section documents each stage of that pipeline: how rates are collected, standardized, scored, selected, and delivered.
Pipeline Stages
| Stage | What it does | SQL |
|---|---|---|
| Spines | Build canonical reference tables for payers, networks, providers, and codes | source |
| Rate Object Space | Compute the Cartesian product of valid (payer, provider, code, network) combinations | source |
| Raw Data | Ingest payer MRF, hospital MRF, claims, and gross charges; map onto the rate object space | source |
| Transformations | Standardize non-dollar rates (percentages, per diems, drug dosages) into comparable dollar amounts | source |
| Imputations | Estimate missing rates using a hierarchical fallback chain (RC global → RC HCPCS → carveouts → DRG → CSTM) | source |
| Accuracy | Score every rate 0–7 using outlier bounds, benchmark comparisons, and payer/hospital counterparty validation | source |
| Rate Selection | Pick the highest-scoring rate per ROID; translate to canonical score 1–5 | source |
| Output | Merge sub-version outputs into production tables; enrich with whispers and rollup views | source |
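The Imputations stage is described as a hierarchical fallback chain. A minimal Python sketch of that pattern follows, assuming the arrows mean "try each level in order and stop at the first one that yields an estimate". The level names come from the table; the estimator interface, the toy lookup tables, and the `roid-…` keys are hypothetical placeholders, not the pipeline's actual SQL.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical interface: each estimator takes a rate-object key and returns
# a dollar estimate, or None when that level has nothing to offer.
Estimator = Callable[[str], Optional[float]]


def impute_rate(
    roid: str,
    chain: Sequence[Tuple[str, Estimator]],
) -> Optional[Tuple[str, float]]:
    """Return (level_name, estimate) from the first level that yields a value."""
    for level_name, estimate in chain:
        value = estimate(roid)
        if value is not None:
            return level_name, value
    return None  # no level in the chain could produce an estimate


# Toy lookups standing in for the real imputation logic (made-up values).
rc_global = {"roid-1": 120.0}
drg = {"roid-1": 150.0, "roid-2": 900.0}

# Order mirrors the chain in the table; most levels are stubbed out here.
chain = [
    ("RC global", rc_global.get),
    ("RC HCPCS", lambda roid: None),
    ("carveouts", lambda roid: None),
    ("DRG", drg.get),
    ("CSTM", lambda roid: None),
]

print(impute_rate("roid-1", chain))  # ('RC global', 120.0)
print(impute_rate("roid-2", chain))  # ('DRG', 900.0)
print(impute_rate("roid-3", chain))  # None
```

Returning the level name alongside the estimate keeps the provenance of each imputed value, which a later scoring step could take into account.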
Key Concepts
Spines — curated reference tables that define the universe of payers, providers, networks, and codes. Spines standardize identifiers across data sources and determine which entities appear in the Rate Object Space. See Spines.
ROID — Rate Object ID. A unique combination of (payer_id, network_id, provider_id, billing_code, billing_code_type, bill_type). Every table in the pipeline is keyed on ROIDs.
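As an illustration of the composite key, the sketch below models a ROID as an immutable tuple of the six fields listed above. The field names are from this page; the string types and the sample values are assumptions made for the example.

```python
from typing import NamedTuple


class ROID(NamedTuple):
    """Composite key for one rate object (field types are assumed)."""
    payer_id: str
    network_id: str
    provider_id: str
    billing_code: str
    billing_code_type: str
    bill_type: str


# Made-up sample values, for illustration only.
a = ROID("p1", "n1", "prov1", "99213", "HCPCS", "Professional")
b = ROID("p1", "n1", "prov1", "99213", "HCPCS", "Professional")
c = a._replace(bill_type="Outpatient")  # differs in one field only

rates = {a: 100.0}
rates[b] = 110.0  # b equals a field by field, so this overwrites the entry
rates[c] = 250.0  # c is a distinct rate object

print(a == b)      # True
print(len(rates))  # 2
```

Two records refer to the same rate object only when all six fields match, so changing any one field, such as `bill_type`, produces a different ROID.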
Canonical — the final, selected values that represent Clear Rates' best answer for a ROID. canonical_rate is the single dollar rate chosen by the Rate Selection algorithm, and canonical_rate_score (1–5) is its confidence score. See Score Hierarchy for the full mapping.
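The selection step can be sketched as a per-ROID maximum over scored candidates. This is a simplified illustration, not the Rate Selection algorithm itself: the `Candidate` type, the `source` labels, and the tie-breaking rule (the first candidate seen wins) are assumptions, and the translation from the raw 0–7 score to the 1–5 canonical score is left out because that mapping is defined in Score Hierarchy.

```python
from typing import Dict, Iterable, NamedTuple


class Candidate(NamedTuple):
    """One scored candidate rate for a rate object (hypothetical shape)."""
    roid: str     # stand-in for the full composite ROID key
    rate: float   # dollar amount
    score: int    # raw accuracy score, 0-7
    source: str   # illustrative label for where the rate came from


def select_rates(candidates: Iterable[Candidate]) -> Dict[str, Candidate]:
    """Keep the highest-scoring candidate per ROID; ties keep the first seen."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        current = best.get(cand.roid)
        if current is None or cand.score > current.score:
            best[cand.roid] = cand
    return best


# Made-up candidates for two rate objects.
candidates = [
    Candidate("roid-1", 100.0, 4, "payer"),
    Candidate("roid-1", 105.0, 6, "hospital"),
    Candidate("roid-1", 98.0, 6, "imputed"),
    Candidate("roid-2", 900.0, 2, "imputed"),
]

selected = select_rates(candidates)
print(selected["roid-1"].rate)    # 105.0
print(selected["roid-1"].source)  # hospital
print(selected["roid-2"].rate)    # 900.0
```

In this sketch the selected candidate's `rate` plays the role of `canonical_rate`; `canonical_rate_score` would be derived from its raw `score` using the documented mapping.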