Version: 3.0

Accuracy Scoring

Every rate column on every ROID row receives a numeric score reflecting how trustworthy it is. Rate selection in Stage 7 simply picks the highest-scored non-NULL value per ROID.

Overview

Why Accuracy Scores Exist

Scenario A — Validated: Payer MRF reports $150 for CPT 99213 at Provider X. Hospital MRF independently reports $145. Agreement within ~3%. → Both rates get score 7. Rate selection picks the payer rate.
Scenario B — Unvalidated: Payer MRF reports $150. No hospital MRF record for this ROID. Single-source rate within outlier bounds. → Score 4. Usable, but will lose to any validated rate.
Scenario C — Outlier: Payer MRF reports $5,000 for a routine office visit. Medicare MPFS benchmark is $120. This is roughly a 40× multiple, almost certainly a data error. → Score 1. Rate selection deprioritizes this rate unless it's the only option.

Result

The score system lets rate selection automatically prefer the most trustworthy value — no manual curation required.
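
The selection rule described above (highest score among non-NULL values) can be sketched in a few lines. This is a minimal illustration, not the pipeline's code: the column names and the `select_rate` helper are hypothetical, and the scores reuse Scenario A with the `rate / 1e8` tiebreaker described later on this page.

```python
from typing import Optional

# Each entry maps a (hypothetical) rate column to (rate, accuracy score).
# A rate of None stands in for SQL NULL.
ScoredRates = dict[str, tuple[Optional[float], float]]


def select_rate(scored_rates: ScoredRates) -> Optional[str]:
    """Return the column with the highest score among non-NULL rates."""
    candidates = {
        column: score
        for column, (rate, score) in scored_rates.items()
        if rate is not None
    }
    if not candidates:
        return None  # every column was NULL for this ROID
    return max(candidates, key=candidates.get)


# Scenario A: both rates are validated (tier 7); the decimal part is rate / 1e8.
roid = {
    "payer_negotiated_rate": (150.0, 7 + 150.0 / 1e8),
    "hospital_negotiated_rate": (145.0, 7 + 145.0 / 1e8),
    "payer_imputed_rate": (None, 0.0),
}
selected = select_rate(roid)
print(selected)
```

Because the tiebreaker is folded into the score itself, a single `max` is enough and no secondary sort key is needed.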

Two Rounds of Scoring

Raw Accuracy (chunked by payer) — Runs immediately after combined_raw is built. Scores all raw and transformed rate columns. Computes payer_rates_array and hospital_rates_array for counterparty checking. Output is also used to filter long rates for imputation (only non-outlier rates feed in). → tmp_int_accuracy_raw
BRIT Combination — Benchmarks + Raw + Imputations + Transformations are merged into one wide table per ROID. → tmp_int_combined_brit
BRIT Accuracy (by rate/provider type) — Second scoring pass after BRIT is built. Separate SQL and criteria for Medical, Drugs, Labs, PG, DME, Urgent Care — each type has different outlier bounds and validation rules. Runs in parallel. → tmp_int_accuracy_brit_medical, _drugs, _labs, _physician_groups, _dme, _urgent_care
BRIT Accuracy Union — All type-specific BRIT accuracy tables merged into the final accuracy table. → tmp_int_accuracy_brit

Score Scale (0–7) vs Canonical Rate Score (1–5)

These are two different scales — a common point of confusion:

Scale	Range	Where used	Purpose
Accuracy score	0–7 (with decimal tiebreaker)	Internal pipeline tables (`_validation_score` columns)	Determines which rate wins in rate selection
Canonical rate score	1–5	Output tables (`prod_combined_abridged`, `prod_combined_all`)	User-facing confidence indicator for the selected rate

The canonical rate score is a simplified translation of the winning rate's accuracy score — published so downstream users can filter by confidence without needing to understand the internal 0–7 system.

Outlier Bounds

Rates must fall within Medicare-anchored ranges. Different code types have different acceptable multiplier ranges.

Rate Type	Lower Bound	Upper Bound	Notes
Medical (Non-IP)	0.5× Medicare	30× Medicare	State percentile fallbacks when Medicare unavailable
Medical (Inpatient)	0.9× Medicare	10× Medicare	Stricter — IP rates cluster tighter around Medicare
Medical (HMO/Exchange IP)	0.5× Medicare	10× Medicare	Looser lower bound for managed care contracts
Drugs (Payer source)	0.8× Medicare	10× Medicare	DPR exception: up to 1000% allowed for payer rates
Drugs (Hospital source)	0.8× Medicare	4× Medicare	Hospital drug rates more constrained
Labs	0.2× Medicare	4.5× Medicare	Lab rates often very close to Medicare
DME	0.5× Medicare	5.5× Medicare
Physician Group	0.5× Medicare	5.5× Medicare	Anesthesia codes: $1–$200 hard cap

info

Why Medicare? Medicare rates (IPPS, OPPS, MPFS, ASP) are publicly available, actuarially grounded, and exist for nearly every billing code. They provide the anchor for "is this negotiated rate plausible?"

warning

Fallback bounds: When Medicare rate is unavailable for a code, state-level percentiles (30th and 70th) from the pre-computed outlier bounds table are used instead. These are computed from the distribution of all observed rates for that code.
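
A minimal sketch of the bounds check, using the multipliers from the table above. The rate-type keys and the `is_outlier` helper are hypothetical. Two assumptions are made that the text does not confirm: bounds are inclusive, and the state 30th/70th percentiles are used directly as the lower and upper bound when no Medicare rate exists. The anesthesia dollar cap and the DPR exception are not modeled.

```python
from typing import Optional

# (lower multiple, upper multiple) of the Medicare rate, from the table above.
MEDICARE_BOUNDS = {
    "medical_non_ip": (0.5, 30.0),
    "medical_ip": (0.9, 10.0),
    "medical_hmo_exchange_ip": (0.5, 10.0),
    "drugs_payer": (0.8, 10.0),
    "drugs_hospital": (0.8, 4.0),
    "labs": (0.2, 4.5),
    "dme": (0.5, 5.5),
    "physician_group": (0.5, 5.5),
}


def is_outlier(
    rate: float,
    rate_type: str,
    medicare_rate: Optional[float] = None,
    state_p30: Optional[float] = None,
    state_p70: Optional[float] = None,
) -> bool:
    """Return True when the rate falls outside the acceptable range."""
    if medicare_rate is not None:
        low_mult, high_mult = MEDICARE_BOUNDS[rate_type]
        lower, upper = low_mult * medicare_rate, high_mult * medicare_rate
    elif state_p30 is not None and state_p70 is not None:
        # Assumption: percentiles act directly as the fallback bounds.
        lower, upper = state_p30, state_p70
    else:
        raise ValueError("no Medicare rate and no fallback percentiles")
    return not (lower <= rate <= upper)


# Scenario C: $5,000 against a $120 benchmark exceeds 30 x $120 = $3,600.
print(is_outlier(5000.0, "medical_non_ip", medicare_rate=120.0))
print(is_outlier(150.0, "medical_non_ip", medicare_rate=120.0))
```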

Normal CDF Scoring

Normal CDF scoring treats rates as log-normally distributed: it takes the natural log of each rate and uses a normal CDF to quantify how "common" the rate is relative to the observed distribution for that code.

Walk-through: CDF computation

Setup: Code 99213, pre-computed stats: median=4.8 (log-space), stddev=0.6
Rate A: $150 → ln($150)=5.01 → epsilon=0.05×4.8=0.24
CDF window: normal_cdf(4.8, 0.6, 5.01+0.24) - normal_cdf(4.8, 0.6, 5.01-0.24)
Result: CDF = 0.78 (78% of observed rates fall in a similar range → very common)
Rate B: $500 → ln($500)=6.21 → CDF = 0.05 (only 5% of rates are this high → unusual)

Result

Rate A gets score 6.78, Rate B gets score 6.05. Both are "raw, not outlier" (integer tier 6), but A is preferred as tiebreaker.
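
The window computation from the walk-through can be sketched as follows. This is an assumption-laden illustration: `normal_cdf` is implemented here with `math.erf` as the standard normal CDF with the same argument order as the walk-through, and `cdf_window` and `tier_score` are hypothetical helpers. The pipeline's own `normal_cdf` and pre-computed statistics may differ, so the printed values are not expected to reproduce the illustrative figures above.

```python
import math


def normal_cdf(mean: float, stddev: float, x: float) -> float:
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (stddev * math.sqrt(2.0))))


def cdf_window(
    rate: float,
    log_median: float,
    log_stddev: float,
    is_drug_code: bool = False,
    epsilon_frac: float = 0.05,
) -> float:
    """Probability mass within +/- epsilon of ln(rate), in log space."""
    if is_drug_code:
        return 0.0  # drug codes skip distribution-based scoring
    x = math.log(rate)
    epsilon = epsilon_frac * log_median
    upper = normal_cdf(log_median, log_stddev, x + epsilon)
    lower = normal_cdf(log_median, log_stddev, x - epsilon)
    return upper - lower


def tier_score(tier: int, cdf: float) -> float:
    """Integer tier plus the CDF value as the decimal tiebreaker."""
    return tier + cdf


cdf_a = cdf_window(150.0, log_median=4.8, log_stddev=0.6)
cdf_b = cdf_window(500.0, log_median=4.8, log_stddev=0.6)
print(tier_score(6, cdf_a), tier_score(6, cdf_b))
```

Whatever the exact values, the ordering is what matters for selection: the rate closer to the median gets the larger window and therefore the higher score within the same tier.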

info

Drug exception: Drug codes (is_drug_code=true) get CDF=0, not distribution-based scoring. Drug rates are evaluated purely against ASP bounds because drug pricing distributions are highly bimodal (brand vs generic).

Counterparty Validation (Score = 7)

The highest confidence signal: payer and hospital independently report similar rates for the same ROID.

The Logic

For each payer rate column, check if it matches any hospital rate within ±20% (or ±10% for rates above $15,000). And vice versa for hospital rates.

Walk-through: counterparty validation

Setup: Code 99213, Provider X, Payer Y
Payer reports: $1,000 (negotiated rate)
Hospital reports: [$950, $1,050, $2,000] (array from different methodology columns)
Check payer $1,000 vs hospital array:
- vs $950: |$1,000 − $950| = $50 ≤ 20% × $1,000 ($200) → MATCH
Also check: Must pass outlier bounds (0.5-30× Medicare for non-IP) → $1,000 is within bounds

Result

Score = 7 + ($1,000 / 1e8) = 7.00001 (validated). The /1e8 tiebreaker means: among validated rates, higher dollar rates are slightly preferred.
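
The check in the walk-through can be sketched as below. The helper names are hypothetical. Following the walk-through, the tolerance is taken relative to the rate being scored (20% × $1,000), and the $15,000 threshold is assumed to apply to that same rate; the outlier check is passed in as a boolean rather than recomputed.

```python
from typing import Iterable, Optional


def tolerance_fraction(rate: float) -> float:
    """±10% for rates above $15,000, otherwise ±20%."""
    return 0.10 if rate > 15_000 else 0.20


def matches_counterparty(rate: float, counterparty_rates: Iterable[Optional[float]]) -> bool:
    """True if any counterparty rate lies within the tolerance of `rate`."""
    allowed = tolerance_fraction(rate) * rate
    return any(
        abs(rate - other) <= allowed
        for other in counterparty_rates
        if other is not None
    )


def validated_score(
    rate: float,
    counterparty_rates: Iterable[Optional[float]],
    within_bounds: bool,
) -> Optional[float]:
    """Return 7 + rate / 1e8 when validated, else None (a lower tier applies)."""
    if within_bounds and matches_counterparty(rate, counterparty_rates):
        return 7 + rate / 1e8
    return None


hospital_rates_array = [950.0, 1050.0, 2000.0]
score = validated_score(1000.0, hospital_rates_array, within_bounds=True)
print(score)
```

Running the same function with a hospital rate and the payer array covers the reverse direction, so each side is scored independently.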

info

Why ±20%? Payer and hospital MRF files are independently produced. Differences arise from rounding, per-day vs per-case structures, and reporting methodology variations. The 20% tolerance (10% above $15K) accounts for these structural differences while requiring fundamental agreement.

info

Symmetry: Both directions are checked independently. A payer rate gets score 7 if it matches a hospital rate, AND a hospital rate gets score 7 if it matches a payer rate. Both validated rates compete in the final selection.

BRIT Accuracy

A second scoring pass runs after all rate sources — Benchmarks, Raw, Imputations, and Transformations — are merged into one wide table per ROID. Each provider/rate type gets its own accuracy SQL with type-specific outlier bounds and scoring logic.

Why a Second Pass?

Raw accuracy scores all payer and hospital rate columns immediately after combined_raw is built — before imputations or benchmarks exist. BRIT accuracy re-scores everything after combined_brit merges all four layers, so imputation columns get scored for the first time, and previously scored raw columns can be re-evaluated with more context.

BRIT Accuracy Tasks

Six type-specific tasks run in parallel, each reading from tmp_int_combined_brit. Medical is split into three sequential sub-steps:

Drugs — Drug-specific outlier bounds (0.8–10× Medicare for payer; 0.8–4× for hospital). DPR exception: payer drug rates up to 1000% of Medicare allowed. CDF scoring disabled (is_drug_code=true → CDF=0). → tmp_int_accuracy_brit_drugs
Labs — Tight bounds: 0.2–4.5× Medicare. Lab rates cluster near Medicare, so the narrow window is appropriate. → tmp_int_accuracy_brit_labs
Medical — Rates — Scores all payer and hospital raw+transformation columns for non-IP (0.5–30×) and IP (0.9–10×) medical codes. Builds payer_rates_array and hospital_rates_array for counterparty checking. → tmp_int_accuracy_brit_medical_rates
Medical — Imputations — Scores imputation columns for medical ROIDs separately. Imputed rates use a different score ceiling (max score 4, not 7) — they can never beat a validated raw rate. → tmp_int_accuracy_brit_medical_imputations
Medical — Join — Combines medical rates + medical imputations into the final medical accuracy table. → tmp_int_accuracy_brit_medical
Physician Groups — PG-specific: anesthesia codes get a hard dollar cap ($1–$200) in addition to the 0.5–5.5× Medicare multiplier bound. → tmp_int_accuracy_brit_physician_groups
DME — DME codes: 0.5–5.5× Medicare bounds. → tmp_int_accuracy_brit_dme
Urgent Care — Urgent care ROIDs scored separately. Applies standard medical bounds within the urgent care provider type. → tmp_int_accuracy_brit_urgent_care
BRIT Union — All 6 type-specific tables unioned into the final BRIT accuracy table. → tmp_int_accuracy_brit
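
The task ordering above can be expressed as a dependency graph. The sketch below uses Python's standard `graphlib` to group the tables into waves that could run in parallel. It is an illustration only: the orchestration tool the pipeline actually uses is not stated here, and the edge from medical rates to medical imputations is an assumption drawn from the phrase "three sequential sub-steps".

```python
from graphlib import TopologicalSorter

SOURCE = "tmp_int_combined_brit"
PREFIX = "tmp_int_accuracy_brit"

# Table -> set of tables it reads from.
deps = {
    f"{PREFIX}_drugs": {SOURCE},
    f"{PREFIX}_labs": {SOURCE},
    f"{PREFIX}_physician_groups": {SOURCE},
    f"{PREFIX}_dme": {SOURCE},
    f"{PREFIX}_urgent_care": {SOURCE},
    f"{PREFIX}_medical_rates": {SOURCE},
    # Assumption: imputations run after rates (sequential sub-steps).
    f"{PREFIX}_medical_imputations": {SOURCE, f"{PREFIX}_medical_rates"},
    f"{PREFIX}_medical": {
        f"{PREFIX}_medical_rates",
        f"{PREFIX}_medical_imputations",
    },
}
# The union reads the six type-specific tables.
deps[PREFIX] = {
    f"{PREFIX}_{kind}"
    for kind in ("drugs", "labs", "medical", "physician_groups", "dme", "urgent_care")
}


def execution_waves(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tables into waves; tables within a wave have no mutual dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(deps)
for number, wave in enumerate(waves):
    print(number, wave)
```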

What Changes Between Raw and BRIT Accuracy?

Dimension	Raw Accuracy	BRIT Accuracy
Input table	`tmp_int_combined_raw`	`tmp_int_combined_brit`
Columns scored	Payer + hospital raw columns only	Raw + transformed + imputed + benchmark columns
Chunking	By payer	By rate/provider type (parallel)
Counterparty arrays	Built here — used for raw score 7	Rebuilt with full data — final validated scores
Type-specific logic	None — one SQL for all types	Separate SQL per type (drugs, labs, medical, PG, DME, UC)

info

Medical 3-step design: Medical rates and medical imputations are scored separately before joining. This lets the imputation scoring use different score ceilings (imputed rates cannot reach score 7) without complicating the main rates SQL.

info

Output of BRIT accuracy = input to rate selection. tmp_int_accuracy_brit is the final wide table with every rate column and its _validation_score for every ROID. Rate selection reads this table and picks the column with the highest score.

Score Hierarchy

Complete mapping from internal validation scores to user-facing confidence scores.

Internal Score	Label	Confidence (0–5)	Meaning	Decimal Component
7.x	Validated	5	Payer + hospital agree within ±20%	`rate / 1e8` (prefer higher rates)
6.x	Raw, not outlier	4	Single source, within bounds	CDF value (prefer common rates)
5.x	Benchmark-validated	3	Transformed rate within 95-1000% of Medicare	CDF value
4.x	Not outlier	2	Within bounds, no validation	CDF value
3.x	Imputation benchmark	3	Imputation, benchmark-based	CDF value
2.x	Imputation not outlier	2	Imputation within bounds	CDF value
1.x	Outlier	1	Outside all acceptable bounds	CDF value
0	No rate	0	NULL / no rate available	—

info

The decimal tiebreaker is everything. Among 100 payer rates that all score 6.x (raw, not outlier), the one with the highest CDF (most common rate value) wins. Among validated rates (7.x), the one with the highest dollar amount wins. This ensures deterministic selection without arbitrary ordering.
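
The table above can be read as a lookup keyed on the integer part of the internal score. The sketch below encodes it directly; the dictionary and function names are hypothetical, and the mapping values are copied from the table rather than from the pipeline's code.

```python
import math
from typing import Optional

# Integer tier -> (label, user-facing confidence), copied from the table above.
SCORE_HIERARCHY = {
    7: ("Validated", 5),
    6: ("Raw, not outlier", 4),
    5: ("Benchmark-validated", 3),
    4: ("Not outlier", 2),
    3: ("Imputation benchmark", 3),
    2: ("Imputation not outlier", 2),
    1: ("Outlier", 1),
    0: ("No rate", 0),
}


def to_confidence(internal_score: Optional[float]) -> int:
    """Translate an internal accuracy score into the user-facing confidence."""
    if internal_score is None:
        return 0  # NULL is treated like score 0 (no rate)
    tier = math.floor(internal_score)
    return SCORE_HIERARCHY[tier][1]


def pick_winner(scores: dict[str, float]) -> str:
    """Highest internal score wins; the decimal part breaks ties within a tier."""
    return max(scores, key=scores.get)


# Two tier-6 rates: the one with the larger CDF decimal wins.
candidates = {"rate_a": 6.78, "rate_b": 6.05}
winner = pick_winner(candidates)
print(winner, to_confidence(candidates[winner]))
```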
