Version: 3.0

Selection Algorithm

Find the maximum score, then use its array index to select the rate and all associated metadata.

Core Algorithm

Find best score

array_max(rate_score_array) returns the highest accuracy score across all candidate rates for this ROID.

Find best index

ARRAY_POSITION(rate_score_array, best_score) returns the 1-based position of the winning rate in the arrays.

Extract canonical fields

All canonical_* output fields are populated by indexing into the parallel arrays at best_idx: rate_array[best_idx], source_array[best_idx], rate_type_array[best_idx], etc.

→ tmp_int_combined_no_whisp
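The three steps above can be sketched in Python for a single ROID. This is a minimal illustration, not the pipeline's actual code: the function name select_best, the output key canonical_rate_type, and all sample values are assumptions made for the example. Only the array names and the 1-based indexing come from the description above.

```python
def select_best(rate_score_array, rate_array, source_array, rate_type_array):
    """Pick the winning candidate for one ROID from parallel arrays."""
    # Step 1: highest score across all candidates (the array_max step).
    best_score = max(rate_score_array)

    # Step 2: 1-based position of the first element equal to best_score.
    # Python's list.index is 0-based, so add 1 to mirror ARRAY_POSITION.
    best_idx = rate_score_array.index(best_score) + 1

    # Step 3: index every parallel array at the same position.
    pos = best_idx - 1
    return {
        "canonical_rate": rate_array[pos],
        "canonical_rate_score": best_score,
        "canonical_rate_source": source_array[pos],
        "canonical_rate_type": rate_type_array[pos],  # hypothetical field name
        "best_idx": best_idx,
    }


# Made-up sample values, for illustration only.
example = select_best(
    rate_score_array=[3.2, 5.4, 4.1],
    rate_array=[120.0, 135.5, 128.0],
    source_array=["payer", "hospital", "payer"],
    rate_type_array=["type_a", "type_b", "type_c"],
)
print(example["best_idx"], example["canonical_rate"])  # prints: 2 135.5
```

Because every array is indexed at the same position, the selected rate and its metadata cannot drift apart, provided the arrays were built in the same candidate order.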

Special Rules

Score = 7 → "payer_hospital": When the best score is 7.x, canonical_rate_source is always set to "payer_hospital" regardless of which specific column won. Score 7 means both sources independently agreed — the source label reflects that bilateral agreement, not a single winner.
Multiple best indices: best_payer_idx, best_hospital_idx, and best_idx_no_impute are tracked separately alongside best_idx. These support downstream analysis of what each source individually would have chosen.
Gross charge type derivation: canonical_gross_charge_type is inferred from the rate_type column name — no separate lookup needed.
NULL rates: ROIDs where every column scores 0 get canonical_rate = NULL and canonical_rate_score = 0. The ROID row is preserved in the output — a NULL canonical rate represents a genuine coverage gap, not a processing error.
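
Two of the rules above, the score 7 override and the NULL rate case, can be sketched as a post-processing step in Python. This is a hedged illustration: the function name apply_special_rules is hypothetical, the check assumes that "7.x" means a score in the range 7 up to (but not including) 8, and setting the source to None in the NULL case is an assumption, since the text only specifies the rate and the score.

```python
import math


def apply_special_rules(best_score, rate, source):
    """Apply the score 7 and NULL rate rules to one selected candidate."""
    # NULL rule: every column scored 0, so no rate is selected.
    # The row itself is still returned (preserved), with a NULL (None) rate.
    if best_score == 0:
        return {
            "canonical_rate": None,
            "canonical_rate_score": 0,
            "canonical_rate_source": None,  # assumption: not specified in the text
        }

    # Score 7 rule: an integer part of 7 means both sources agreed,
    # so the label reflects that agreement rather than the winning column.
    if math.floor(best_score) == 7:
        source = "payer_hospital"

    return {
        "canonical_rate": rate,
        "canonical_rate_score": best_score,
        "canonical_rate_source": source,
    }


# Made-up sample values, for illustration only.
agreed = apply_special_rules(7.31, 200.0, "hospital")
print(agreed["canonical_rate_source"])  # prints: payer_hospital
```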

Determinism

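The algorithm is fully deterministic: given the same input arrays, it always produces the same canonical rate. The CDF and rate/1e8 tiebreakers make it very unlikely in practice that two rates share exactly the same score once the decimal component is included. If two candidates did tie exactly (for example, identical rate and CDF values), the result would still be stable as long as ARRAY_POSITION returns the first matching position, as it does in common SQL engines such as Trino and Spark.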
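
To illustrate how a decimal tiebreaker separates candidates that share the same integer score, here is a small Python sketch. The exact composition of the decimal component is not given in this section, so the formula in tiebreak_score (integer score plus a CDF-derived fraction plus rate/1e8) is purely an assumption, as are the function name and the sample values.

```python
def tiebreak_score(base_score, cdf_component, rate):
    # Hypothetical composition: integer tier + CDF-derived fraction + rate/1e8.
    return base_score + cdf_component + rate / 1e8


# Two made-up candidates with the same integer score and CDF fraction
# but slightly different rates.
candidates = [135.50, 135.75]
scores = [tiebreak_score(5, 0.42, r) for r in candidates]

# Same selection logic as the core algorithm: max, then 1-based first match.
best_idx = scores.index(max(scores)) + 1
print(scores[0] != scores[1], best_idx)  # prints: True 2
```

Under this assumed formula, the rate/1e8 term is small enough not to change the integer part of the score for realistic rates, yet large enough to distinguish two rates that differ by a few cents.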
