Spines are Stage 1 of the Clear Rates pipeline. They build canonical reference tables for payers, networks, providers, and billing codes — the four dimensions that define the Rate Object Space. Every downstream stage joins against spine tables to resolve identifiers and scope what gets priced.
A payer does not exist in Clear Rates until it appears in the payer spine. A provider type is not priced until its NPIs are enrolled in the provider spine. The spines collectively define the boundaries of Clear Rates's universe.
Three independent spines built in parallel. Payer spine reads a hardcoded Jinja2 list (~90 payers). Code spine unions 11 billing code sources. Network spine builds from network_mappings.py via 7 SQL files.
→ spines_payer / spines_code / spines_network
2
Plausibility Tables
asc_plausibility and dme_plausibility cross-reference ASC and DME code sets against provider types. code_plausibility joins all three spines to produce valid (code, provider_type) combinations for the ROS.
Reads per-type source tables from tq_production.spines. Aggregates NPIs into arrays, assigns internal provider_id. Conditionally includes provider types based on lookback_run param.
→ spines_provider
4
Provider Spine Manual Additions
Appends manually curated provider rows — edge cases not captured by automated source tables (e.g., recently opened facilities, corrections).
→ spines_provider (updated)
5
Spine Validations (blocking)
Runs a suite of data quality checks across all spine tables. Any failure blocks the DAG — the Rate Object Space cannot be built on bad spine data.
Parallel first wave: payer_spine, code_spine, and network_spine run simultaneously — they have no mutual dependencies.
Plausibility depends on all three: asc_plausibility and code_plausibility wait for all three spines before running, even though technically only code_spine is needed. The actual Airflow edge is broader than the comment in the SQL suggests.
Provider spine is last: provider_spine waits for code_plausibility and dme_plausibility before running, then manual_additions appends to it.
Validations gate the ROS: spine_validations must pass before rate_object_space is built.
Spines gate scope — adding a payer to payer.sql makes its rates flow through the full pipeline. Removing a payer removes all its ROIDs from the output. The spine files are the primary way Clear Rates's scope is managed.
Provider ID vs NPI — provider_id is Clear Rates's internal identifier assigned in the provider spine. Each provider_id corresponds to one or more NPIs stored as an ARRAY. Core rates joins against this array, not a scalar NPI.
bypass_validations — a DAG param that skips all validation task groups. Defaults to False. Never bypass in production without explicit approval — failing validations indicate real data problems that will propagate downstream.
lookback_run — when True, the provider spine excludes ASC, PG, Dialysis, DME, and Urgent Care provider types. Only Hospital providers are included. Used for historical backfill runs where non-hospital coverage is not needed.
Check the actual task graph, not the comments
The sub-DAG code has a comment "asc_plausibility only needs code_spines" — but the actual Airflow edge waits for all 3 spines. Always check the actual task graph in __init__.py over comments in individual SQL files.