Skip to main content
Version: 3.0

Provider Spine

The provider spine aggregates NPIs into Clear Rates's internal provider_id universe. Each provider_id represents a distinct provider entity — typically a facility or practice group — and carries an array of associated NPIs, a provider type, and any special attributes (e.g., carrier_locality for MPFS pricing). The spine is built from per-type source tables in tq_production.spines, with a Jinja2 conditional UNION ALL that includes only the provider types active for a given DAG run.

Two-Layer CTE Pattern

Each provider type is built in two CTEs:

Layer 1 — NPI array CTE: Reads the per-type source table and aggregates NPIs into an array per facility. Applies any type-specific deduplication or filtering at this stage.

Layer 2 — Final provider CTE: Assigns provider_id, attaches metadata (state, provider_type, carrier_locality), and shapes the output to match the spine schema.

The two-layer pattern keeps dedup logic self-contained per type and makes it easy to add or modify a provider type without touching the others.

Jinja2 UNION ALL Pattern

Provider Spine — Conditional UNION ALL
-- provider.sql (simplified)
-- Only provider types in the 'provider_types' DAG param are included

WITH

{% if 'Hospital' in provider_types %}
hospital_npi AS (
SELECT
provider_id,
ARRAY_AGG(DISTINCT CAST(npi AS VARCHAR)) AS npi,
state,
'Hospital' AS provider_type,
NULL AS carrier_locality
FROM tq_production.spines.hospitals
GROUP BY provider_id, state
),
{% endif %}

{% if 'ASC' in provider_types %}
asc_npi AS (
SELECT
provider_id,
ARRAY_AGG(DISTINCT CAST(npi AS VARCHAR)) AS npi,
state,
'ASC' AS provider_type,
NULL AS carrier_locality
FROM tq_production.spines.ascs
GROUP BY provider_id, state
),
{% endif %}

{% if 'PhysicianGroup' in provider_types %}
pg_npi AS (
SELECT
provider_id,
ARRAY_AGG(DISTINCT CAST(npi AS VARCHAR)) AS npi,
state,
'PhysicianGroup' AS provider_type,
carrier_locality -- Used for MPFS pricing
FROM tq_production.spines.physician_groups
GROUP BY provider_id, state, carrier_locality
),
{% endif %}

-- ... Lab, Imaging, Dialysis, DME, UrgentCare follow same pattern

all_providers AS (
{% if 'Hospital' in provider_types %}
SELECT * FROM hospital_npi
{% if other_types_active %} UNION ALL {% endif %}
{% endif %}
{% if 'ASC' in provider_types %}
SELECT * FROM asc_npi
{% if more_types_active %} UNION ALL {% endif %}
{% endif %}
-- ... remaining types
)

SELECT
provider_id,
npi, -- ARRAY[VARCHAR]
state,
provider_type,
carrier_locality
FROM all_providers

Lab Filtering Walkthrough

Labs require special handling because reference labs (Quest, LabCorp) are available nationally but must be scoped by state.

Lab provider filtering — 3 stages
Stage 1 — core_rates filter: Lab NPIs are sourced from tq_production.spines.labs. The initial CTE filters to labs that appear in core_rates with at least one rate, ensuring only labs with actual MRF data enter the spine.
Stage 2 — national_payer_coverage override: Some national labs (Quest, LabCorp) have thin core_rates coverage in certain states. A national_payer_coverage override table re-admits these labs for states where they are known to operate, even if core_rates has no rates.
Stage 3 — Quest/LabCorp dedup by state: Quest and LabCorp have many NPIs per state. The lab CTE deduplicates to one row per (lab_entity, state), aggregating NPIs into the array. This prevents the provider spine from having hundreds of Quest rows that would fan out the ROS unnecessarily.
The result is one provider_id per (lab entity × state), with all NPIs for that entity in that state collected into the npi array.

carrier_locality

Physician Group rows carry a carrier_locality column — a two-part string (carrier_number + locality_code, e.g., '01112') used in MPFS pricing. When Clear Rates computes the Medicare benchmark for a PG ROID, it looks up the MPFS fee schedule using the carrier_locality from the provider spine row for that provider_id. PGs without a carrier_locality fall back to a national average MPFS rate.

Provider Types

TypeSource TableSpecial HandlingScale
Hospitaltq_production.spines.hospitalsNone~6,000
ASCtq_production.spines.ascsExcluded when lookback_run=True~5,500
PhysicianGrouptq_production.spines.physician_groupscarrier_locality for MPFS; excluded when lookback_run=True~35,000
Labtq_production.spines.labscore_rates filter + national override + Quest/LabCorp dedup; excluded when lookback_run=True~500
Imagingtq_production.spines.imagingExcluded when lookback_run=True~3,000
Dialysistq_production.spines.dialysisExcluded when lookback_run=True~7,000
DMEtq_production.spines.dmeExcluded when lookback_run=True~2,000
UrgentCaretq_production.spines.urgent_careExcluded when lookback_run=True~10,000
npi is an ARRAY, not a scalar

The npi column is ARRAY[VARCHAR], not a scalar. When joining from core_rates (which stores provider_npi as bigint):

WHERE cr.provider_npi = ANY(CAST(ps.npi AS ARRAY(BIGINT)))

A scalar equality join will fail or return no rows.

Default provider_types includes all 8 types

The default provider_types DAG param lists all 8 provider types. When lookback_run=True, only Hospital is included — the Jinja2 conditional block omits all other provider type CTEs and UNION ALL branches entirely.