Provider Spine
The provider spine aggregates NPIs into Clear Rates's internal provider_id universe. Each provider_id represents a distinct provider entity — typically a facility or practice group — and carries an array of associated NPIs, a provider type, and any special attributes (e.g., carrier_locality for MPFS pricing). The spine is built from per-type source tables in tq_production.spines, with a Jinja2 conditional UNION ALL that includes only the provider types active for a given DAG run.
Two-Layer CTE Pattern
Each provider type is built in two CTEs:
Layer 1 — NPI array CTE: Reads the per-type source table and aggregates NPIs into an array per facility. Applies any type-specific deduplication or filtering at this stage.
Layer 2 — Final provider CTE: Assigns provider_id, attaches metadata (state, provider_type, carrier_locality), and shapes the output to match the spine schema.
The two-layer pattern keeps dedup logic self-contained per type and makes it easy to add or modify a provider type without touching the others.
Jinja2 UNION ALL Pattern
-- provider.sql (simplified)
-- Only provider types in the 'provider_types' DAG param are included
WITH
{% if 'Hospital' in provider_types %}
hospital_npi AS (
SELECT
provider_id,
ARRAY_AGG(DISTINCT CAST(npi AS VARCHAR)) AS npi,
state,
'Hospital' AS provider_type,
NULL AS carrier_locality
FROM tq_production.spines.hospitals
GROUP BY provider_id, state
),
{% endif %}
{% if 'ASC' in provider_types %}
asc_npi AS (
SELECT
provider_id,
ARRAY_AGG(DISTINCT CAST(npi AS VARCHAR)) AS npi,
state,
'ASC' AS provider_type,
NULL AS carrier_locality
FROM tq_production.spines.ascs
GROUP BY provider_id, state
),
{% endif %}
{% if 'PhysicianGroup' in provider_types %}
pg_npi AS (
SELECT
provider_id,
ARRAY_AGG(DISTINCT CAST(npi AS VARCHAR)) AS npi,
state,
'PhysicianGroup' AS provider_type,
carrier_locality -- Used for MPFS pricing
FROM tq_production.spines.physician_groups
GROUP BY provider_id, state, carrier_locality
),
{% endif %}
-- ... Lab, Imaging, Dialysis, DME, UrgentCare follow same pattern
all_providers AS (
{% if 'Hospital' in provider_types %}
SELECT * FROM hospital_npi
{% if other_types_active %} UNION ALL {% endif %}
{% endif %}
{% if 'ASC' in provider_types %}
SELECT * FROM asc_npi
{% if more_types_active %} UNION ALL {% endif %}
{% endif %}
-- ... remaining types
)
SELECT
provider_id,
npi, -- ARRAY[VARCHAR]
state,
provider_type,
carrier_locality
FROM all_providers
Lab Filtering Walkthrough
Labs require special handling because reference labs (Quest, LabCorp) are available nationally but must be scoped by state.
tq_production.spines.labs. The initial CTE filters to labs that appear in core_rates with at least one rate, ensuring only labs with actual MRF data enter the spine.national_payer_coverage override table re-admits these labs for states where they are known to operate, even if core_rates has no rates.carrier_locality
Physician Group rows carry a carrier_locality column — a two-part string (carrier_number + locality_code, e.g., '01112') used in MPFS pricing. When Clear Rates computes the Medicare benchmark for a PG ROID, it looks up the MPFS fee schedule using the carrier_locality from the provider spine row for that provider_id. PGs without a carrier_locality fall back to a national average MPFS rate.
Provider Types
| Type | Source Table | Special Handling | Scale |
|---|---|---|---|
| Hospital | tq_production.spines.hospitals | None | ~6,000 |
| ASC | tq_production.spines.ascs | Excluded when lookback_run=True | ~5,500 |
| PhysicianGroup | tq_production.spines.physician_groups | carrier_locality for MPFS; excluded when lookback_run=True | ~35,000 |
| Lab | tq_production.spines.labs | core_rates filter + national override + Quest/LabCorp dedup; excluded when lookback_run=True | ~500 |
| Imaging | tq_production.spines.imaging | Excluded when lookback_run=True | ~3,000 |
| Dialysis | tq_production.spines.dialysis | Excluded when lookback_run=True | ~7,000 |
| DME | tq_production.spines.dme | Excluded when lookback_run=True | ~2,000 |
| UrgentCare | tq_production.spines.urgent_care | Excluded when lookback_run=True | ~10,000 |
The npi column is ARRAY[VARCHAR], not a scalar. When joining from core_rates (which stores provider_npi as bigint):
WHERE cr.provider_npi = ANY(CAST(ps.npi AS ARRAY(BIGINT)))
A scalar equality join will fail or return no rows.
The default provider_types DAG param lists all 8 provider types. When lookback_run=True, only Hospital is included — the Jinja2 conditional block omits all other provider type CTEs and UNION ALL branches entirely.