Expanding Scope
How to add payers, networks, billing codes, and providers to a CLD run. Scope is defined by the spine inputs — add an entity here and it flows through the entire pipeline.
Overview
Process:
- Identify what to add — Payer? Network under existing payer? New billing codes? New provider or provider type?
- Edit spine inputs — The right files depend on the entity type — see sub-pages for each type
- Re-run cld-utils (if codes) — Codeset tables are built by a separate DAG. Must re-run before the sub-DAG uses the new codes.
- Run CLD sub-DAG — New entities flow through the full pipeline automatically
Networks require the most coordination. A network is defined in network_mappings.py (single source of truth for the network spine) and in ppo_payer_keywords in variables_XXX.sql for hospital MRF matching. If the keywords don't match the normalized plan_name, hospital rates silently fail to map — no error is raised.
Adding a Payer
Adding a payer requires 3 changes in payer.sql and network configuration. Without network entries, the payer's rates can't be assigned to ROIDs.
Steps:
- Confirm MRF data — Does this payer have MRF data in
core_rates? Contact Derivative Assets team to label MRF files if not yet labeled. - Edit
payer.sql(3 changes) — Add topayer_idslist,definitive_payer_idCASE block, andblue_payerIN clause (if BCBS) →airflow_dags/.../sql/reference/spines/payer.sql - Add network mappings — Register payer networks in
network_mappings.pyandvariables_XXX.sql - (Optional)
payer_chunks.sql— Only needed if the payer needs a custom chunk assignment in cld-utils.
3 Changes in payer.sql
Change 1: payer_ids Jinja2 list
The universe of payers is a hardcoded Jinja2 list at the top of payer.sql. Add the new payer ID as a quoted string:
{% set payer_ids = [
'169', '42', '76', '643', ...
'999', ← new payer (use quoted string)
] %}
Change 2: definitive_payer_id CASE block
WHEN payer_payer_id = 999 THEN ARRAY[123456] -- one Definitive ID
-- or: ARRAY[123456, 789012] -- multiple entities
-- or: NULL -- no Definitive ID exists
Change 3: blue_payer IN clause (only add if this is a BCBS plan)
WHEN payer_payer_id IN (42, 43, 100, ..., 999) ← only if BCBS
THEN True
Payer without networks does nothing. A payer added to payer.sql but without network entries in network_mappings.py will have no network spine rows → no ROIDs → no rates in the output.
Definitive IDs can be NULL. Not all payers have Definitive Healthcare records. NULL is valid and doesn't break anything — it just means the payer won't appear in Definitive-enriched rollups.
Hospital-MRF-only payers: Some payers (e.g., Exchange plans) only appear in hospital MRF files, not payer MRF. These still need payer.sql + network entries. They won't have payer-side rates but will have hospital-side rates mapped via the Plan Bridge.
Adding a Network
A network is defined in two places. network_mappings.py is the single source of truth for the network spine. ppo_payer_keywords in variables_XXX.sql controls hospital MRF matching and must be updated separately.
Step 1: network_mappings.py
Path: airflow_dags/.../core_licensable_data_sub_dag/utils/network_mappings.py
Each network type has its own Python list of tuples. Tuple format: (payer_id, [mrf_labels], network_name, [states])
payer_id— string payer ID matchingpayer.sqlmrf_labels— exactplan_namestrings from payer MRF (core_rates). Used to assignnetwork_idto payer MRF rates. Empty list for Exchange/MA networks (no payer MRF).network_name— the canonical name; becomesnetwork_namein the network spine and is hashed to producenetwork_idstates— list of state codes for state-specific networks, or[None]for national
| List variable | Network type | SQL template |
|---|---|---|
PPO_NETWORK_MAPPINGS | PPO | network_base.sql |
HMO_NETWORK_MAPPINGS | HMO | network_hmo.sql |
NARROW_NETWORK_MAPPINGS | Narrow | network_narrow.sql |
EXCHANGE_NETWORK_MAPPINGS | Exchange | network_exchange.sql |
MEDICARE_ADVANTAGE_NETWORK_MAPPINGS | Medicare Advantage | network_medicare_advantage.sql |
Step 2: variables_XXX.sql
Path: airflow_dags/.../sql/raw/plan_bridge/variables_ppo.sql (and HMO/narrow/exchange/MA equivalents)
1. ppo_payer_keywords (critical)
Dict of payer_id → [keyword list]. Determines which hospital rates belong to this payer. Hospital MRF plan_name is normalized before matching: REPLACE(REPLACE(UPPER(plan_name),'_',' '),' ','') — all-caps, underscores→spaces, spaces removed. Write keywords in that form.
payer_id must be present even with an empty list. If a payer_id is absent from ppo_payer_keywords, the plan bridge skips that payer entirely — none of their hospital rates will be matched. An empty list [] means "match via general/network keywords only."
Keyword normalization example:
- Hospital MRF
plan_name:"Choice Plus" - Normalized:
"CHOICEPLUS" - Keyword match:
'CHOICEPLUS'inppo_payer_keywords[643]→ match ✓
Rate is attributed to payer 643 (UHC); next step routes it to the correct network.
2. ppo_product_network_mapping (critical)
Dict of payer_id → network_name (or list, or state-keyed dict). Once a rate is attributed to a payer, this mapping routes it to the specific CLD network. The network_name values here must match network_mappings.py exactly — same case, same spaces.
| Pattern | Example | When to use |
|---|---|---|
| Single string | 403: 'Group PPO' | Payer has one PPO network |
| List of strings | 643: ['Choice Plus', 'Select EPO'] | Payer has multiple PPO networks |
| State-keyed dict | 42: {'NY': 'NY PPO', 'CA': 'CA Blue Cross PPO', ...} | Payer has different network names per state (e.g., BCBS affiliates) |
3. ppo_exclusions / ppo_exclusion_exact_matches
Global exclusion keyword lists (normalized form). Plan names containing any of these are excluded from PPO matching — Medicare, Medicaid, OON, behavioral health, government programs, exchange plans, etc.
ppo_exclusion_exact_matches is stricter — the normalized plan_name must equal the string exactly (e.g., 'BH', 'HMO', 'MA').
4. ppo_payer_exclusions
Dict of (payer_id, state) → [exclusion keywords]. Payer- and state-specific exclusions for plan names that would otherwise match the global keywords but should be excluded for this payer.
Silent failure mode. If keywords don't match the normalized hospital MRF plan_name, no error is raised — hospital rates are simply not assigned to the network. Always verify with a query on hospital_rates.plan_name after adding a network, then check tmp_ref_plan_bridge for matched rows.
Empty mrf_labels for Exchange/MA. Exchange and Medicare Advantage networks don't use payer MRF data — their rates come entirely from hospital MRF. Set mrf_labels=[] for these network types in network_mappings.py.
Adding Billing Codes
Most codesets are managed in the cld-utils DAG, not the sub-DAG. Edit the codeset SQL, re-run cld-utils, then update params.py to point to the new table.
Steps:
- Identify the right codeset SQL — Match provider type + bill type to the correct cld-utils SQL file →
cld-utils/sql/spines/ - Edit manual additions array — Append new billing codes to the
ARRAY[...]in the SQL file - Re-run cld-utils DAG —
core_licensable_data_utilsDAG builds new date-stamped table: e.g.,outpatient_codeset_2026_04_11 - Update
params.pypointer — Change the table name inparams.pyto point to the new date-stamped table - Run CLD sub-DAG —
code.sqlreads from the updated table; new codes appear in code spine → rate object space
Codeset Files
| Provider Type | Bill Type | cld-utils SQL File | Managed By |
|---|---|---|---|
| Hospital | Outpatient | sql/spines/outpatient_codeset.sql | cld-utils DAG |
| ASC | Outpatient | sql/spines/asc_codeset.sql | cld-utils DAG |
| Laboratory | Professional | sql/spines/hospital_lab_codes.sql | cld-utils DAG |
| Physician Group | Professional | build/physician_group_codeset/pg_codeset.py | cld-utils Python builder |
| Dialysis | Professional | sql/spines/dialysis_center_codeset.sql | cld-utils DAG |
| DME | Professional | sql/spines/dme_codeset.sql | cld-utils DAG |
| Urgent Care | Professional | sql/spines/urgent_care_codeset.sql | cld-utils DAG |
MS-DRG, APR-DRG, SSP line codes, J1 HCPCS, and Life Sciences drugs are baked directly into code.sql in the sub-DAG — they don't go through cld-utils.
params.py date-suffix pattern
Codeset tables in tq_intermediate.cld_utils have date-stamped names — they're versioned snapshots. After rebuilding a codeset, update the pointer in params.py:
# airflow_dags/.../params.py
OUTPATIENT_CODESET = "tq_intermediate.cld_utils.outpatient_codeset_2026_03_24"
ASC_CODESET = "tq_intermediate.cld_utils.asc_codeset_2026_03_23"
CLD_PHYSICIAN_GROUP_CODESET = "tq_intermediate.cld_utils.pg_codeset_2026_03_21"
LAB_HOSPITAL_AND_PGS_CODESET = "tq_intermediate.cld_utils.lab_template_codes_2026_03_06"
DME_CODESET = "tq_intermediate.cld_utils.dme_codeset_2026_03_14"
DIALYSIS_CENTER_CODESET = "tq_intermediate.cld_utils.dialysis_center_codeset_2026_03_14"
URGENT_CARE_CODESET = "tq_intermediate.cld_utils.urgent_care_codeset_2026_03_14"
Stale pointer = stale codes. If you rebuild the codeset but forget to update params.py, the sub-DAG continues reading the old table. The new codes won't appear in the code spine.
Manual additions pattern
Each codeset SQL has a manual additions array at the bottom. Append new codes as quoted strings:
-- manual HCPCS additions
SELECT
'Outpatient' as bill_type,
'HCPCS' as billing_code_type,
billing_code
FROM (
SELECT billing_code
FROM unnest(array[
'29826', '22558', '43775',
'99999' ← append new code here
]) t(billing_code)
) AS manual_codes
/add-codes skill: For routine code additions, the /add-codes skill automates this process. It reads a spec xlsx, routes each row to the correct SQL file based on bill_type and provider_type, and inserts codes into the manual additions array.
Adding Providers
CLD draws providers from tq_production.spines.* tables. If a provider isn't in Spines, the Spines team must add it first.
Decision tree:
Does a Spine exist for this provider type?
❌ No → Contact Spines team to create a new spine table
✅ Yes ↓
Is this provider already in the spine?
❌ No → Contact Spines team to add the provider to the spine
✅ Yes ↓
Does this provider type already exist in CLD?
✅ Yes → No action needed; provider auto-appears in next CLD refresh
❌ No → CLD team must add the provider type
(edit code.sql, rate_object_space.sql, plausibility tables;
use /add-provider-type skill)
Why most providers require no action:
Provider spines are rebuilt from scratch each CLD run from tq_production.spines.* tables. Once a provider is added to Spines and their type is already in CLD, they automatically appear in the next run — no CLD edits needed.
provider_types parameter:
Default list: ['Hospital', 'ASC', 'Physician Group', 'Laboratory', 'Imaging Center', 'Dialysis Center', 'DME', 'Urgent Care']
For lookback runs (lookback_run=True), ASC, PG, Dialysis, DME, and Urgent Care are automatically excluded.
Manual PG additions:
Manual additions are used for Physician Groups that are:
- Missing from Spines: New PGs not yet added by the Spines team
- DPR+ missing providers: PGs flagged in the CLEAR_RATES_MISSING_PROVIDER report
- Synthetic PGs: Virtual PGs with constructed identifiers (not in Spines at all)
See pg_identifiers.sql in the sub-DAG for the manual additions structure.
Provider Types
| Type | Source Spine Table | Special Handling | Plausibility Required? |
|---|---|---|---|
| Hospital | spines_provider_hospitals | Subtypes: Critical Access, Children's, Rehab, LTAC | No |
| ASC | spines_provider_ascs | AmSurg specialty matching | Yes (ASC plausibility) |
| Physician Group | spines_provider_physician_groups | Associated TINs, manual additions | Yes (PG plausibility) |
| Laboratory | spines_provider_laboratories | national_payer_coverage filter; Quest/LabCorp dedup by state | No |
| Imaging Center | spines_provider_imaging_centers | Can share codes with PG | No |
| Dialysis Center | spines_provider_dialysis_centers | Excluded from lookback runs | No |
| DME | spines_provider_dme | Excluded from lookback runs | Yes (DME plausibility) |
| Urgent Care | spines_provider_urgent_care | Excluded from lookback runs | No |