Skip to main content
Version: 3.0

Expanding Scope

How to add payers, networks, billing codes, and providers to a CLD run. Scope is defined by the spine inputs — add an entity here and it flows through the entire pipeline.

Overview

Process:

  1. Identify what to add — Payer? Network under existing payer? New billing codes? New provider or provider type?
  2. Edit spine inputs — The right files depend on the entity type — see sub-pages for each type
  3. Re-run cld-utils (if codes) — Codeset tables are built by a separate DAG. Must re-run before the sub-DAG uses the new codes.
  4. Run CLD sub-DAG — New entities flow through the full pipeline automatically
info

Networks require the most coordination. A network is defined in network_mappings.py (single source of truth for the network spine) and in ppo_payer_keywords in variables_XXX.sql for hospital MRF matching. If the keywords don't match the normalized plan_name, hospital rates silently fail to map — no error is raised.


Adding a Payer

Adding a payer requires 3 changes in payer.sql and network configuration. Without network entries, the payer's rates can't be assigned to ROIDs.

Steps:

  1. Confirm MRF data — Does this payer have MRF data in core_rates? Contact Derivative Assets team to label MRF files if not yet labeled.
  2. Edit payer.sql (3 changes) — Add to payer_ids list, definitive_payer_id CASE block, and blue_payer IN clause (if BCBS) → airflow_dags/.../sql/reference/spines/payer.sql
  3. Add network mappings — Register payer networks in network_mappings.py and variables_XXX.sql
  4. (Optional) payer_chunks.sql — Only needed if the payer needs a custom chunk assignment in cld-utils.

3 Changes in payer.sql

Change 1: payer_ids Jinja2 list

The universe of payers is a hardcoded Jinja2 list at the top of payer.sql. Add the new payer ID as a quoted string:

{% set payer_ids = [
'169', '42', '76', '643', ...
'999', ← new payer (use quoted string)
] %}

Change 2: definitive_payer_id CASE block

WHEN payer_payer_id = 999 THEN ARRAY[123456]     -- one Definitive ID
-- or: ARRAY[123456, 789012] -- multiple entities
-- or: NULL -- no Definitive ID exists

Change 3: blue_payer IN clause (only add if this is a BCBS plan)

WHEN payer_payer_id IN (42, 43, 100, ..., 999)   ← only if BCBS
THEN True
warning

Payer without networks does nothing. A payer added to payer.sql but without network entries in network_mappings.py will have no network spine rows → no ROIDs → no rates in the output.

warning

Definitive IDs can be NULL. Not all payers have Definitive Healthcare records. NULL is valid and doesn't break anything — it just means the payer won't appear in Definitive-enriched rollups.

info

Hospital-MRF-only payers: Some payers (e.g., Exchange plans) only appear in hospital MRF files, not payer MRF. These still need payer.sql + network entries. They won't have payer-side rates but will have hospital-side rates mapped via the Plan Bridge.


Adding a Network

A network is defined in two places. network_mappings.py is the single source of truth for the network spine. ppo_payer_keywords in variables_XXX.sql controls hospital MRF matching and must be updated separately.

Step 1: network_mappings.py

Path: airflow_dags/.../core_licensable_data_sub_dag/utils/network_mappings.py

Each network type has its own Python list of tuples. Tuple format: (payer_id, [mrf_labels], network_name, [states])

  • payer_id — string payer ID matching payer.sql
  • mrf_labels — exact plan_name strings from payer MRF (core_rates). Used to assign network_id to payer MRF rates. Empty list for Exchange/MA networks (no payer MRF).
  • network_name — the canonical name; becomes network_name in the network spine and is hashed to produce network_id
  • states — list of state codes for state-specific networks, or [None] for national
List variableNetwork typeSQL template
PPO_NETWORK_MAPPINGSPPOnetwork_base.sql
HMO_NETWORK_MAPPINGSHMOnetwork_hmo.sql
NARROW_NETWORK_MAPPINGSNarrownetwork_narrow.sql
EXCHANGE_NETWORK_MAPPINGSExchangenetwork_exchange.sql
MEDICARE_ADVANTAGE_NETWORK_MAPPINGSMedicare Advantagenetwork_medicare_advantage.sql

Step 2: variables_XXX.sql

Path: airflow_dags/.../sql/raw/plan_bridge/variables_ppo.sql (and HMO/narrow/exchange/MA equivalents)

1. ppo_payer_keywords (critical)

Dict of payer_id → [keyword list]. Determines which hospital rates belong to this payer. Hospital MRF plan_name is normalized before matching: REPLACE(REPLACE(UPPER(plan_name),'_',' '),' ','') — all-caps, underscores→spaces, spaces removed. Write keywords in that form.

warning

payer_id must be present even with an empty list. If a payer_id is absent from ppo_payer_keywords, the plan bridge skips that payer entirely — none of their hospital rates will be matched. An empty list [] means "match via general/network keywords only."

Keyword normalization example:

  1. Hospital MRF plan_name: "Choice Plus"
  2. Normalized: "CHOICEPLUS"
  3. Keyword match: 'CHOICEPLUS' in ppo_payer_keywords[643] → match ✓
Result

Rate is attributed to payer 643 (UHC); next step routes it to the correct network.

2. ppo_product_network_mapping (critical)

Dict of payer_id → network_name (or list, or state-keyed dict). Once a rate is attributed to a payer, this mapping routes it to the specific CLD network. The network_name values here must match network_mappings.py exactly — same case, same spaces.

PatternExampleWhen to use
Single string403: 'Group PPO'Payer has one PPO network
List of strings643: ['Choice Plus', 'Select EPO']Payer has multiple PPO networks
State-keyed dict42: {'NY': 'NY PPO', 'CA': 'CA Blue Cross PPO', ...}Payer has different network names per state (e.g., BCBS affiliates)

3. ppo_exclusions / ppo_exclusion_exact_matches

Global exclusion keyword lists (normalized form). Plan names containing any of these are excluded from PPO matching — Medicare, Medicaid, OON, behavioral health, government programs, exchange plans, etc.

ppo_exclusion_exact_matches is stricter — the normalized plan_name must equal the string exactly (e.g., 'BH', 'HMO', 'MA').

4. ppo_payer_exclusions

Dict of (payer_id, state) → [exclusion keywords]. Payer- and state-specific exclusions for plan names that would otherwise match the global keywords but should be excluded for this payer.

warning

Silent failure mode. If keywords don't match the normalized hospital MRF plan_name, no error is raised — hospital rates are simply not assigned to the network. Always verify with a query on hospital_rates.plan_name after adding a network, then check tmp_ref_plan_bridge for matched rows.

warning

Empty mrf_labels for Exchange/MA. Exchange and Medicare Advantage networks don't use payer MRF data — their rates come entirely from hospital MRF. Set mrf_labels=[] for these network types in network_mappings.py.


Adding Billing Codes

Most codesets are managed in the cld-utils DAG, not the sub-DAG. Edit the codeset SQL, re-run cld-utils, then update params.py to point to the new table.

Steps:

  1. Identify the right codeset SQL — Match provider type + bill type to the correct cld-utils SQL file → cld-utils/sql/spines/
  2. Edit manual additions array — Append new billing codes to the ARRAY[...] in the SQL file
  3. Re-run cld-utils DAGcore_licensable_data_utils DAG builds new date-stamped table: e.g., outpatient_codeset_2026_04_11
  4. Update params.py pointer — Change the table name in params.py to point to the new date-stamped table
  5. Run CLD sub-DAGcode.sql reads from the updated table; new codes appear in code spine → rate object space

Codeset Files

Provider TypeBill Typecld-utils SQL FileManaged By
HospitalOutpatientsql/spines/outpatient_codeset.sqlcld-utils DAG
ASCOutpatientsql/spines/asc_codeset.sqlcld-utils DAG
LaboratoryProfessionalsql/spines/hospital_lab_codes.sqlcld-utils DAG
Physician GroupProfessionalbuild/physician_group_codeset/pg_codeset.pycld-utils Python builder
DialysisProfessionalsql/spines/dialysis_center_codeset.sqlcld-utils DAG
DMEProfessionalsql/spines/dme_codeset.sqlcld-utils DAG
Urgent CareProfessionalsql/spines/urgent_care_codeset.sqlcld-utils DAG

MS-DRG, APR-DRG, SSP line codes, J1 HCPCS, and Life Sciences drugs are baked directly into code.sql in the sub-DAG — they don't go through cld-utils.

params.py date-suffix pattern

Codeset tables in tq_intermediate.cld_utils have date-stamped names — they're versioned snapshots. After rebuilding a codeset, update the pointer in params.py:

# airflow_dags/.../params.py
OUTPATIENT_CODESET = "tq_intermediate.cld_utils.outpatient_codeset_2026_03_24"
ASC_CODESET = "tq_intermediate.cld_utils.asc_codeset_2026_03_23"
CLD_PHYSICIAN_GROUP_CODESET = "tq_intermediate.cld_utils.pg_codeset_2026_03_21"
LAB_HOSPITAL_AND_PGS_CODESET = "tq_intermediate.cld_utils.lab_template_codes_2026_03_06"
DME_CODESET = "tq_intermediate.cld_utils.dme_codeset_2026_03_14"
DIALYSIS_CENTER_CODESET = "tq_intermediate.cld_utils.dialysis_center_codeset_2026_03_14"
URGENT_CARE_CODESET = "tq_intermediate.cld_utils.urgent_care_codeset_2026_03_14"
warning

Stale pointer = stale codes. If you rebuild the codeset but forget to update params.py, the sub-DAG continues reading the old table. The new codes won't appear in the code spine.

Manual additions pattern

Each codeset SQL has a manual additions array at the bottom. Append new codes as quoted strings:

-- manual HCPCS additions
SELECT
'Outpatient' as bill_type,
'HCPCS' as billing_code_type,
billing_code
FROM (
SELECT billing_code
FROM unnest(array[
'29826', '22558', '43775',
'99999' ← append new code here
]) t(billing_code)
) AS manual_codes
info

/add-codes skill: For routine code additions, the /add-codes skill automates this process. It reads a spec xlsx, routes each row to the correct SQL file based on bill_type and provider_type, and inserts codes into the manual additions array.


Adding Providers

CLD draws providers from tq_production.spines.* tables. If a provider isn't in Spines, the Spines team must add it first.

Decision tree:

Does a Spine exist for this provider type?
❌ No → Contact Spines team to create a new spine table
✅ Yes ↓

Is this provider already in the spine?
❌ No → Contact Spines team to add the provider to the spine
✅ Yes ↓

Does this provider type already exist in CLD?
✅ Yes → No action needed; provider auto-appears in next CLD refresh
❌ No → CLD team must add the provider type
(edit code.sql, rate_object_space.sql, plausibility tables;
use /add-provider-type skill)

Why most providers require no action:

Provider spines are rebuilt from scratch each CLD run from tq_production.spines.* tables. Once a provider is added to Spines and their type is already in CLD, they automatically appear in the next run — no CLD edits needed.

provider_types parameter:

Default list: ['Hospital', 'ASC', 'Physician Group', 'Laboratory', 'Imaging Center', 'Dialysis Center', 'DME', 'Urgent Care']

For lookback runs (lookback_run=True), ASC, PG, Dialysis, DME, and Urgent Care are automatically excluded.

Manual PG additions:

Manual additions are used for Physician Groups that are:

  • Missing from Spines: New PGs not yet added by the Spines team
  • DPR+ missing providers: PGs flagged in the CLEAR_RATES_MISSING_PROVIDER report
  • Synthetic PGs: Virtual PGs with constructed identifiers (not in Spines at all)

See pg_identifiers.sql in the sub-DAG for the manual additions structure.

Provider Types

TypeSource Spine TableSpecial HandlingPlausibility Required?
Hospitalspines_provider_hospitalsSubtypes: Critical Access, Children's, Rehab, LTACNo
ASCspines_provider_ascsAmSurg specialty matchingYes (ASC plausibility)
Physician Groupspines_provider_physician_groupsAssociated TINs, manual additionsYes (PG plausibility)
Laboratoryspines_provider_laboratoriesnational_payer_coverage filter; Quest/LabCorp dedup by stateNo
Imaging Centerspines_provider_imaging_centersCan share codes with PGNo
Dialysis Centerspines_provider_dialysis_centersExcluded from lookback runsNo
DMEspines_provider_dmeExcluded from lookback runsYes (DME plausibility)
Urgent Carespines_provider_urgent_careExcluded from lookback runsNo