Methodology
We build off the analysis in Employer Aetna POS Rate Comparison, generalizing the approach to compare any two networks.
Methodology​
- Sample 100,000 rates that appear in both networks for the same provider-code combination. Filter to NULL modifiers and HCPCS/MS-DRG codes only.
- For each provider-code combination, compute the max rate for network A and the max rate for network B.
- Compute the absolute and percent difference between the max rates for network A and network B.
- Report the 10th, 20th, ..., 90th, 95th, and 99th percentiles of the absolute and percent differences.
- Store results in two tables containing pairwise comparisons:
tq_dev.internal_dev_csong_sandbox.network_similarity_analysis- contains overall results
tq_dev.internal_dev_csong_sandbox.network_similarity_analysis_by_class_and_code_type- contains results grouped by billing class and billing code type
- Use python to generate matrix view of the results.
Limitations
Nothing major, but some things to consider:
- We use a completely random sample. Revenue-weighted sampling may be more appropriate. It's possible that the rate differences occur where rates don't matter, or othat differences occur most where rates matter most.
- We only compare max rates. Some networks may have more variability within provider-network-code than others.
- This doesn't consider coverage. One network may cover more providers than another.
Matrix View​
| Aetna | OPEN ACCESS MANAGED CHOICE | OPEN ACCESS ELECT CHOICE | NATIONAL PPO | NATIONAL EPO |
|---|---|---|---|---|
| NATIONAL EPO | 1 | 1.0 | 0.95 | 1 |
| NATIONAL PPO | 0.95 | 0.90 | 1 | |
| OPEN ACCESS ELECT CHOICE | 1 | 1 | ||
| OPEN ACCESS MANAGED CHOICE | 1 |
Read as: "National EPO" and "OAMC" are 100% similar, while "National EPO" and "National PPO" are 95% similar.
View Python Code
# %%
df = pd.read_sql(f"""
SELECT
payer_id_a as payer_id,
product_network_label_a as network_a,
product_network_label_b as network_b,
CASE
WHEN max(max_diff_pct_p99) < 0.05 THEN '1.0'
WHEN max(max_diff_pct_p95) < 0.05 THEN '0.95'
WHEN max(max_diff_pct_p90) < 0.05 THEN '0.90'
WHEN max(max_diff_pct_p80) < 0.10 THEN '0.80'
WHEN max(max_diff_pct_p70) < 0.10 THEN '0.70'
WHEN max(max_diff_pct_p60) < 0.10 THEN '0.60'
WHEN max(max_diff_pct_p50) < 0.10 THEN '0.50'
WHEN max(max_diff_pct_p40) < 0.10 THEN '0.40'
WHEN max(max_diff_pct_p30) < 0.10 THEN '0.30'
WHEN max(max_diff_pct_p20) < 0.10 THEN '0.20'
WHEN max(max_diff_pct_p10) < 0.10 THEN '0.10'
ELSE '0'
END as similarity,
max(max_diff_pct_p90) as max_diff_pct_p90,
max(max_diff_pct_p95) as max_diff_pct_p95,
max(max_diff_pct_p99) as max_diff_pct_p99
FROM tq_dev.internal_dev_csong_sandbox.network_similarity_analysis
WHERE data_source = 'core_rates'
GROUP BY 1, 2, 3
""", con=trino_conn)
df
# %% [markdown]
"""
Comparison Matrices
"""
# %%
bucket_to_val = {"1.0": 1, "0.95": 0.95, "0.90": 0.9, "0.80": 0.8, "0.70": 0.7, "0.60": 0.6, "0.50": 0.5, "0.40": 0.4, "0.30": 0.3, "0.20": 0.2, "0.10": 0.1, "0": 0}
df["similarity_val"] = df["similarity"].map(bucket_to_val)
matrices = {}
for payer_id, g in df.groupby("payer_id", dropna=False):
nets = sorted(set(g["network_a"].dropna().unique()).union(set(g["network_b"].dropna().unique())))
if not nets:
continue
mat = g.pivot_table(
index="network_a",
columns="network_b",
values="similarity",
aggfunc="first"
)
# reverse column order for upper-left triangle
mat = mat.reindex(index=nets, columns=list(reversed(nets)))
# fill diagonal with 1.0
for n in nets:
mat.loc[n, n] = "1"
matrices[payer_id] = mat.astype(float)
print(f"\n### Payer {payer_id} — Network Similarity Matrix\n")
print(mat.fillna("").to_markdown())
Output Pairwise Comparison Tables​
View SQL Query
SELECT
data_source,
payer_id_a,
payer_id_b,
product_network_label_a,
product_network_label_b,
schema_month,
max_diff_p10,
max_diff_p20,
max_diff_p30,
max_diff_p40,
max_diff_p50,
max_diff_p60,
max_diff_p70,
max_diff_p80,
max_diff_p90,
max_diff_p95,
max_diff_p99,
max_diff_pct_p10,
max_diff_pct_p20,
max_diff_pct_p30,
max_diff_pct_p40,
max_diff_pct_p50,
max_diff_pct_p60,
max_diff_pct_p70,
max_diff_pct_p80,
max_diff_pct_p90,
max_diff_pct_p95,
max_diff_pct_p99,
run_timestamp
FROM tq_dev.internal_dev_csong_sandbox.network_similarity_analysis
WHERE data_source = 'core_rates'
-- OR use core_rates_physician_groups
-- WHERE data_source = 'core_rates'
LIMIT 5
| data_source | payer_id_a | payer_id_b | product_network_label_a | product_network_label_b | schema_month | max_diff_p10 | max_diff_p20 | max_diff_p30 | max_diff_p40 | max_diff_p50 | max_diff_p60 | max_diff_p70 | max_diff_p80 | max_diff_p90 | max_diff_p95 | max_diff_p99 | max_diff_pct_p10 | max_diff_pct_p20 | max_diff_pct_p30 | max_diff_pct_p40 | max_diff_pct_p50 | max_diff_pct_p60 | max_diff_pct_p70 | max_diff_pct_p80 | max_diff_pct_p90 | max_diff_pct_p95 | max_diff_pct_p99 | run_timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| core_rates | 7 | None | NATIONAL EPO | OPEN ACCESS MANAGED CHOICE | 2025_07 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2025-09-12 08:36:22 |
| core_rates | 7 | None | NATIONAL EPO | NATIONAL PPO | 2025_07 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.14153 | 2858.55 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0463063 | 0.889578 | 2025-09-12 08:37:29 |
| core_rates | 7 | None | NATIONAL EPO | OPEN ACCESS ELECT CHOICE | 2025_07 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2025-09-12 08:35:10 |
| core_rates | 7 | None | NATIONAL PPO | OPEN ACCESS ELECT CHOICE | 2025_07 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7.39286 | 2841.29 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0557815 | 1.04107 | 2025-09-12 08:34:15 |
| core_rates | 7 | None | OPEN ACCESS ELECT CHOICE | OPEN ACCESS MANAGED CHOICE | 2025_07 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2025-09-12 08:32:15 |
code
Visit network_similarity.py