Code Crosswalks
Introduction
This document outlines our approach for constructing a description-based crosswalk between MS-DRGs and APR-DRGs.
Stats / Coverage
- Mapped MS-DRGs: 633 (83% of total MS-DRGs)
- Mapped base APR-DRGs (pre-SOI): 258 (82% of total base APR-DRGs)
- Mapped APR-DRGs (post-SOI): 1024 (77% of total post-SOI APR-DRGs)
- Count of MS-DRGs that map to 1, 2, 3, 4, etc. APR-DRGs (pre-SOI)
  - e.g. in the table below, 413 MS-DRGs map to 1 pre-SOI APR-DRG
  - e.g. 154 MS-DRGs map to 2 pre-SOI APR-DRGs

| Num. APR-DRGs Mapped to | Total MS-DRGs |
| --- | --- |
| 1 | 413 |
| 2 | 154 |
| 3 | 45 |
| 4 | 14 |
| 5 | 6 |
| 6 | 1 |

- Count of APR-DRGs (pre-SOI) that map to 1, 2, 3, 4, etc. MS-DRGs
  - e.g. in the table below, 30 pre-SOI APR-DRGs map to 1 MS-DRG

| Num. MS-DRGs Mapped to | Total APR-DRGs |
| --- | --- |
| 1 | 30 |
| 2 | 51 |
| 3 | 104 |
| 4 | 21 |
| 5 | 22 |
| 6 | 19 |
| 8 | 6 |
| 9 | 1 |
| 11 | 3 |
| 12 | 1 |
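Distribution tables like the two above can be derived from a flat list of crosswalk pairs. The following is a minimal sketch, assuming the crosswalk is available as (MS-DRG, base APR-DRG) tuples; the function name `mapping_distribution` and the toy codes are illustrative and are not real crosswalk data.

```python
from collections import Counter, defaultdict

def mapping_distribution(pairs, key_index):
    """Count how many codes on one side map to 1, 2, 3, ... codes on the other.

    pairs: iterable of (ms_drg, apr_drg) tuples.
    key_index: 0 to group by MS-DRG, 1 to group by APR-DRG.
    Returns {number_of_partner_codes: number_of_codes_with_that_many_partners}.
    """
    partners = defaultdict(set)
    for pair in pairs:
        # Collect the distinct partner codes for each code on the chosen side.
        partners[pair[key_index]].add(pair[1 - key_index])
    return dict(Counter(len(p) for p in partners.values()))

# Toy pairs for illustration only (hypothetical codes).
toy_pairs = [("A1", "X"), ("A1", "Y"), ("A2", "X"), ("A3", "Z")]
ms_to_apr = mapping_distribution(toy_pairs, key_index=0)
apr_to_ms = mapping_distribution(toy_pairs, key_index=1)
```

A useful sanity check on the real data is that the counts in each table sum to the number of mapped codes on that side (633 MS-DRGs and 258 base APR-DRGs).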
Known Gaps / Areas for Improvement
- Newborn Services
  - APR-DRGs classify neonates based on age (under 29 days), regardless of principal diagnosis. In contrast, MS-DRGs rely on the principal diagnosis to place patients in neonatal categories. Because some "neonatal" diagnoses can also apply to older patients, it's difficult to match APR-DRGs (age-driven) to MS-DRGs (diagnosis-driven). We've tagged these services as ones to revisit and re-assign manually.
- Orthopedic Services
  - Given the different structures of APR-DRGs and MS-DRGs for MSK services (body parts, complexity, elective vs. non-elective), we've bookmarked these services as next-up for a more in-depth manual review and assignment of crosswalk items.
- Psych Services
  - Our methodology below falls short for psych services, where descriptions can have similar meanings without using similar language.
Crosswalk Logic
- Gather Source Data Files
  - Utilize the most recent official descriptors for both MS-DRGs and APR-DRGs.
  - Ensure each file passes data integrity checks (e.g., consistent row counts, valid code formats, matching disclaimers to official CMS/3M lists).
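The row-count and code-format checks can be sketched as below. This is a minimal illustration, not our production validation: it assumes each descriptor file has been loaded into a list of dicts with `code` and `description` keys, and that codes are three-digit strings. The function name, field names, and sample rows are hypothetical.

```python
import re

# Assumption: codes are stored as three-digit strings (e.g. "001").
CODE_PATTERN = re.compile(r"\d{3}")

def check_descriptor_rows(rows, expected_count):
    """Return a list of human-readable problems found in a descriptor file."""
    problems = []
    if len(rows) != expected_count:
        problems.append(f"expected {expected_count} rows, found {len(rows)}")
    seen = set()
    for row in rows:
        code = row["code"]
        if not CODE_PATTERN.fullmatch(code):
            problems.append(f"invalid code format: {code!r}")
        if code in seen:
            problems.append(f"duplicate code: {code!r}")
        seen.add(code)
        if not row["description"].strip():
            problems.append(f"empty description for code {code!r}")
    return problems

# Hypothetical rows for illustration only.
sample_rows = [
    {"code": "001", "description": "Example description A"},
    {"code": "12", "description": ""},
]
sample_problems = check_descriptor_rows(sample_rows, expected_count=2)
```

An empty result means the file passed these particular checks; comparing against the official CMS/3M lists would be a separate step.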
- Create an embedding-based crosswalk
  - Generate Embeddings for all MS-DRGs and APR-DRGs
    - We use an AI model (OpenAI's "text-embedding-ada-002") that reads each DRG description and turns it into a vector: a long list of numbers that captures the essential meaning of the text.
    - For example, if two descriptions are very similar in meaning (e.g., "Craniotomy with major comorbidities" vs. "Craniotomy procedure with significant secondary conditions"), their embeddings will look similar.
    - To mitigate purely "generic text" interpretations, we supplement embedding generation with domain-specific references (e.g., synonyms, common abbreviations, and expansions) to help the model more accurately capture nuanced clinical language.
  - Correlate DRG Embeddings with Cosine Similarity
    - Cosine similarity measures how closely two embeddings point in the same direction. A higher score means that two embeddings (and, in effect, two DRG descriptions) are closer in meaning. A lower score means that they are farther apart in meaning.
    - Using a cosine similarity function, we add up to 10 MS-DRG <> APR-DRG relationships per MS-DRG to the embedding-based crosswalk, keeping only those that meet our minimum similarity threshold.
    - In practice, almost all MS-DRGs correspond to fewer than 10 APR-DRGs; however, we allow up to 10 in the raw output to capture edge cases or borderline mappings that might require manual or secondary validation. Subsequent steps (described below) trim this list significantly.
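The similarity step can be illustrated with a small, dependency-free sketch. It assumes the embeddings have already been generated and are held as plain lists of floats; the two-dimensional toy vectors, the `0.5` threshold, and the names `cosine_similarity` and `top_matches` are illustrative assumptions, not our production values.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_matches(ms_vec, apr_vecs, min_similarity, max_matches=10):
    """Return up to max_matches (apr_drg, score) pairs at or above the threshold,
    ordered from most to least similar."""
    scored = [(apr, cosine_similarity(ms_vec, vec)) for apr, vec in apr_vecs.items()]
    kept = [(apr, score) for apr, score in scored if score >= min_similarity]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_matches]

# Toy 2-D vectors standing in for real embeddings (hypothetical codes).
ms_vec = [1.0, 0.0]
apr_vecs = {"APR-A": [1.0, 0.0], "APR-B": [1.0, 1.0], "APR-C": [0.0, 1.0]}
matches = top_matches(ms_vec, apr_vecs, min_similarity=0.5)
```

Here `APR-C` is orthogonal to the MS-DRG vector (similarity 0) and is dropped, while `APR-A` and `APR-B` are kept in descending order of similarity.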
- Create Validation Thresholds in Crosswalk Base
  The initial embedding-based crosswalk is intentionally broad. This step implements two additional filters that refine it to create a "final crosswalk":
  - Overlapping Words
    - We compare the number of matching or closely related words in each MS-DRG and APR-DRG description. This helps identify pairs where the language aligns closely beyond the purely numeric similarity score (and helps us eliminate many records where the numeric cosine similarity gives a false positive).
    - Synonyms, Abbreviations and Noise: To ensure fair comparisons, we maintain a controlled vocabulary that handles common clinical abbreviations and synonymous terms (e.g., "severe" vs. "major"). We also maintain a list of "noisy" words that have no contextual meaning and shouldn't count as matches (e.g. of, and, etc.).
    - We keep the record(s) with the most overlapping words. If no record has any overlapping words, we keep the record(s) with the highest cosine similarity score in the embedding-based crosswalk.
  - MGB-Specific Validation & Correlation
    - Utilizing FY22 MGB claims, we create an MGB-reference crosswalk to evaluate the accuracy of our embedding-based crosswalk. We create this reference crosswalk by:
      - Summing the total number of claims dual-coded to each MS-DRG and APR-DRG combo (utilizing APR-DRG v40)
      - Limiting the dataset to ≥ 20 claims per MS-DRG: Each MS-DRG must be dual-coded with an APR-DRG at least 20 times in order to be eligible
      - Limiting the dataset to ≥ 15% per MS-DRG <> APR-DRG combo: If an MS-DRG has at least 20 claims, we only include APR-DRG mappings that are billed across 15% or more of total claims billed for that MS-DRG
    - Records in the embedding crosswalk that are validated by this reference crosswalk are flagged. Embedding crosswalk mappings that are not validated by the reference crosswalk are manually reviewed for accuracy.
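The overlapping-words filter described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `SYNONYMS` and `NOISE_WORDS` entries are a tiny hypothetical stand-in for the controlled vocabulary, candidates are assumed to be (APR-DRG code, description, cosine score) tuples, and the codes and descriptions are made up.

```python
import re

# Hypothetical controlled vocabulary: variant -> canonical term.
SYNONYMS = {"severe": "major"}
# Hypothetical noise list: words that should never count as a match.
NOISE_WORDS = {"of", "and", "with", "the"}

def normalize(description):
    """Lowercase, tokenize, drop noise words, and map synonyms to one term."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return {SYNONYMS.get(w, w) for w in words if w not in NOISE_WORDS}

def overlap_count(ms_desc, apr_desc):
    """Number of distinct normalized words shared by the two descriptions."""
    return len(normalize(ms_desc) & normalize(apr_desc))

def select_best(ms_desc, candidates):
    """Keep the candidate(s) with the most overlapping words; if none overlap,
    fall back to the candidate(s) with the highest cosine score."""
    scored = [(code, overlap_count(ms_desc, desc), score)
              for code, desc, score in candidates]
    best_overlap = max((o for _, o, _ in scored), default=0)
    if best_overlap > 0:
        return [code for code, o, _ in scored if o == best_overlap]
    best_score = max((s for _, _, s in scored), default=None)
    return [code for code, _, s in scored if s == best_score]

candidates = [
    ("APR-A", "Craniotomy procedure with severe secondary conditions", 0.80),
    ("APR-B", "Spinal procedure", 0.85),
]
chosen = select_best("Craniotomy with major comorbidities", candidates)
```

In this toy case `APR-B` has the higher cosine score but shares no words with the MS-DRG description, so the word-overlap filter selects `APR-A` ("craniotomy" and "major", the latter via the synonym mapping).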
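The claims-based reference crosswalk can be sketched in a few lines. This is a minimal illustration, assuming each dual-coded claim is reduced to an (MS-DRG, APR-DRG) pair and reading the 20-claim rule as a minimum on total claims per MS-DRG; the function names and toy claims are hypothetical.

```python
from collections import Counter, defaultdict

def build_reference_crosswalk(claims, min_claims=20, min_share=0.15):
    """claims: iterable of (ms_drg, apr_drg) pairs, one per dual-coded claim.

    Returns {ms_drg: set of apr_drgs} for MS-DRGs with at least min_claims
    claims, keeping only APR-DRGs that account for at least min_share of
    that MS-DRG's claims.
    """
    pair_counts = Counter(claims)
    ms_totals = Counter()
    for (ms, _apr), n in pair_counts.items():
        ms_totals[ms] += n
    reference = defaultdict(set)
    for (ms, apr), n in pair_counts.items():
        total = ms_totals[ms]
        if total >= min_claims and n / total >= min_share:
            reference[ms].add(apr)
    return dict(reference)

def is_validated(ms, apr, reference):
    """True when an embedding-crosswalk pair also appears in the reference."""
    return apr in reference.get(ms, set())

# Toy claims (hypothetical codes): MS1 has 20 claims, MS2 has only 5.
toy_claims = ([("MS1", "A")] * 15 + [("MS1", "B")] * 4 +
              [("MS1", "C")] * 1 + [("MS2", "A")] * 5)
reference = build_reference_crosswalk(toy_claims)
```

`MS2` is excluded for having fewer than 20 claims, and `MS1` keeps `A` (75%) and `B` (20%) but not `C` (5%). Pairs for which `is_validated` returns `False` would go to manual review.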
- DRG Family Alignment
  - An MS-DRG family refers to DRGs sharing the same core clinical intent but varying by severity (e.g., "with CC," "with MCC," or "no CC/MCC").
  - Each MS-DRG that belongs to the same family must be mapped to the same set of APR-DRGs.
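One way to enforce the family rule is to give every member of a family the union of the APR-DRGs mapped to any member. The union is an assumption made for this sketch (the rule above only requires that the sets be identical), and the `family_of` lookup, function name, and toy codes are hypothetical.

```python
from collections import defaultdict

def align_families(crosswalk, family_of):
    """crosswalk: {ms_drg: set of apr_drgs}; family_of: {ms_drg: family id}.

    Returns a new crosswalk in which every MS-DRG in a family maps to the
    union of the APR-DRGs mapped to any member of that family.
    """
    family_targets = defaultdict(set)
    for ms, aprs in crosswalk.items():
        family_targets[family_of[ms]] |= aprs
    # Copy each set so later edits to one MS-DRG do not leak to its siblings.
    return {ms: set(family_targets[family_of[ms]]) for ms in crosswalk}

# Toy data (hypothetical codes): M1 and M2 share a family, M3 is alone.
toy_crosswalk = {"M1": {"A"}, "M2": {"A", "B"}, "M3": {"C"}}
toy_families = {"M1": "F1", "M2": "F1", "M3": "F2"}
aligned = align_families(toy_crosswalk, toy_families)
```

After alignment, `M1` and `M2` both map to `{"A", "B"}` while `M3` is unchanged.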
- Manual Review
  - Crosswalk entries passing all automated checks undergo manual review by internal team members (coders, billers and medical auditors) who validate that both DRG terms are applicable to the same set of procedures.
- Map Severity of Illnesses
  - Typically, an MS-DRG with "no CC/MCC" is closer to a lower Severity of Illness (SOI = 1), while an MS-DRG with "MCC" is closer to a higher SOI (up to SOI = 4).
  - Our initial mapping objective maps the following MS-DRG types to the following APR-DRG SOIs:
    - MS-DRG "No CC/MCC" → APR-DRG with SOI = 1
    - MS-DRG "with CC" → APR-DRG with SOI = 2 or 3 (depending on clinical alignment)
    - MS-DRG "with MCC" → APR-DRG with SOI = 4
  - Note: not all MS-DRG families are created equal. They can contain different combinations of MDC assignments. Generally our SOI assignment follows the logic above, but our current methodology prioritizes a full range of crosswalk items, meaning each pre-MDC DRG family should map to all SOIs of an APR-DRG.
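The initial severity mapping above can be expressed as a small lookup. This sketch assumes the severity tier can be read from the description using the phrasing in this document ("no CC/MCC", "with CC", "with MCC"); real descriptor files may abbreviate these differently. Falling back to all four SOIs when no tier is recognized is also an assumption, and the function names and sample codes are hypothetical.

```python
# Tier -> candidate SOI levels, following the initial mapping objective.
SOI_BY_TIER = {"no_cc_mcc": [1], "cc": [2, 3], "mcc": [4]}
ALL_SOIS = [1, 2, 3, 4]

def severity_tier(description):
    """Return 'no_cc_mcc', 'cc', 'mcc', or None if no single tier is stated."""
    text = description.lower()
    if "no cc/mcc" in text or "without cc/mcc" in text:
        return "no_cc_mcc"
    if "with cc/mcc" in text:
        return None  # combined tier: no single severity level applies
    if "with mcc" in text:
        return "mcc"
    if "with cc" in text:
        return "cc"
    return None

def expand_to_soi(ms_description, base_apr_drg):
    """Pair a base APR-DRG with the SOI levels implied by the MS-DRG tier."""
    sois = SOI_BY_TIER.get(severity_tier(ms_description), ALL_SOIS)
    return [(base_apr_drg, soi) for soi in sois]

# Hypothetical description and base code for illustration.
cc_pairs = expand_to_soi("Craniotomy with CC", "APR-X")
```

For the "with CC" example this yields both SOI 2 and SOI 3; choosing between them depends on clinical alignment and is left to review.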