Network Biology and Pathway Convergence
STRING PPI network, GenePT gene embeddings, SynLethDB synthetic lethality, PrimeKG knowledge graph, and DISCOVER mutual exclusivity.
- Convergence Nodes: 156. PrimeKG: all 5 genes within 1-2 hops (129,262 nodes, 8,100,498 edges).
- Genes Connected: 5/5 (complete). All patient genes found in the PrimeKG knowledge graph.
- Pathway Redundancy: REJECTED. GenePT: IDH2-SETBP1 at only the 6.6th percentile of similarity.
- SL Pairs: 5/10. SynLethDB: predicted or possible synthetic lethal gene pairs.
STRING Protein-Protein Interaction Network
The STRING v12 network [44] reveals strong functional connections
between patient genes. DNMT3A and EZH2 share the highest
combined score among patient gene pairs (0.999), consistent with their shared epigenetic function.
Three of the 5 patient genes (DNMT3A, EZH2, and IDH2) are linked by direct
high-confidence interactions (score ≥ 0.7): DNMT3A-EZH2, DNMT3A-IDH2, and EZH2-IDH2.
SETBP1 and PTPN11 connect through the broader myeloid gene network
(via ASXL1, FLT3, and other myeloid regulators) rather than through direct PPI evidence.
| Gene A | Gene B | Combined Score | Text-mining | Experimental | In Patient? |
|---|---|---|---|---|---|
| DNMT3A | EZH2 | 0.999 | 0.998 | 0.519 | Both |
| DNMT3A | DNMT3L | 0.999 | 0.984 | 0.986 | One |
| U2AF1 | SRSF2 | 0.999 | 0.997 | 0.726 | Neither |
| EZH2 | ASXL1 | 0.998 | 0.980 | 0.314 | One |
| FRS3 | PTPN11 | 0.993 | 0.983 | 0.270 | One |
| PTPN11 | MPIG6B | 0.982 | 0.923 | 0.292 | One |
| ASXL1 | TET2 | 0.975 | 0.975 | 0.000 | Neither |
| RUNX1 | PTPN11 | 0.960 | 0.611 | 0.000 | One |
| MPZL1 | PTPN11 | 0.954 | 0.880 | 0.632 | One |
| NRAS | PTPN11 | 0.952 | 0.892 | 0.153 | One |
| EZH2 | EPOP | 0.947 | 0.642 | 0.786 | One |
| DNMT3A | TET2 | 0.946 | 0.926 | 0.292 | One |
| ZRSR2 | SRSF2 | 0.943 | 0.727 | 0.000 | Neither |
| PILRB | PTPN11 | 0.941 | 0.440 | 0.000 | One |
| DNMT3A | ASXL1 | 0.939 | 0.936 | 0.091 | One |
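Table 1. Top 15 protein-protein interactions in the expanded network, sorted by combined score. The In Patient? column indicates whether both, one, or neither gene of the pair is a patient gene.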
Source: STRING v12 (Szklarczyk et al., 2023).
Required score 700 (high confidence). 5 query genes + 20 additional interactors.
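The filtering and labelling used in Table 1 can be sketched as follows. This is a minimal illustration rather than the pipeline that produced the table: the rows are a subset copied from Table 1, and the names `patient_membership` and `high_confidence` are hypothetical helpers introduced here. The threshold of 0.7 corresponds to the required score of 700 stated above.

```python
# Patient genes as named in this section.
PATIENT_GENES = {"DNMT3A", "EZH2", "IDH2", "SETBP1", "PTPN11"}

# (gene_a, gene_b, combined_score): a subset of rows copied from Table 1.
EDGES = [
    ("DNMT3A", "EZH2", 0.999),
    ("DNMT3A", "DNMT3L", 0.999),
    ("U2AF1", "SRSF2", 0.999),
    ("EZH2", "ASXL1", 0.998),
    ("ASXL1", "TET2", 0.975),
]


def patient_membership(gene_a, gene_b, patient_genes=PATIENT_GENES):
    """Return 'Both', 'One', or 'Neither' depending on how many genes are patient genes."""
    hits = (gene_a in patient_genes) + (gene_b in patient_genes)
    return {2: "Both", 1: "One", 0: "Neither"}[hits]


def high_confidence(edges, threshold=0.7):
    """Keep edges whose combined score meets the threshold and attach the membership label."""
    return [
        (a, b, score, patient_membership(a, b))
        for a, b, score in edges
        if score >= threshold
    ]


labelled = high_confidence(EDGES)
for gene_a, gene_b, score, label in labelled:
    print(f"{gene_a}-{gene_b}\t{score:.3f}\t{label}")
```

Raising `threshold` narrows the network to fewer, better-supported edges; the membership label is independent of the score.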
GenePT Functional Similarity
Pathway redundancy hypothesis: REJECTED. If the five
driver mutations were functionally redundant, we would expect high pairwise
embedding similarity (consistent co-occurrence would reflect backup pathway
activation). Instead, GenePT-derived cosine similarities place half of the
pairs (5 of 10) in the low class, including
IDH2-SETBP1 at only the 6.6th percentile,
IDH2-PTPN11 at the 3.9th percentile, and
DNMT3A-SETBP1 at the 2.3rd percentile.
DNMT3A-EZH2 (98.6th percentile) shows the highest similarity,
which is expected given their shared epigenetic function; SETBP1-PTPN11
(95.9th) and DNMT3A-IDH2 (77.4th) are the only other pairs classed as high.
The mean similarity of co-occurring pairs (0.390)
exceeds that of the single exclusive pair (0.249),
but the dominant pattern is functional diversity, not redundancy.
| Gene Pair | Cosine Similarity | Percentile | Class | Co-occurrence |
|---|---|---|---|---|
| DNMT3A / EZH2 | 0.6511 | 98.6th | High | unknown |
| SETBP1 / PTPN11 | 0.5958 | 95.9th | High | Co-occurring |
| DNMT3A / IDH2 | 0.4623 | 77.4th | High | Co-occurring |
| IDH2 / EZH2 | 0.4342 | 67.4th | Medium | unknown |
| PTPN11 / EZH2 | 0.3436 | 29.6th | Medium | unknown |
| SETBP1 / EZH2 | 0.3289 | 24.8th | Low | unknown |
| DNMT3A / PTPN11 | 0.2764 | 10.5th | Low | Co-occurring |
| IDH2 / SETBP1 | 0.2492 | 6.6th | Low | Exclusive |
| IDH2 / PTPN11 | 0.2241 | 3.9th | Low | Co-occurring |
| DNMT3A / SETBP1 | 0.2040 | 2.3rd | Low | neutral |
Table 2. Pairwise functional similarity for all 10 patient gene pairs. Percentile ranks calculated against all 561 gene pairs from the 34-gene target panel.
Source: GenePT-inspired embeddings (Chen & Zou, 2024).
Sentence-transformer model: all-MiniLM-L6-v2 (384 dimensions).
Distribution: mean=0.398,
SD=0.103.
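The quantities in Table 2 can be outlined in a few lines of Python. This is a sketch under stated assumptions: the embeddings themselves are not given in this section, so `cosine_similarity` is shown on its own, and the percentile convention (share of background values strictly below the query) is an assumption, since the source does not say how ties are handled. The group means use the cosine similarities copied from Table 2.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def percentile_rank(value, background):
    """Percent of background values strictly below `value` (assumed convention)."""
    below = sum(1 for b in background if b < value)
    return 100.0 * below / len(background)


# Cosine similarities copied from Table 2, grouped by the Co-occurrence column.
CO_OCCURRING = [0.5958, 0.4623, 0.2764, 0.2241]
EXCLUSIVE = [0.2492]

mean_co_occurring = sum(CO_OCCURRING) / len(CO_OCCURRING)
mean_exclusive = sum(EXCLUSIVE) / len(EXCLUSIVE)

print(f"co-occurring mean: {mean_co_occurring:.3f}")
print(f"exclusive mean:    {mean_exclusive:.3f}")
```

In the analysis described above, the background would be the 561 pairwise similarities from the 34-gene panel; note that the exclusive group contains a single pair, so its mean is just that pair's similarity.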
Synthetic Lethality Analysis
5 of 10 gene pairs show predicted or possible
synthetic lethal (SL) interactions (Table 3 statuses: Predicted SL, Possible SL,
Weak SL, or Context-dependent). The strongest signals:
DNMT3A + EZH2 (dual epigenetic repression loss),
EZH2 + PTPN11 (PRC2 loss with RAS hyperactivation), and
IDH2 + SETBP1 (supported by GENIE mutual exclusivity,
O/E = 0.13 for IDH1+SETBP1; note that this O/E is for IDH1, and the DISCOVER
test below finds no significant IDH2-SETBP1 exclusivity). The IDH2-SETBP1 SL signal is notable because
it may explain why the patient's IDH2 subclone remains at very low VAF (2%):
negative selection from the dominant SETBP1 clone (VAF 34%). DNMT3A + IDH2
is explicitly NOT synthetic lethal; this is a well-established cooperative pair [12].
| Gene Pair | SL Status | Evidence | Direction | Therapeutic Angle |
|---|---|---|---|---|
| DNMT3A / EZH2 | Predicted SL | Medium | both loss-of-function in patient | EZH2 inhibitors (tazemetostat) may be selectively toxic to DNMT3A-mutant cells. However, in this patient EZH2 is already... |
| DNMT3A / IDH2 | Not SL | High | DNMT3A loss + IDH2 gain cooperate (NOT SL) | Enasidenib (IDH2 inhibitor) + azacitidine (targets remaining DNMT1) is the standard combination. Venetoclax adds BCL2 in... |
| DNMT3A / PTPN11 | Weak SL | Low | DNMT3A loss + PTPN11 gain (different pathways) | SHP2 inhibitors (TNO155, RMC-4550) + azacitidine combination may exploit both vulnerabilities. No direct SL-based strate... |
| DNMT3A / SETBP1 | Unknown | Low | DNMT3A loss + SETBP1 gain (uncharacterized pair) | No SL-based therapeutic strategy available for this pair. |
| EZH2 / IDH2 | Context-dependent | Medium | EZH2 loss + IDH2 gain (opposing epigenetic effects) | Enasidenib (IDH2i) would remove the compensatory H3K27me3 maintenance, potentially lethal in EZH2-mutant cells. This cou... |
| EZH2 / PTPN11 | Predicted SL | Medium | EZH2 loss + PTPN11 gain (PRC2 + RAS crosstalk) | MEK inhibitors (trametinib) or SHP2 inhibitors may be selectively effective in EZH2-mutant + PTPN11-mutant cells. SWI/SN... |
| EZH2 / SETBP1 | Unknown | Low | EZH2 loss + SETBP1 gain (no published data) | No SL-based therapeutic strategy available for this pair. |
| IDH2 / PTPN11 | Not SL | Medium | IDH2 gain + PTPN11 gain (both activating, may cooperate) | Combination of enasidenib (IDH2i) + SHP2 inhibitor targets both pathways. No SL-based rationale, but dual pathway inhibi... |
| IDH2 / SETBP1 | Possible SL | Low | IDH2 gain + SETBP1 gain (mutual exclusivity observed) | The observed mutual exclusivity between IDH and SETBP1 is one of the strongest findings in the co-occurrence analysis. I... |
| PTPN11 / SETBP1 | Not SL | Medium | PTPN11 gain + SETBP1 gain (both drive proliferation) | SHP2 inhibitors would target PTPN11 directly. PP2A activators (e.g., FTY720/fingolimod) could counteract SETBP1-mediated... |
Table 3. Synthetic lethality assessment for all 10 pairwise combinations of patient genes.
Source: SynLethDB 2.0 (synlethdb.sist.shanghaitech.edu.cn).
Supplemented with DepMap CRISPR screen data and published literature.
DISCOVER Mutual Exclusivity
DISCOVER [30] controls for gene-level mutation rates
and sample-level mutation burden when testing mutual exclusivity, using a
Poisson-Binomial model across 18,625 myeloid
samples. After permutation correction (n=10,000), IDH2-SETBP1 is NOT
mutually exclusive (p = 0.9999). The slight depletion seen with
Fisher's exact test (O/E = 0.905, p = 0.7402) is not significant on its own and
disappears once background mutation rates are
properly controlled. All 10 pairs show DISCOVER p-values near 1.0, indicating
no pair is mutually exclusive beyond what mutation rate differences explain.
The strongest co-occurrence by DISCOVER Z-score is DNMT3A-IDH2 (Z = 46.63, Fisher O/E = 2.74);
the largest Fisher O/E is SETBP1-EZH2 (4.957).
| Gene Pair | Observed | Fisher O/E | Fisher p-value | DISCOVER Z | Direction |
|---|---|---|---|---|---|
| DNMT3A / IDH2 | 380 | 2.741 | 2.46e-87 | 46.63 | Co-occurring |
| DNMT3A / SETBP1 | 62 | 1.060 | 0.6184 | 8.16 | Co-occurring |
| DNMT3A / PTPN11 | 139 | 1.953 | 1.60e-15 | 21.02 | Co-occurring |
| DNMT3A / EZH2 | 105 | 1.098 | 0.2876 | 11.10 | Co-occurring |
| IDH2 / SETBP1 | 20 | 0.905 | 0.7402 | 4.02 | Exclusive |
| IDH2 / PTPN11 | 41 | 1.439 | 0.0199 | 9.63 | Co-occurring |
| IDH2 / EZH2 | 59 | 1.598 | 3.78e-4 | 12.30 | Co-occurring |
| SETBP1 / PTPN11 | 41 | 3.623 | 1.14e-12 | 16.89 | Co-occurring |
| SETBP1 / EZH2 | 75 | 4.957 | 2.95e-31 | 27.70 | Co-occurring |
| PTPN11 / EZH2 | 54 | 2.908 | 1.32e-12 | 16.92 | Co-occurring |
Table 4. DISCOVER pairwise mutual exclusivity test results. Fisher O/E and p-values shown alongside DISCOVER Z-scores. In this table, the Direction label tracks whether Fisher O/E is above or below 1.
Source: GENIE v19.0 (18,625 samples).
DISCOVER method (Canisius et al., Genome Biology, 2016).
Hypermutation threshold: 40 coding mutations.
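The two statistics compared in this section can be sketched as below. This is a simplified illustration, not the DISCOVER implementation: the per-sample alteration probabilities are taken as given inputs (estimating them from the background mutation matrix is the core of the published method and is not shown), the permutation correction is omitted, and all function names and example counts are hypothetical.

```python
def observed_over_expected(observed, n_a, n_b, n_samples):
    """O/E ratio with the expectation under independence: n_a * n_b / n_samples."""
    expected = n_a * n_b / n_samples
    return observed / expected


def poisson_binomial_pmf(probs):
    """PMF of the number of successes over independent trials with unequal probabilities."""
    pmf = [1.0]  # with zero trials, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def exclusivity_p_value(observed, probs_a, probs_b):
    """P(co-occurrences <= observed) when each sample co-alters with probability p_a * p_b."""
    co_probs = [pa * pb for pa, pb in zip(probs_a, probs_b)]
    pmf = poisson_binomial_pmf(co_probs)
    return sum(pmf[: observed + 1])


# Hypothetical counts: 20 co-mutated samples, 100 and 50 single-gene carriers, 1,000 samples.
print(observed_over_expected(20, 100, 50, 1000))
# Two samples, each certain to carry gene A and 50% likely to carry gene B.
print(exclusivity_p_value(0, [1.0, 1.0], [0.5, 0.5]))
```

A small p-value from `exclusivity_p_value` would indicate fewer co-occurrences than sample-specific rates predict; values near 1.0, as reported for all 10 pairs above, indicate no such depletion. The dynamic-programming PMF is quadratic in the number of samples, so a cohort of 18,625 samples would call for an approximation or a vectorized implementation.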
PrimeKG Knowledge Graph
- Graph Size: 129,262 nodes; 8,100,498 edges across 10 relation types.
- Convergence Nodes: 156. Nodes connected to all 5 patient genes within 1-2 hops.
- Strongest Overlap: SETBP1-PTPN11. Overlap coefficient 0.787 (highest pairwise).
PrimeKG (Chandak et al., 2023) integrates 20 biomedical data sources into a
knowledge graph with 129,262 nodes and 8,100,498 edges. All
5 patient genes are represented, with 156 shared convergence
nodes connected to all 5. The pairwise neighborhood overlap ranges from
Jaccard 0.09 (PTPN11-EZH2, lowest) to 0.45 (IDH2-SETBP1, highest).
The overlap coefficient (shared neighbors divided by the size of the smaller
neighborhood) reveals that
SETBP1-PTPN11 (0.787) and SETBP1-IDH2 (0.760) share the largest fraction
of their neighborhoods, consistent with their convergence on myeloid disease nodes.
| Gene Pair | Shared Neighbors | Jaccard Similarity | Overlap Coefficient |
|---|---|---|---|
| DNMT3A / IDH2 | 114 | 0.2721 | 0.4302 |
| DNMT3A / SETBP1 | 121 | 0.3700 | 0.6612 |
| DNMT3A / PTPN11 | 114 | 0.1603 | 0.4302 |
| DNMT3A / EZH2 | 107 | 0.1583 | 0.4038 |
| IDH2 / SETBP1 | 139 | 0.4455 | 0.7596 |
| IDH2 / PTPN11 | 168 | 0.2545 | 0.6269 |
| IDH2 / EZH2 | 76 | 0.1070 | 0.2836 |
| SETBP1 / PTPN11 | 144 | 0.2404 | 0.7869 |
| SETBP1 / EZH2 | 81 | 0.1306 | 0.4426 |
| PTPN11 / EZH2 | 88 | 0.0889 | 0.1699 |
Table 5. Pairwise neighborhood overlap in PrimeKG for all 10 patient gene pairs.
Source: PrimeKG (Chandak, Huang, Zitnik. Scientific Data 2023).
129,262 nodes, 8,100,498 edges across gene/protein, disease,
pathway, molecular function, biological process, and anatomy node types.
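The two overlap measures in Table 5, and the notion of a convergence node, can be written compactly on neighbor sets. The sketch below uses toy neighborhoods with placeholder gene and node names; it does not load PrimeKG, and the function names are illustrative.

```python
def jaccard(a, b):
    """Shared neighbors divided by the size of the union of both neighborhoods."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_coefficient(a, b):
    """Shared neighbors divided by the size of the smaller neighborhood."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def convergence_nodes(neighborhoods):
    """Nodes present in every gene's neighborhood."""
    return set.intersection(*neighborhoods.values())


# Toy neighborhoods (placeholder names, not PrimeKG data).
neighborhoods = {
    "GENE_A": {"n1", "n2", "n3"},
    "GENE_B": {"n2", "n3", "n4", "n5"},
    "GENE_C": {"n3", "n5"},
}

a, b = neighborhoods["GENE_A"], neighborhoods["GENE_B"]
print(jaccard(a, b))              # 2 shared / 5 in union
print(overlap_coefficient(a, b))  # 2 shared / 3 in smaller set
print(convergence_nodes(neighborhoods))
```

Because the overlap coefficient normalizes by the smaller set, a gene with a compact neighborhood can score high against a much larger one even when the Jaccard similarity is modest, which is the pattern seen for the SETBP1 pairs in Table 5.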
References
- Szklarczyk D et al. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res (2023).
- Chen H, Zou J. GenePT: A Simple But Effective Foundation Model for Genes and Cells Built on ChatGPT. bioRxiv (2024).
- Canisius S et al. A novel independence test for somatic alterations in cancer shows that biology drives mutual exclusivity but chance explains most co-occurrence. Genome Biol (2016).
- Chandak P, Huang K, Zitnik M. Building a knowledge graph to enable precision medicine. Sci Data (2023).
- Papaemmanuil E et al. Genomic classification and prognosis in acute myeloid leukemia. N Engl J Med (2016).