The Core Concept
Codebreaker’s platform generates tens of thousands of proprietary data points for every genetic variant, creating a deep phenotypic fingerprint of how that variant shapes overall cell function.
Using our proprietary algorithms, we mathematically compare these high-dimensional profiles to quantify how closely each variant’s cellular behavior matches a healthy baseline or diverges toward pathogenicity. Known pathogenic variants (e.g., G13R) show minimal overlap with the healthy baseline, while benign variants (e.g., A66A) overlap extensively.
Most importantly, for variants of unknown or ambiguous significance (e.g., A130V or A146T), this analysis provides the functional, quantitative evidence needed to assign biological meaning—turning uncertainty into actionable answers.
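As a rough illustration of the comparison described above, the sketch below scores each variant profile against a healthy baseline profile. It is not Codebreaker's proprietary algorithm, which is not described here. It assumes cosine similarity as a stand-in metric, and the function names, the four-feature vectors, and the labels (benign_like, uncertain, pathogenic_like) are hypothetical, synthetic placeholders rather than measured data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length profile vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of features")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("profiles must be non-zero")
    return dot / (norm_a * norm_b)


def rank_by_baseline_similarity(baseline, profiles):
    """Return (name, similarity) pairs, most baseline-like first."""
    scores = {
        name: cosine_similarity(vector, baseline)
        for name, vector in profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic toy profiles: 4 made-up features instead of tens of thousands.
healthy_baseline = [1.0, 0.5, 0.0, 2.0]
toy_profiles = {
    "benign_like": [1.1, 0.4, 0.1, 1.9],
    "uncertain": [0.8, 1.0, 1.0, 1.0],
    "pathogenic_like": [-1.0, 2.0, 3.0, -0.5],
}

for name, score in rank_by_baseline_similarity(healthy_baseline, toy_profiles):
    print(f"{name}: {score:.3f}")
```

In this toy setup, a score near 1 means the profile closely tracks the baseline, while a score near or below 0 means it diverges. A variant of uncertain significance would fall somewhere on that continuum, which is what lets a quantitative comparison place it relative to known benign and pathogenic variants.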
The result is a continuously expanding Codex of Variant Profiles: a proprietary dataset and knowledge base that compounds in value with every new variant mapped. This level of causal insight is only possible at Codebreaker Labs.
Our breakthrough is that we have scaled this method to map tens of thousands of variants in parallel, with further scaling strategies in development. Critically, we have done so while retaining the efficiency needed to use the most powerful fingerprinting assays, including single-cell omic technologies. This scaled method provides the ground-truth data that is foundational to the future of biological AI.
Codex of Variants
(3 examples of what we are building)
-
KRAS
The KRAS gene encodes a small GTPase that acts as a molecular switch, cycling between inactive and active states, and relays signals downstream of receptor tyrosine kinases to regulate cell growth and survival. Several hundred KRAS variants are cataloged in public databases, many implicated in cancers such as lung, colorectal, and pancreatic cancer. Two FDA-approved targeted therapies, sotorasib and adagrasib, specifically inhibit the KRAS G12C mutant. In our Codex of Variant Profiles, we have functionally mapped [X] KRAS variants, compared to roughly 337 variants currently annotated in ClinVar.
-
TP53
The TP53 gene encodes the p53 protein, a critical tumor suppressor that monitors genomic integrity and triggers cell cycle arrest, DNA repair, or apoptosis in response to cellular stress. Close to 1,000 pathogenic TP53 variants have been described in the literature, many associated with breast cancer, ovarian cancer, brain tumors, and sarcomas, and clinically relevant in Li-Fraumeni syndrome. While no FDA-approved therapies yet target TP53 mutations directly, TP53 remains a major focus in precision oncology research.
In our Codex of Variant Profiles, we have functionally mapped [X] TP53 variants, compared to approximately 2,700 TP53 variants currently annotated in ClinVar.
-
ABL1
The ABL1 gene encodes a non-receptor tyrosine kinase that regulates cell differentiation, division, and adhesion. It is best known for the BCR-ABL1 fusion, which drives chronic myeloid leukemia (CML). Hundreds of ABL1 mutations and fusions are implicated across hematologic malignancies, and targeted tyrosine kinase inhibitors such as imatinib, dasatinib, and ponatinib carry labeling for BCR-ABL1 alterations.
In our Codex of Variant Profiles, we have functionally mapped [X] ABL1 variants, compared to approximately 556 variants currently annotated in ClinVar (across all significance categories).