More than 10 million cancer mutations in TCGA have no known functional effect. Rather than characterizing each one experimentally, this dataset uses transcription factor activity as a proxy: it compares expression patterns in tumors carrying uncharacterized mutations against those in tumors carrying known gain- or loss-of-function variants to infer function at scale.
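The comparison described above can be illustrated with a minimal sketch. It is not the dataset's actual pipeline: the function name `classify_by_activity`, the use of Pearson correlation, the 0.5 threshold, and the five-element activity vectors are all assumptions chosen for illustration. The idea is only that an uncharacterized mutation whose transcription factor activity profile resembles a known gain-of-function (GOF) or loss-of-function (LOF) reference can be labeled accordingly.

```python
import numpy as np

def classify_by_activity(profile, gof_ref, lof_ref, threshold=0.5):
    """Label a mutation by correlating its TF-activity profile with references.

    Hypothetical helper: Pearson correlation and the threshold are
    illustrative assumptions, not the method used to build the dataset.
    """
    r_gof = np.corrcoef(profile, gof_ref)[0, 1]
    r_lof = np.corrcoef(profile, lof_ref)[0, 1]
    # If the profile resembles neither reference, make no call.
    if max(r_gof, r_lof) < threshold:
        return "unclassified"
    return "GOF-like" if r_gof > r_lof else "LOF-like"

# Made-up activity scores for five transcription factors.
gof_ref = np.array([2.0, 1.5, -1.0, -2.0, 0.5])
lof_ref = -gof_ref
query = np.array([1.8, 1.2, -0.7, -2.2, 0.4])

print(classify_by_activity(query, gof_ref, lof_ref))
```

A profile that matches neither reference is left unclassified here; the sketch does not attempt to detect neomorphic or phenocopying mutations.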
Across TCGA cohorts, 577,866 mutational events were annotated this way, including neomorphic mutations and mutations that phenocopy other variants. Predictions were validated by introducing 37 breast cancer PIK3CA mutations into MCF10A cells via lentiviral knock-in and profiling them by PLATE-Seq: 15 of 15 gain- or loss-of-function classifications and 15 of 20 neomorphic classifications were confirmed.
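The validation counts above translate into per-class confirmation rates, computed in the short sketch below. The dictionary layout and the function name `confirmation_rates` are illustrative assumptions; only the counts come from the text. Note that the two classes listed account for 35 of the 37 introduced mutations, and the text does not say how the remaining two were classified, so the pooled rate covers those 35 only.

```python
# (confirmed, tested) counts per predicted class, taken from the text above.
validated = {
    "gain/loss of function": (15, 15),
    "neomorphic": (15, 20),
}

def confirmation_rates(counts):
    """Return the fraction of confirmed predictions for each class."""
    return {label: confirmed / tested for label, (confirmed, tested) in counts.items()}

rates = confirmation_rates(validated)

# Pooled rate over the 35 mutations covered by the two classes.
total_confirmed = sum(confirmed for confirmed, _ in validated.values())
total_tested = sum(tested for _, tested in validated.values())
overall = total_confirmed / total_tested

for label, rate in rates.items():
    print(f"{label}: {rate:.0%}")
print(f"pooled: {overall:.1%}")
```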
For patients carrying variants of unknown significance in established oncoproteins, these annotations could inform targeted therapy selection.