
The human genome encodes roughly 19,500 canonical protein-coding genes, but growing evidence suggests thousands of additional small proteins, called microproteins, are translated from non-canonical open reading frames (ncORFs) that reference databases have not traditionally recognised. A key challenge has been establishing rigorous, standardised criteria to determine which of these microproteins warrant formal annotation as protein-coding genes, and how to classify those that fall short of that threshold.
The TransCODE Consortium, in collaboration with GENCODE, PeptideAtlas, the Human Proteome Organization-Human Proteome Project (HUPO-HPP) and the HUPO-Human ImmunoPeptidome Project (HUPO-HIPP), set out to address this gap by developing a standardised annotation framework for ncORF-encoded microproteins. The researchers interrogated over 7,200 ncORFs using complementary approaches: large-scale conventional proteomics, HLA immunopeptidomics, ribosome profiling, evolutionary constraint analysis, and CRISPR-based functional genomics.
Targeted peptide analysis by parallel reaction monitoring (PRM), with isotopically labelled synthetic peptides spiked into tryptic digests of cultured cell lysates, was used to validate the endogenous expression of ncORF-derived microproteins. The peptides were analysed using an Aurora® Elite™ 15×75 XS C18 UHPLC column on a SCIEX ZenoTOF 8600 coupled to an ACQUITY UPLC M-Class with an OptiFlow Pro ion source.
Leveraging this multi-modal workflow, the researchers established that a subset of microproteins fulfil the criteria for reclassification as canonical protein-coding genes, while introducing “peptidein” as a new formal classification for translated microproteins of currently indeterminate biological consequence. One peptidein, encoded within the OLMALINC transcript, was found to be essential for cancer cell viability, with roles implicated in mitosis and the DNA damage response.
This work establishes a replicable roadmap for expanding the annotated human proteome and opens new avenues for cancer immunotherapy and genetic disease research.
Publication
Nature
Authors
Eric W. Deutsch, Leron W. Kok, Jonathan M. Mudge, Cristian F. Valls, Irwin Jungreis, Jorge Ruiz-Orera, Zhi Sun, Ulrike Kusebauch, Ivo Fierro-Monti, Jennifer G. Abelin, M. Mar Alba, Julie L. Aspden, Sreejan Bandyopadhyay, Kaushik Banerjee, Pavel V. Baranov, Ariel A. Bazzini, Francis Bourassa, Elspeth A. Bruford, Lorenzo Calviello, Steven A. Carr, Anne-Ruxandra Carvunis, Sonia Chothani, Jim Clauwaert, Kellie Dean, Pouya Faridi, Adam Frankish, Amy Goodale, Thomas Green, Norbert Hubner, Nicholas T. Ingolia, Manolis Kellis, Michele Magrane, Maria Jesus Martin, Thomas F. Martinez, Gerben Menschaert, Uwe Ohler, Sandra Orchard, Alisa Potter, Owen J. L. Rackham, Matthew G. Rees, David E. Root, Jennifer A. Roth, Xavier Roucou, Fernando J. Sialana, Sarah A. Slavoff, Michał I. Świrski, Jack A. S. Tierney, Félix-Antoine Trifiro, Eivind Valen, Valeriia Vasylieva, Aaron Wacholder, Shengbo Wang, Li Wang, Jonathan S. Weissman, Wei Wu, Zhi Xie (谢志), Jyoti S. Choudhary, Michal Bassani-Sternberg, Juan Antonio Vizcaíno, Nicola Ternette, Marie A. Brunet, Robert L. Moritz, John R. Prensner & Sebastiaan van Heesch
Title
Expanding the human proteome with microproteins and peptideins


