Enigmatic Translons – Animalcules of Molecular Biology?
Ribosome profiling has revealed extensive use of multiple start codons in eukaryotic mRNAs, including those of humans. Translation initiation at these start codons produces proteoforms with alternative N-termini and also leads to translation of regions (translons) that are not annotated as protein coding. Some of these enigmatic translons encode microproteins and immunopeptides, while many function as critical elements of translational control that sense the cellular environment. This regulatory function is often independent of the polypeptide products that the translons encode.
The emerging picture of RNA translation is in striking contrast with the current data structures for annotating protein-coding genes, in which an RNA molecule can have only a single protein-coding sequence (CDS). To address this limitation, we developed a new abstract representation of mRNA translation termed Ribosome Decision Graphs (RDGs). RDGs allow for a biologically realistic representation of eukaryotic translation complexity, focusing on locations critical for translation regulation. They can depict the mutual organization of translons, reflecting the interplay between their translation, and can be used in the analysis of ribosome profiling data to identify critical regulatory events. They can also be used to interpret genetic variants, enabling a mechanistic understanding of pathogenic variants that occur outside of CDS regions.
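
To make the idea of a decision graph concrete, here is a minimal sketch in Python. It is an illustration under strong assumptions and not the RDG formalism itself. It treats each start codon as a decision point where a scanning ribosome either initiates or scans past (leaky scanning), and it ignores reinitiation, frameshifting and stop codon readthrough. The names `Start` and `translon_fluxes`, the positions and the initiation probabilities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Start:
    """One decision point: a start codon on the transcript (hypothetical model)."""
    name: str          # label of the translon that begins at this start codon
    position: int      # 0-based position of the start codon on the transcript
    p_initiate: float  # ASSUMED probability that a scanning ribosome initiates here


def translon_fluxes(starts):
    """Return the fraction of scanning ribosomes that translate each translon.

    Start codons are visited in 5' to 3' order. At each one the ribosome takes
    one of two branches: initiate (enter the translon) or keep scanning.
    Simplifying assumption: a ribosome that initiates never resumes scanning.
    """
    fluxes = {}
    scanning = 1.0  # fraction of ribosomes still scanning at this point
    for start in sorted(starts, key=lambda s: s.position):
        fluxes[start.name] = scanning * start.p_initiate
        scanning *= 1.0 - start.p_initiate
    fluxes["no initiation"] = scanning  # ribosomes that scanned past every start
    return fluxes


# Made-up transcript with three start codons (values are illustrative only).
starts = [
    Start("uORF", 30, 0.5),
    Start("N-extended proteoform", 120, 0.2),
    Start("annotated CDS", 150, 0.9),
]

fluxes = translon_fluxes(starts)
for name, flux in fluxes.items():
    print(f"{name}: {flux:.2f}")
```

With these made-up probabilities, half of the scanning ribosomes are captured by the upstream translon and only 36% translate the annotated CDS. Even this toy chain shows how one translon can influence the translation of another without any role for its polypeptide product, and how a variant that creates or strengthens an upstream start codon could change CDS output while leaving the CDS sequence untouched.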