Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures (original) (raw)
A short open reading frame (sORFs) constitutes ≤ 300 bases, encoding a microprotein or sORF-encoded protein (SEP) which comprises ≤ 100 amino acids. Traditionally dismissed by genome annotation pipelines as meaningless noise, sORFs were found to possess coding potential with ribosome profiling (RIBO-Seq), which unveiled sORF-based transcripts at various genome locations. Nonetheless, the existence of corresponding microproteins that are stable and functional was little substantiated by experimental evidence initially. With recent advancements in multi-omics, the identification, validation, and functional characterisation of sORFs and microproteins have become feasible. In this review, we discuss the history and development of an emerging research field of sORFs and microproteins. In particular, we focus on an array of bioinformatics and OMICS approaches used for predicting, sequencing, validating, and characterizing these recently discovered entities. These strategies include RIBO-S...