Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates
Zhengdong D Zhang et al. Genome Biol. 2010.
Abstract
Background: Unitary pseudogenes are a class of unprocessed pseudogenes without functioning counterparts in the genome. They constitute only a small fraction of annotated pseudogenes in the human genome. However, as they represent distinct functional losses over time, they shed light on the unique features of humans in primate evolution.
Results: We have developed a pipeline to detect human unitary pseudogenes by analyzing the global inventory of orthologs between the human genome and its mammalian relatives. We focus on gene losses along the human lineage after the divergence from rodents about 75 million years ago. In total, we identify 76 unitary pseudogenes, including previously annotated ones and many novel ones. By comparing each of these to its functioning ortholog in other mammals, we can approximately date the creation of each unitary pseudogene (that is, the gene 'death date') and show that for our group of 76, the functional genes appear to have been disabled at a fairly uniform rate throughout primate evolution, rather than all at once in a single episode correlated, for instance, with the 'Alu burst'. Furthermore, we identify 11 unitary pseudogenes that are polymorphic - that is, they have both nonfunctional and functional alleles currently segregating in the human population. Comparing them with their orthologs in other primates, we find that two of them are in fact pseudogenes in non-human primates, suggesting that they represent cases of a gene being resurrected in the human lineage.
Conclusions: This analysis of unitary pseudogenes provides insights into the evolutionary constraints faced by different organisms and the timescales of functional gene loss in humans.
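The ortholog-inventory idea described in the Results can be illustrated as a simple filter. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the input shapes, and the gene names are hypothetical placeholders, and the synteny checks and manual inspection summarized in Figure 1 are not modeled.

```python
from typing import Dict, List, Optional, Set


def candidate_unitary_pseudogenes(
    functional_human_ortholog: Dict[str, Optional[str]],
    mouse_genes_with_disabled_human_locus: Set[str],
) -> List[str]:
    """Return mouse genes that look like human unitary-pseudogene candidates.

    A mouse gene qualifies when (1) it has no functional human ortholog and
    (2) a disabled human locus matching it has been found. Both inputs are
    assumed to come from upstream steps that are not shown here.
    """
    candidates = []
    for mouse_gene, human_gene in functional_human_ortholog.items():
        has_functional_ortholog = human_gene is not None
        has_disabled_remnant = mouse_gene in mouse_genes_with_disabled_human_locus
        if not has_functional_ortholog and has_disabled_remnant:
            candidates.append(mouse_gene)
    return sorted(candidates)


# Toy inventory with placeholder gene names (hypothetical, not from the paper).
inventory = {"GeneA": "GENEA", "GeneB": None, "GeneC": None}
disabled = {"GeneB"}
print(candidate_unitary_pseudogenes(inventory, disabled))  # ['GeneB']
```

In this toy case GeneC is not reported: it lacks a functional human ortholog, but no disabled human remnant was found, so the sketch cannot distinguish a human loss from, say, a rodent-specific gain.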
Figures
Figure 1
Method for identifying human unitary pseudogenes in comparison to the mouse genome. (a) The overall methodological flowchart. The number of entries in the input/output data set used at certain steps is shown in parentheses. (b) Detailed inspection and synteny check of the potential human unitary pseudogenic loci. Entries in the initial set of pseudogenic loci are removed based on various criteria at different steps. The final results - the unitary pseudogenes and the polymorphic pseudogenes in human - are listed in Tables 1 and 2. See the main text for details. MGI, Mouse Genome Informatics; OR, olfactory receptor; VR, vomeronasal receptor; ZF, zinc finger protein.
Figure 2
The origin of human unitary pseudogenes in the paralogous gene sets. The human unitary pseudogenes with annotation from orthologous mouse genes are assigned to human paralogous gene sets, whose names are shown in the middle. The number of human unitary pseudogenes in each paralogous gene set and the number of members in each paralogous gene set are plotted as green and blue bars, respectively. Five unitary pseudogenes with uninformative annotation are denoted with question marks. Unitary pseudogenes without close paralogs are enclosed by dashed lines. The unitary pseudogenes from the tandem gene families are indicated by gray bars. Inset: box plot of the number of human unitary pseudogenes in each paralogous gene set and the number of members in each paralogous gene set.
Figure 3
The human-specific pseudogene of the major urinary protein. A G-to-A nucleotide substitution (shown in reverse highlight) at the donor site of the second intron (delineated by the underlined splice sites) abolishes the ORF of the coding sequence. The sequence conservation is clearly discernible from the multiple sequence alignment of polypeptide sequences translated from partial exonic sequences upstream and downstream of the splicing junction of MUP from 24 species.
Figure 4
Enrichment of Gene Ontology terms and Pfam domains in the human unitary pseudogenes. Enriched GO terms and their positions in the hierarchy of (a) biological process and (b) molecular function terms. Yellow nodes correspond to significant GO terms. (c) _P_-values for significant GO terms and Pfam domains.
Figure 5
Dating the pseudogenization events. (a) Timing of the disruptive mutations that gave rise to human unitary pseudogenes by analyzing shared mutations. Only pseudogenes with annotations from orthologous mouse genes are shown. Ones without close paralogs are underlined. (b) Timing of several pseudogenization events that occurred in the human lineage after the human-chimp divergence. See Table S3 in Additional file 1 for the estimates and their standard errors. LCA, last common ancestor.
Figure 6
Population structure analysis for SNP rs4940595. (a) Hierarchical clustering of 11 populations using the _F_ST metric. Two subdivisions in the meta-population, as indicated by the dashed line, are clearly visible in the cluster. (b) Histogram of _F_ST from the permutation test using the population subdivisions as seen in (a).
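The permutation test in panel (b) can be sketched for a single biallelic SNP and two population subdivisions. This is an illustrative sketch only: the source does not state which F_ST estimator or permutation scheme was used, so the heterozygosity-based F_ST, the allele-level shuffling, and all function names below are assumptions. The hierarchical clustering in panel (a) is not covered.

```python
import random


def fst_two_groups(alleles_a, alleles_b):
    """Heterozygosity-based F_ST for one biallelic SNP (alleles coded 0/1)."""
    n_a, n_b = len(alleles_a), len(alleles_b)
    p_a = sum(alleles_a) / n_a
    p_b = sum(alleles_b) / n_b
    p_bar = (sum(alleles_a) + sum(alleles_b)) / (n_a + n_b)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled sample
    if h_t == 0:
        return 0.0  # SNP is monomorphic overall, so there is no structure
    # Sample-size-weighted mean of within-group expected heterozygosity.
    h_s = (n_a * 2 * p_a * (1 - p_a) + n_b * 2 * p_b * (1 - p_b)) / (n_a + n_b)
    return (h_t - h_s) / h_t


def permutation_p_value(alleles_a, alleles_b, n_perm=1000, seed=0):
    """Return (observed F_ST, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = fst_two_groups(alleles_a, alleles_b)
    pooled = list(alleles_a) + list(alleles_b)
    n_a = len(alleles_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break any link between allele and group
        if fst_two_groups(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

With made-up data in which one group carries only allele 1 and the other only allele 0, the observed F_ST is 1 and almost no shuffled split can match it, so the permutation p-value is close to its minimum of 1/(n_perm + 1).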
Figure 7
Unitary pseudogene relativity. Given the phylogeny of human, chimpanzee, and mouse, a human unitary pseudogene can arise from a gene loss that occurred in different lineages, including: (a) the human lineage after the human-chimp divergence; (b) the human-ancestral lineage after the human-mouse divergence but before the human-chimp divergence; and under different circumstances, such as (c) loss of a subfunctionalized gene in the human lineage after a duplication event before the human-chimp divergence. Because the absence of a functional gene in a species is only identifiable through comparison with another species that has the functional ortholog, the human unitary pseudogene can be identified in (a) by comparing the human gene set to either the chimp or the mouse set, as both of them have the human ortholog. In (b, c), however, the human unitary pseudogene can be identified by comparing the human gene set to only one of the mouse and chimp gene sets, as the other one does not have the human ortholog given the evolutionary history of the gene under consideration.
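The reasoning in this caption reduces to a lookup: a human loss is visible only against a reference species that still carries the functional ortholog. The sketch below encodes the three scenarios as presence/absence flags. The function name and the flag values are assumptions for illustration; in particular, scenario (c) is read here as the mouse lacking a one-to-one ortholog of the duplicated copy.

```python
def references_revealing_human_loss(functional_in):
    """List reference species against which a human loss is detectable.

    `functional_in` maps species name -> True if that species has a
    functional ortholog of the gene under consideration.
    """
    if functional_in.get("human", False):
        return []  # the gene is functional in human, so there is no loss
    return sorted(
        species
        for species, is_functional in functional_in.items()
        if species != "human" and is_functional
    )


# Scenario flags follow one reading of panels (a)-(c); they are assumptions.
scenarios = {
    "a": {"human": False, "chimp": True, "mouse": True},
    "b": {"human": False, "chimp": False, "mouse": True},
    "c": {"human": False, "chimp": True, "mouse": False},
}
for label, flags in scenarios.items():
    print(label, references_revealing_human_loss(flags))
# a ['chimp', 'mouse']
# b ['mouse']
# c ['chimp']
```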
Figure 8
Polymorphic pseudogenes in human populations. (a) Human-specific pseudogenic polymorphism generated by gene inactivation. (b) Pseudogenic polymorphism since the last common ancestor. (c) Human-specific pseudogenic polymorphism generated by pseudogene resurrection.
Similar articles
- Comparative genomics search for losses of long-established genes on the human lineage.
Zhu J, Sanborn JZ, Diekhans M, Lowe CB, Pringle TH, Haussler D. PLoS Comput Biol. 2007 Dec;3(12):e247. doi: 10.1371/journal.pcbi.0030247. PMID: 18085818 Free PMC article.
- Using pseudogene database to identify lineage-specific genes and pseudogenes in humans and chimpanzees.
Zhang Q. J Hered. 2014 May-Jun;105(3):436-43. doi: 10.1093/jhered/est097. Epub 2014 Jan 7. PMID: 24399747
- The olfactory receptor gene repertoire in primates and mouse: evidence for reduction of the functional fraction in primates.
Rouquier S, Blancher A, Giorgi D. Proc Natl Acad Sci U S A. 2000 Mar 14;97(6):2870-4. doi: 10.1073/pnas.040580197. PMID: 10706615 Free PMC article.
- Comparative analysis of processed pseudogenes in the mouse and human genomes.
Zhang Z, Carriero N, Gerstein M. Trends Genet. 2004 Feb;20(2):62-7. doi: 10.1016/j.tig.2003.12.005. PMID: 14746985 Review.
- Pseudogenes and their composers: delving in the 'debris' of human genome.
Sen K, Ghosh TC. Brief Funct Genomics. 2013 Nov;12(6):536-47. doi: 10.1093/bfgp/elt026. Epub 2013 Jul 29. PMID: 23900003 Review.
Cited by
- Circadian period is compensated for repressor protein turnover rates in single cells.
Gabriel CH, Del Olmo M, Rizki Widini A, Roshanbin R, Woyde J, Hamza E, Gutu NN, Zehtabian A, Ewers H, Granada A, Herzel H, Kramer A. Proc Natl Acad Sci U S A. 2024 Aug 20;121(34):e2404738121. doi: 10.1073/pnas.2404738121. Epub 2024 Aug 14. PMID: 39141353
- Illuminating the function of the orphan transporter, SLC22A10, in humans and other primates.
Yee SW, Ferrández-Peral L, Alentorn-Moron P, Fontsere C, Ceylan M, Koleske ML, Handin N, Artegoitia VM, Lara G, Chien HC, Zhou X, Dainat J, Zalevsky A, Sali A, Brand CM, Wolfreys FD, Yang J, Gestwicki JE, Capra JA, Artursson P, Newman JW, Marquès-Bonet T, Giacomini KM. Nat Commun. 2024 May 23;15(1):4380. doi: 10.1038/s41467-024-48569-7. PMID: 38782905 Free PMC article.
- Spatiotemporal transcriptomic changes of human ovarian aging and the regulatory role of FOXP1.
Wu M, Tang W, Chen Y, Xue L, Dai J, Li Y, Zhu X, Wu C, Xiong J, Zhang J, Wu T, Zhou S, Chen D, Sun C, Yu J, Li H, Guo Y, Huang Y, Zhu Q, Wei S, Zhou Z, Wu M, Li Y, Xiang T, Qiao H, Wang S. Nat Aging. 2024 Apr;4(4):527-545. doi: 10.1038/s43587-024-00607-1. Epub 2024 Apr 9. PMID: 38594460 Free PMC article.
- Illuminating the Function of the Orphan Transporter, SLC22A10 in Humans and Other Primates.
Yee SW, Ferrández-Peral L, Alentorn P, Fontsere C, Ceylan M, Koleske ML, Handin N, Artegoitia VM, Lara G, Chien HC, Zhou X, Dainat J, Zalevsky A, Sali A, Brand CM, Capra JA, Artursson P, Newman JW, Marques-Bonet T, Giacomini KM. Res Sq [Preprint]. 2023 Sep 14:rs.3.rs-3263845. doi: 10.21203/rs.3.rs-3263845/v1. PMID: 37790518 Free PMC article. Updated. Preprint.
- Pseudogenes in Cancer: State of the Art.
Nakamura-García AK, Espinal-Enríquez J. Cancers (Basel). 2023 Aug 8;15(16):4024. doi: 10.3390/cancers15164024. PMID: 37627052 Free PMC article. Review.