Codon Usage Bias and Determining Forces in Taenia solium Genome
Xing Yang et al. Korean J Parasitol. 2015 Dec.
Abstract
The tapeworm Taenia solium is an important zoonotic parasite of humans that causes substantial economic loss and endangers public health. At present, neither an effective vaccine to prevent infection nor a chemotherapy free of side effects is available. In this study, codon usage patterns in the T. solium genome were examined across 8,484 protein-coding genes. Neutrality analysis showed that T. solium has a narrow GC distribution, and GC12 was significantly correlated with GC3. In the NC plot (ENC, the effective number of codons, versus GC3s), a few genes lay on or close to the expected curve, but most points with low ENC values fell below it, suggesting that mutational bias plays a major role in shaping codon usage. Parity Rule 2 (PR2) plot analysis showed that GC and AT were not used proportionally. We also identified 26 optimal codons in the T. solium genome, all of which end with G or C. These optimal codons are likely to match the tRNAs that are highly expressed in the cell, suggesting that mutational bias and translational selection are probably the driving forces of codon usage bias in the T. solium genome.
Keywords: Taenia solium; codon usage bias; intron number; mutation bias; translation selection.
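The expected curve in an NC plot is commonly drawn from Wright's (1990) approximation, which gives the ENC a gene would have if GC3s were the only constraint on codon choice. The abstract does not state which formula was used, so the sketch below assumes that standard form; the function names are illustrative and not taken from the paper.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC when codon choice is constrained only by GC3s.

    Assumes Wright's (1990) approximation:
        ENC = 2 + s + 29 / (s^2 + (1 - s)^2)
    where s is GC3s expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def below_expected_curve(enc: float, gc3s: float) -> bool:
    """True when an observed (GC3s, ENC) point lies under the expected curve."""
    return enc < expected_enc(gc3s)


if __name__ == "__main__":
    # A hypothetical gene with GC3s = 0.5 and an observed ENC of 40.
    print(expected_enc(0.5))
    print(below_expected_curve(40.0, 0.5))
```

GC3s values reported as percentages would need to be divided by 100 before being passed in.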
Conflict of interest statement
We have no conflict of interest related to this work.
Figures
Fig. 1.
The distribution of GC contents in T. solium genes. The GC content of the 8,484 T. solium genes (shown in blue) is unimodal.
Fig. 2.
Correspondence analysis of relative synonymous codon usage (RSCU) for all 8,484 T. solium genes. (A) This panel shows the distribution of genes on the primary and secondary axes (accounting for 16.7% and 14.1% of the total variation, respectively). The 2 classes of genes (High GC and Low GC) are color coded; the high GC genes are shown in red and the low GC genes are shown in blue. (B) This panel shows the underlying distribution of codons on the same 2 axes. Codons ending with G or C are shown in red, and codons ending with A or U are shown in green.
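As a rough illustration of the quantity fed into the correspondence analysis, RSCU for a codon is usually defined as its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes the standard genetic code and a single in-frame coding sequence; the function name `rscu` is hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (an assumption for nuclear genes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """Relative synonymous codon usage for one in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons by the amino acid they encode.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    # Four alanine codons: GCT twice, GCC once, GCA once, GCG never.
    print(rscu("GCTGCTGCCGCA"))
```

A value above 1 means the codon is used more often than expected under equal usage; a value below 1 means it is used less often.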
Fig. 3.
GC12 or ENC vs GC3s plot for 8,484 genes of T. solium. (A) GC12 vs GC3 plot (neutrality plot analysis). The regression line is y=0.5513x+0.2196, R²=0.7309, OP=0.4894. GC3 values ranged from 12.0% to 93.4%. The point where the regression line crosses the diagonal is defined as the optimum point (OP). (B) ENC vs GC3s plot (NC plot); the solid red line indicates the expected ENC. ENC values ranged from 21.5 to 61.0, and GC3s values ranged from 10.8 to 93.3.
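The optimum point follows directly from the caption's definition: setting y = x in the regression line y = ax + b gives x = b / (1 - a). The short check below reproduces the reported value from the reported slope and intercept; the function name is illustrative.

```python
def optimum_point(slope: float, intercept: float) -> float:
    """x-coordinate where the line y = slope * x + intercept meets y = x."""
    if slope == 1.0:
        raise ValueError("a line parallel to the diagonal has no crossing point")
    return intercept / (1.0 - slope)


if __name__ == "__main__":
    # Slope and intercept taken from the Fig. 3A caption.
    op = optimum_point(0.5513, 0.2196)
    print(round(op, 4))  # 0.4894
```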
Fig. 4.
Parity rule (PR2)-bias plot. The red open circle indicates the average position x=0.4340±0.0887 and y=0.4458±0.0927.
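The caption does not define the axes. PR2 plots conventionally use x = G3/(G3+C3) and y = A3/(A3+T3), computed at the third position of four-fold degenerate codon families, with (0.5, 0.5) marking no bias. The sketch below assumes that convention and the standard genetic code; `pr2_point` is a hypothetical helper and may differ from the authors' procedure.

```python
# First two bases of the four-fold degenerate families in the standard code:
# Ala, Gly, Pro, Thr, Val, Arg (CGN), Leu (CTN), Ser (TCN).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}


def pr2_point(cds: str) -> tuple[float, float]:
    """Return (G3/(G3+C3), A3/(A3+T3)) over four-fold degenerate codons."""
    seq = cds.upper().replace("U", "T")
    third = dict.fromkeys("ACGT", 0)
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in third:
            third[codon[2]] += 1

    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    if gc == 0 or at == 0:
        raise ValueError("not enough four-fold degenerate codons to form both ratios")
    return third["G"] / gc, third["A"] / at


if __name__ == "__main__":
    # Five alanine codons ending in G, C, A, T, A.
    print(pr2_point("GCGGCCGCAGCTGCA"))
```

Under this convention, the reported average position (x=0.4340, y=0.4458) lies below 0.5 on both axes.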
Fig. 5.
Plot of ENC versus hydropathicity index and aromaticity score for T. solium genes.
Fig. 6.
Plot of ENC versus protein length for T. solium.
Fig. 7.
Plot of ENC versus intron number for T. solium.