Codon Usage Bias and Determining Forces in Taenia solium Genome

Xing Yang et al. Korean J Parasitol. 2015 Dec.

Abstract

The tapeworm Taenia solium is an important zoonotic parasite of humans that causes substantial economic losses and endangers public health. At present, neither an effective vaccine to prevent infection nor chemotherapy free of side effects has been developed. In this study, codon usage patterns in the T. solium genome were examined across 8,484 protein-coding genes. Neutrality analysis showed that T. solium genes had a narrow GC distribution, and a significant correlation was observed between GC12 and GC3. In the NC plot (ENC, the effective number of codons, vs GC3s), a few genes lay on or close to the expected curve, but most points with low ENC values fell below it, suggesting that mutational bias plays a major role in shaping codon usage. Parity Rule 2 (PR2) plot analysis showed that GC and AT were not used proportionally. We also identified 26 optimal codons in the T. solium genome, all of which ended with either a G or C residue. These optimal codons are likely consistent with tRNAs that are highly expressed in the cell, suggesting that mutational bias and translational selection are probably the driving forces of codon usage bias in the T. solium genome.
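The neutrality analysis mentioned in the abstract compares GC content at codon positions 1 and 2 (GC12) with GC content at position 3 (GC3), gene by gene. A minimal sketch of that per-gene calculation follows. The function name `gc_by_position` is hypothetical, and the code assumes an in-frame coding sequence; it counts every codon, so it yields GC3 rather than GC3s (which is restricted to synonymous third positions).

```python
def gc_by_position(cds):
    """Return (GC12, GC3) as fractions for one in-frame coding sequence.

    Assumption: the sequence starts at codon position 1 and its length
    is a non-zero multiple of 3. All codons are counted.
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    # Slice the sequence into the bases at codon positions 1, 2 and 3.
    positions = [cds[i::3] for i in range(3)]
    gc = [sum(base in "GC" for base in p) / len(p) for p in positions]
    gc12 = (gc[0] + gc[1]) / 2  # mean GC content of positions 1 and 2
    return gc12, gc[2]


gc12, gc3 = gc_by_position("ATGGCCTAA")
print(f"GC12={gc12:.3f} GC3={gc3:.3f}")
```

Plotting GC12 against GC3 for every gene and fitting a regression line gives the neutrality plot; a slope close to 1 is conventionally read as a dominant mutational effect.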

Keywords: Taenia solium; codon usage bias; intron number; mutation bias; translation selection.

Conflict of interest statement

We have no conflict of interest related to this work.

Figures

Fig. 1.

The distribution of GC contents in T. solium genes. The GC content of the 8,484 T. solium genes (shown in blue) is unimodal.

Fig. 2.

Correspondence analysis of relative synonymous codon usage (RSCU) for all 8,484 T. solium genes. (A) This panel shows the distribution of genes on the primary and secondary axes (accounting for 16.7% and 14.1% of the total variation, respectively). The 2 classes of genes (High GC and Low GC) are color coded; the high GC genes are shown in red and the low GC genes are shown in blue. (B) This panel shows the underlying distribution of codons on the same 2 axes. Codons ending with G or C are shown in red, and codons ending with A or U are shown in green.
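Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it over a set of coding sequences. It assumes the standard nuclear genetic code, and the name `rscu` is hypothetical; it is an illustration, not the authors' pipeline.

```python
from collections import Counter
from itertools import product

# Standard genetic code (assumed), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:  # stops, Met and Trp carry no signal
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


print(rscu(["GCTGCTGCCGCA"])["GCT"])  # alanine codons only
```

A value above 1 means the codon is used more often than expected under equal usage. A gene-by-codon RSCU matrix of this kind is the usual input to correspondence analysis.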

Fig. 3.

GC12 vs GC3 and ENC vs GC3s plots for the 8,484 genes of T. solium. (A) GC12 vs GC3 plot (neutrality plot analysis). The regression line is y = 0.5513x + 0.2196, R² = 0.7309, OP = 0.4894. GC3 values ranged from 12.0% to 93.4%. The point where the regression line crosses the diagonal is defined as the optimum point (OP). (B) ENC vs GC3s plot (NC plot); the solid red line indicates the expected ENC. ENC values ranged from 21.5 to 61.0 across genes; GC3s values ranged from 10.8 to 93.3.
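Two quantities in this caption can be reproduced with a few lines of arithmetic. The optimum point is where the regression line meets the diagonal y = x, so x = intercept / (1 - slope). The expected ENC curve is conventionally drawn from Wright's (1990) formula, ENC = 2 + s + 29 / (s² + (1 - s)²) with s = GC3s as a fraction; the source does not state which formula was used, so that choice is an assumption here, as are the function names.

```python
def optimum_point(slope, intercept):
    """x-coordinate where y = slope * x + intercept crosses the diagonal y = x."""
    if slope == 1:
        raise ValueError("a line parallel to the diagonal has no crossing point")
    return intercept / (1 - slope)


def expected_enc(gc3s):
    """Expected ENC under no selection (Wright's formula, assumed).

    gc3s is a fraction between 0 and 1, not a percentage.
    """
    s = gc3s
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# Regression coefficients reported in the caption of panel (A).
print(round(optimum_point(0.5513, 0.2196), 4))
print(expected_enc(0.5))
```

With the caption's slope and intercept, the crossing point rounds to 0.4894, matching the reported OP. Genes whose observed ENC falls well below `expected_enc` at their GC3s are the points described as lying below the expected curve.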

Fig. 4.

Parity Rule 2 (PR2)-bias plot. The red open circle indicates the average position, x = 0.4340 ± 0.0887 and y = 0.4458 ± 0.0927.
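The caption does not define the axes. In the common PR2 convention, which is assumed here, x = G3 / (G3 + C3) and y = A3 / (A3 + T3), counted at the third position of fourfold degenerate codon families, and (0.5, 0.5) means no bias between complementary bases. The sketch below computes one gene's coordinates under that assumption; `pr2_point` and `FOURFOLD_PREFIXES` are hypothetical names.

```python
# First two bases of the fourfold degenerate families in the standard code:
# Ala, Gly, Pro, Thr, Val, plus the four-codon blocks of Arg, Leu and Ser.
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}


def pr2_point(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) for one in-frame coding sequence.

    A coordinate is None when the gene has no usable third positions
    for that base pair.
    """
    cds = cds.upper().replace("U", "T")
    n = dict.fromkeys("ACGT", 0)
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in n:
            n[codon[2]] += 1
    gc = n["G"] + n["C"]
    at = n["A"] + n["T"]
    x = n["G"] / gc if gc else None
    y = n["A"] / at if at else None
    return x, y


print(pr2_point("GCGGCCGCAGCTGCA"))
```

Under this convention, a mean position with both coordinates below 0.5 would indicate that C is used more often than G, and T more often than A, at these third positions.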

Fig. 5.

Plot of ENC versus hydropathicity index and aromaticity score for T. solium genes.
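The caption does not say how the two protein indices were computed. A common choice, assumed here, is the GRAVY score (mean Kyte-Doolittle hydropathy over all residues) for hydropathicity and the fraction of Phe, Trp and Tyr residues for aromaticity. The sketch below follows those definitions; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standard_residues(protein):
    """Keep only the 20 standard residues (drops stops, X and gaps)."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return residues


def gravy(protein):
    """Mean Kyte-Doolittle hydropathy; positive values are more hydrophobic."""
    residues = _standard_residues(protein)
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aromaticity(protein):
    """Fraction of residues that are Phe, Trp or Tyr."""
    residues = _standard_residues(protein)
    return sum(aa in "FWY" for aa in residues) / len(residues)


print(gravy("AI"), aromaticity("FWYA"))
```

Pairing each gene's ENC with these two scores gives the scatter data for a plot of this kind.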

Fig. 6.

Plot of ENC versus protein length for T. solium.

Fig. 7.

Plot of ENC versus intron number for T. solium.
