De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweet potato (Ipomoea batatas)

Zhangying Wang et al. BMC Genomics. 2010.

Abstract

Background: The tuberous root of sweet potato is an agriculturally and biologically important organ. Public databases lack sufficient transcriptomic and genomic data for understanding the molecular mechanisms underlying tuberous root formation and development. High-throughput transcriptome sequencing is therefore needed to generate large numbers of transcript sequences from sweet potato root for gene discovery and molecular marker development.

Results: In this study, more than 59 million sequencing reads were generated using Illumina paired-end sequencing technology. De novo assembly yielded 56,516 unigenes with an average length of 581 bp. Based on sequence similarity searches against known proteins, a total of 35,051 (62.02%) genes were identified. Of these annotated unigenes, 5,046 and 11,983 were assigned to Gene Ontology (GO) terms and Clusters of Orthologous Groups (COG) categories, respectively. A search against the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database mapped 17,598 (31.14%) unigenes to 124 KEGG pathways; 11,056 were assigned to metabolic pathways, which were well represented by carbohydrate metabolism and biosynthesis of secondary metabolites. In addition, 4,114 cDNA SSRs (cSSRs) were identified in the unigenes as potential molecular markers. One hundred PCR primer pairs were designed and used to validate amplification and assess polymorphism in genomic DNA pools. Of these, 92 primer pairs amplified successfully in initial screening tests.
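The cSSR identification step can be illustrated with a minimal sketch. The abstract does not state which tool or repeat thresholds were used, so the regex approach and the minimum repeat counts in `MIN_REPEATS` below are assumptions chosen only for illustration, and `find_ssrs` is a hypothetical helper name.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // k))
    return sorted(hits)


example = "TTGC" + "AG" * 7 + "CCGTA" + "GAT" * 5 + "CC"
print(find_ssrs(example))
```

On the example sequence this reports one di-nucleotide and one tri-nucleotide repeat, with 0-based start positions. Compound or imperfect SSRs are not handled by this sketch.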

Conclusion: This study generated a substantial fraction of sweet potato transcript sequences, which can be used to discover novel genes associated with tuberous root formation and development and will make it possible to construct high-density microarrays for further characterization of gene expression profiles during these processes. The thousands of cSSR markers identified in the present study enrich the available molecular markers and will facilitate marker-assisted selection in sweet potato breeding. Overall, these sequences and markers will provide valuable resources for the sweet potato community. These results also suggest that transcriptome analysis based on Illumina paired-end sequencing is a powerful tool for gene discovery and molecular marker development in non-model species, especially those with large and complex genomes.

Figures

Figure 1

Gap distribution of assembled scaffolds and unigenes. Gap distribution (N/size) %: distribution of the gap percentage, i.e. the number of N bases divided by the sequence length.
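The gap percentage defined in this caption (N count divided by sequence length) can be computed as in the short sketch below. The function name `gap_percentage` is hypothetical, and treating lowercase `n` as a gap is an assumption.

```python
def gap_percentage(seq):
    """Percentage of positions in seq that are unresolved bases (N)."""
    if not seq:
        raise ValueError("empty sequence")
    # Count N bases case-insensitively and divide by the total length.
    return 100.0 * seq.upper().count("N") / len(seq)


print(gap_percentage("ACGTNNACGT"))
```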

Figure 2

Assessment of assembly quality. Distribution of uniquely mapped reads across the assembled unigenes.

Figure 3

Comparison of I. batatas unigenes to orthologous A. thaliana coding sequences. (A) The ratio of I. batatas unigene length to A. thaliana ortholog length plotted against I. batatas unigene coverage depth. (B) Total percentage of each A. thaliana ortholog coding sequence covered by all I. batatas unigenes. In total, 502 orthologs were covered by unigenes over more than 80% of their length, and around 5,000 orthologs had coverage between 50% and 80%. Additionally, 27% of orthologs were covered at 20% or less.

Figure 4

Comparison of unigene length between hit and no-hit unigenes. Longer contigs were more likely to have BLAST matches in protein databases. In this study, 79% of unigenes over 500 bp in length had BLAST matches, whereas only 30% of unigenes shorter than 300 bp did.

Figure 5

Characteristics of similarity search of unigenes against the Nr and Swiss-Prot databases. (A) E-value distribution of BLAST hits for each unigene with a cutoff E-value of 1.0E-5 in the Nr database. (B) E-value distribution of BLAST hits for each unigene with a cutoff E-value of 1.0E-5 in the Swiss-Prot database. (C) Similarity distribution of the top BLAST hits for each unigene in the Nr database. (D) Similarity distribution of the top BLAST hits for each unigene in the Swiss-Prot database.

Figure 6

Gene Ontology classification of assembled unigenes. The results are summarized in three main categories: Biological process, Cellular component and Molecular function. In total, 5,046 unigenes with BLAST matches to known proteins were assigned to gene ontology.

Figure 7

Histogram of Clusters of Orthologous Groups (COG) classification. All unigenes were aligned to the COG database to predict and classify possible functions. Out of 27,435 Nr hits, 11,983 sequences were assigned to 25 COG classifications.

Figure 8

Pathway assignment based on KEGG. (A) Classification based on metabolism categories; (B) Classification based on secondary metabolite categories.

Figure 9

Frequency distribution of cSSRs based on motif sequence types. Among the cSSRs found, a total of 160 motif sequence types were identified, of which 4, 10, 30, 57 and 59 types were di-, tri-, tetra-, penta- and hexa-nucleotide repeats, respectively. The AG/CT di-nucleotide repeat was the most abundant motif detected in the cSSRs.
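Labels such as AG/CT reflect a common convention of grouping a motif with its rotations and its reverse complement into one motif type. The sketch below illustrates that convention. Whether the authors grouped motifs exactly this way is an assumption, and `canonical_motif` and `count_classes` are hypothetical helper names.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s))]
    return min(rotations)


def count_classes(k):
    """Number of distinct motif types of length k under this grouping."""
    classes = set()
    for bases in product("ACGT", repeat=k):
        m = "".join(bases)
        # Exclude motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
        if any(k % d == 0 and m == m[:d] * (k // d) for d in range(1, k)):
            continue
        classes.add(canonical_motif(m))
    return len(classes)


print({m: canonical_motif(m) for m in ("AG", "GA", "CT", "TC")})
print(count_classes(2), count_classes(3))
```

Under this grouping AG, GA, CT and TC all fall into a single type, and the maximum possible numbers of di- and tri-nucleotide types are 4 and 10, which agrees with the counts given in the caption.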
