A Survey Sequence Comparison of Saccharum Genotypes Reveals Allelic Diversity Differences

The sugarcane genome sequencing effort: an overview of the strategy, goals and existing data

2010

Sugarcane is a major feedstock used for biofuel production worldwide. Sugarcane cultivars (Saccharum spp.) derive from interspecific hybridizations made about a century ago by crossing Saccharum officinarum (2n=8x=80) and S. spontaneum (2n=5x=40 to 2n=16x=128). The challenges in a sugarcane genome sequencing project are the size (10 Gb) and the complexity of the genome, which is highly polyploid and aneuploid (2n=

The Challenge of Analyzing the Sugarcane Genome

Frontiers in plant science, 2018

Reference genome sequences have become key platforms for genetics and breeding of the major crop species. Sugarcane is probably the largest crop produced in the world (by weight of crop harvested) but lacks a reference genome sequence. Sugarcane has one of the most complex genomes among crop plants due to its extreme level of polyploidy. The genome of modern sugarcane hybrids includes sub-genomes from two progenitors, with some chromosomes resulting from recombination between these sub-genomes. Advancing DNA sequencing technologies and strategies for genome assembly are making the sugarcane genome more tractable. Advances in long-read sequencing have allowed the generation of a more complete set of sugarcane gene transcripts, which supports transcript profiling in genetic research. The progenitor genomes are being sequenced. A monoploid coverage of the hybrid genome has been obtained by sequencing BAC clones that cover the gene space of the closely related sorghum genome. The co...

A mosaic monoploid reference sequence for the highly complex genome of sugarcane

Nature communications, 2018

Sugarcane (Saccharum spp.) is a major crop for sugar and bioenergy production. Its highly polyploid, aneuploid, heterozygous, and interspecific genome poses major challenges for producing a reference sequence. We exploited colinearity with sorghum to produce a BAC-based monoploid genome sequence of sugarcane. A minimum tiling path of 4660 sugarcane BACs that best covers the gene-rich part of the sorghum genome was selected based on whole-genome profiling, sequenced, and assembled into a 382-Mb single tiling path of high-quality sequence. A total of 25,316 protein-coding gene models are predicted, 17% of which display no colinearity with their sorghum orthologs. We show that the two species, S. officinarum and S. spontaneum, involved in modern cultivars differ by their transposable elements and by a few large chromosomal rearrangements, explaining their distinct genome size and distinct basic chromosome numbers while also suggesting that polyploidization arose in both lineages after t...

Analysis of three sugarcane homo/homeologous regions suggests independent polyploidization events of Saccharum officinarum and Saccharum spontaneum

Genome biology and evolution, 2017

Whole genome duplication has played an important role in plant evolution and diversification. Sugarcane is an important crop with a complex hybrid polyploid genome, for which the process of adaptation to polyploidy is still poorly understood. In order to improve our knowledge about sugarcane genome evolution and the homo/homeologous gene expression balance, we sequenced and analyzed 27 BACs (Bacterial Artificial Chromosome) of sugarcane R570 cultivar, containing the putative single-copy genes LFY (seven haplotypes), PHYC (four haplotypes) and TOR (seven haplotypes). Comparative genomic approaches showed that these sugarcane loci presented a high degree of conservation of gene content and collinearity (synteny) with sorghum and rice orthologous regions, but were invaded by transposable elements (TE). All the homo/homeologous haplotypes of LFY, PHYC and TOR are likely to be functional, since they are all under purifying selection (dN/dS ≪ 1). However, they were found to participate in...
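The purifying-selection criterion quoted above (dN/dS ≪ 1) can be illustrated with a minimal Python sketch. The function name, the tolerance band around 1, and the numeric inputs are assumptions made for illustration; they are not values or methods from the study, which would typically rely on codon-model likelihood tests rather than a fixed cutoff.

```python
def classify_selection(dn: float, ds: float, tolerance: float = 0.1) -> str:
    """Label a gene by its dN/dS ratio (omega).

    dn: nonsynonymous substitutions per nonsynonymous site
    ds: synonymous substitutions per synonymous site
    tolerance: hypothetical band around omega = 1 treated as neutral
    """
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega < 1 - tolerance:
        return "purifying"   # amino-acid changes are selected against
    if omega > 1 + tolerance:
        return "positive"    # amino-acid changes are favoured
    return "neutral"


# Hypothetical haplotype comparison, not data from the paper.
label = classify_selection(dn=0.01, ds=0.20)
print(label)
```

A ratio far below 1, as reported for the LFY, PHYC and TOR haplotypes, means synonymous changes accumulate much faster than amino-acid changes, which is why the haplotypes are considered likely to be functional.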

Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L

Nature genetics, 2018

Modern sugarcanes are polyploid interspecific hybrids, combining high sugar content from Saccharum officinarum with hardiness, disease resistance and ratooning of Saccharum spontaneum. Sequencing of a haploid S. spontaneum, AP85-441, facilitated the assembly of 32 pseudo-chromosomes comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined. The reduction of basic chromosome number from 10 to 8 in S. spontaneum was caused by fissions of 2 ancestral chromosomes followed by translocations to 4 chromosomes. Surprisingly, 80% of nucleotide binding site-encoding genes associated with disease resistance are located in 4 rearranged chromosomes and 51% of those in rearranged regions. Resequencing of 64 S. spontaneum genomes identified balancing selection in rearranged regions, maintaining their diversity. Introgressed S. spontaneum chromosomes in modern sugarcanes are randomly distributed in AP85-441 genome, indicating random recombination among homologs in ...
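The chromosome arithmetic behind these statements can be checked with a short Python sketch. It uses the cytotypes quoted earlier in this listing (2n=8x=80 for S. officinarum; 2n=5x=40 and 2n=16x=128 for S. spontaneum) and the 8 homologous groups of 4 members reported here. The function name is hypothetical.

```python
def basic_chromosome_number(two_n: int, ploidy: int) -> int:
    """Return x, the basic chromosome number, from 2n = ploidy * x."""
    if ploidy <= 0 or two_n % ploidy != 0:
        raise ValueError("2n must be a positive multiple of the ploidy level")
    return two_n // ploidy


# (2n, ploidy) pairs as quoted in the abstracts of this listing.
cytotypes = {
    "S. officinarum 2n=8x=80": (80, 8),
    "S. spontaneum 2n=5x=40": (40, 5),
    "S. spontaneum 2n=16x=128": (128, 16),
}
basic_numbers = {
    name: basic_chromosome_number(two_n, ploidy)
    for name, (two_n, ploidy) in cytotypes.items()
}

# AP85-441 assembly: 8 homologous groups with 4 members each.
pseudo_chromosomes = 8 * 4

for name, x in basic_numbers.items():
    print(f"{name}: x = {x}")
print(f"pseudo-chromosomes: {pseudo_chromosomes}")
```

The result is x = 10 for S. officinarum and x = 8 for both S. spontaneum cytotypes, which matches the reduction of the basic chromosome number from 10 to 8 described in the abstract.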

Target enrichment sequencing of 307 germplasm accessions identified ancestry of ancient and modern hybrids and signatures of adaptation and selection in sugarcane (Saccharum spp.), a "sweet" crop with "bitter" genomes

Plant biotechnology journal, 2018

Sugarcane (Saccharum spp.) is a highly energy-efficient crop primarily for sugar and bio-ethanol production. Sugarcane genetics and cultivar improvement have been extremely challenging largely due to its complex genomes with high polyploidy levels. In this study, we deeply sequenced the coding regions of 307 sugarcane germplasm accessions. Nearly five million sequence variations were cataloged. The average of 98x sequence depth enabled different allele dosages of sequence variation to be differentiated in this polyploid collection. With selected high-quality genome-wide SNPs, we performed population genomic studies and environmental association analysis. Results illustrated that the ancient sugarcane hybrids, S. barberi and S. sinense, and modern sugarcane hybrids are significantly different in terms of genomic compositions, hybridization processes, and their potential ancestry contributors. Linkage disequilibrium (LD) analysis showed a large extent of LD in sugarcane, with 962.4 Kb...
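Why deep coverage helps separate allele dosages can be sketched in Python. This is a simplified illustration, not the genotyping method used in the study: it assumes a fixed ploidy of 10 (a hypothetical value), unbiased read sampling, and a nearest-class rule, whereas real dosage callers model sampling noise explicitly.

```python
def estimate_dosage(alt_reads: int, total_reads: int, ploidy: int) -> int:
    """Estimate how many of `ploidy` chromosome copies carry the alternate allele.

    Simplifying assumption: the alternate-read fraction is an unbiased
    estimate of dosage / ploidy, so the nearest dosage class is chosen.
    """
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("read counts are inconsistent")
    return round(alt_reads / total_reads * ploidy)


PLOIDY = 10   # hypothetical ploidy, for illustration only
DEPTH = 98    # average sequence depth reported in the abstract

# Expected read-count gap between adjacent dosage classes at this depth.
reads_per_dosage_step = DEPTH / PLOIDY

for alt in (10, 30, 49):
    print(alt, "alt reads ->", estimate_dosage(alt, DEPTH, PLOIDY), "copies")
```

Under these assumptions, adjacent dosage classes differ by roughly ten reads at 98x depth. At low depth the classes would differ by only a read or two and could not be told apart reliably.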

A draft chromosome-scale genome assembly of a commercial sugarcane

Scientific Reports, 2022

Sugarcane accounts for a large portion of the world's sugar production. Modern commercial cultivars are complex hybrids of S. officinarum, S. spontaneum, and several other Saccharum species, resulting in an auto-allopolyploid with 8-12 copies of each chromosome. The current genome assembly gold standard is to generate a long read assembly followed by chromatin conformation capture sequencing to scaffold. We used the PacBio RSII and chromatin conformation capture sequencing to sequence and assemble the genome of a South East Asian commercial sugarcane cultivar, known as Khon Kaen 3. The Khon Kaen 3 genome assembled into 104,477 contigs totalling 7 Gb, which scaffolded into 56 pseudochromosomes containing 5.2 Gb of sequence. Genome annotation produced 242,406 genes from 30,927 orthogroups. Aligning the Khon Kaen 3 genome sequence to S. officinarum and S. spontaneum revealed a high level of apparent recombination, indicating a chimeric assembly. This assembly error is explained by high nucleotide identity between S. officinarum and S. spontaneum, where 91.8% of S. spontaneum aligns to S. officinarum at 94% identity. Thus, the subgenomes of commercial sugarcane are so similar that using short reads to correct long PacBio reads produced chimeric long reads. Future attempts to sequence sugarcane must take this information into account.

Sugarcane is an important crop species and is the major source of processed sugar in the world. The name sugarcane does not refer to a single species, but rather to any of several species in the genus Saccharum. Taxonomic classification of these species is difficult, confounded by many years of cultivation and cross breeding, which makes phenotypic classification unreliable, and has been a topic of much debate. It was originally considered that six species exist: S. spontaneum, S. robustum, S. officinarum, S. sinense, S. edule and S. barberi [1]. However, it is now common for S. edule to be excluded from this genus because in situ hybridisation revealed it to be a likely hybrid between S. officinarum and S. robustum, leaving only five species [1]. Only two of these species, S. robustum and S. spontaneum, are considered wild species, and the remaining species are all cultivated [2,3]. Modern commercial cultivars of sugarcane are complex hybrids of S. officinarum as the maternal donor crossed with S. spontaneum and, to a lesser extent, some other species and hybrids (for review see [1,4,5]). Sugarcane is believed to have originated in the South Pacific, but was widely dispersed by early explorers, making it difficult to pinpoint the exact origin. It is believed that S. spontaneum originates from India, but it can be found growing wild from eastern and northern Africa, through the Middle East, to India, China, South East Asia, and through the Pacific to New Guinea. The other wild species, S. robustum, is indigenous to New Guinea and can be found along river banks. It is considered that S. officinarum also most likely originated in New Guinea and was likely derived from S. robustum. Modern commercial cultivars have complex polyploid genomes as a result of many generations of hybridisation.

Genome-wide association study of multiple yield components in a diversity panel of polyploid sugarcane (Saccharum spp.)

Sugarcane (Saccharum spp.) is an important economic crop, contributing up to 80% of sugar and approximately 60% of biofuel globally. To meet the increased demand for sugar and biofuel supplies, it is critical to breed sugarcane cultivars with robust performance in yield components. Dissecting the causal DNA sequence variants is therefore of great importance, as it provides genetic resources and fundamental information for crop improvement. In this study, we evaluated and analyzed nine yield components in a sugarcane diversity panel consisting of 308 accessions primarily selected from the "world collection of sugarcane and related grasses". By genotyping the diversity panel using target enrichment sequencing, we identified a large number of sequence variants. Genome-wide association studies between the markers and traits were conducted with dosages and gene actions taken into consideration. In total, 217 non-redundant markers and 225 candidate genes were identified to be significantly assoc...

Genetic variability among the chloroplast genomes of sugarcane (Saccharum spp) and its wild progenitor species Saccharum spontaneum L

Genetics and Molecular Research, 2014, 13 (2): 3037-3047 (J.-R. Zhu et al.)

A striking characteristic of modern sugarcane is that all sugarcane cultivars (Saccharum spp.) share a common cytoplasm from S. officinarum. To explore the potential value of S. spontaneum cytoplasm, new Saccharum hybrids with an S. spontaneum cytoplasm were developed at the United States Department of Agriculture-Agricultural Research Service, Sugarcane Research Laboratory, through a combination of conventional and molecular breeding approaches. In this study, we analyzed the genetic variability among the chloroplast genomes of four sugarcane cultivars, eight S. spontaneum clones, and three F1 progeny containing an S. spontaneum cytoplasm. Based on the complete chloroplast genome sequence information of two sugarcane cultivars (NCo 310 and SP 80-3280) and five related grass species (barley, maize, rice, sorghum, and wheat), 19 polymerase chain reaction primer pairs were designed targeting various chloroplast DNA (cpDNA) segments with a total length varying from 4781 to 4791 bp. Ten of the 19 cpDNA segments were polymorphic, harboring 14 mutation sites [a 15-nt insertion/deletion (indel), a 5-nt indel, two poly (T) tracts, and 10 single nucleotide polymorphisms]. We demonstrate for the first time that the chloroplast genome of S. spontaneum was maternally inherited. Comparative sequence homology analyses clustered sugarcane cultivars into a distinctive group away from S. spontaneum and its progeny. Three mutation sites with a consistent, yet species-specific, nucleotide composition were found, namely, an A/C transversion and two indels. The genetic variability among cpDNA of sugarcane cultivars and S. spontaneum will be useful information to determine the maternal origin in the Saccharum genus.

Unraveling the Genome of a High Yielding Colombian Sugarcane Hybrid

Frontiers in Plant Science

Recent developments in High Throughput Sequencing (HTS) technologies and bioinformatics, including improved read lengths and genome assemblers, allow the reconstruction of complex genomes with unprecedented quality and contiguity. Sugarcane has one of the most complicated genomes among grasses, with a haploid length of 1 Gbp and a ploidy between 8 and 12. In this work, we present a genome assembly of the Colombian sugarcane hybrid CC 01-1940. Three types of sequencing technologies were combined for this assembly: PacBio long reads, Illumina paired short reads, and Hi-C reads. We achieved a median contig length of 34.94 Mbp and a total genome assembly of 903.2 Mbp. We annotated a total of 63,724 protein coding genes and performed a reconstruction and comparative analysis of the sucrose metabolism pathway. Nucleotide evolution measurements between orthologs with close species suggest that divergence between Saccharum officinarum and Saccharum spontaneum occurred <2 million years ag...