The complete nucleotide sequence of the coffee (Coffea arabica L.) chloroplast genome: organization and implications for biotechnology and phylogenetic relationships amongst angiosperms

Nalapalli Samson et al. Plant Biotechnol J. 2007 Mar.

Abstract

The chloroplast genome sequence of Coffea arabica L., the first sequenced member of the fourth largest family of angiosperms, Rubiaceae, is reported. The genome is 155 189 bp in length, including a pair of inverted repeats of 25 943 bp. Of the 130 genes present, 112 are distinct and 18 are duplicated in the inverted repeat. The coding region comprises 79 protein-coding genes, 29 transfer RNA genes and four ribosomal RNA genes; 18 genes contain introns (three with three exons). Repeat analysis revealed five direct and three inverted repeats of 30 bp or longer with a sequence identity of 90% or more. Comparisons of the coffee chloroplast genome with sequenced genomes of the closely related family Solanaceae indicated that coffee has a portion of rps19 duplicated in the inverted repeat and an intact copy of infA. Furthermore, whole-genome comparisons identified large indels (> 500 bp) in several intergenic spacer regions and introns in the Solanaceae, including the trnE (UUC)-trnT (GGU) spacer, ycf4-cemA spacer, trnI (GAU) intron and rrn5-trnR (ACG) spacer. Phylogenetic analyses based on the DNA sequences of 61 protein-coding genes for 35 taxa, performed using both maximum parsimony and maximum likelihood methods, strongly supported the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids, asterids, eurosids II, and euasterids I and II. Coffea (Rubiaceae, Gentianales) is only the second order sampled from the euasterid I clade. The availability of the complete chloroplast genome of coffee provides regulatory and intergenic spacer sequences for utilization in chloroplast genetic engineering to improve this important crop.
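
The counts quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only figures stated above; the variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract (variable names are illustrative).
GENOME_BP = 155_189
IR_BP = 25_943            # length of ONE inverted repeat copy
TOTAL_GENES = 130
DUPLICATED_IN_IR = 18
PROTEIN_GENES = 79
TRNA_GENES = 29
RRNA_GENES = 4

# The genome holds two IR copies; the rest is single-copy sequence
# (large and small single-copy regions combined).
single_copy_bp = GENOME_BP - 2 * IR_BP

# Distinct genes counted by type should equal total genes minus duplicates.
distinct_by_type = PROTEIN_GENES + TRNA_GENES + RRNA_GENES
distinct_by_total = TOTAL_GENES - DUPLICATED_IN_IR

# Share of the genome occupied by the two IR copies.
ir_fraction = 2 * IR_BP / GENOME_BP

print(single_copy_bp)                       # 103303
print(distinct_by_type, distinct_by_total)  # 112 112
print(f"{ir_fraction:.1%}")                 # 33.4%
```

The two routes to the number of distinct genes agree at 112, and the inverted repeats together account for roughly one third of the genome.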

Figures

Figure 1

Circular gene map of the Coffea arabica chloroplast genome. The thick lines indicate the extent of the inverted repeats (IRa and IRb, 25 943 bp), which separate the genome into small (SSC, 18 137 bp) and large (LSC, 85 166 bp) single-copy regions. Genes on the outside of the map are transcribed in the clockwise direction and genes on the inside of the map are transcribed in the counterclockwise direction. *The rps19 gene is located entirely in the IRb region and partly in the IRa region. Arrows show the location of repeats (for more details on repeats, see Table 3).

Figure 2

Chloroplast genome comparison shown as a percentage identity plot of coffee against four Solanaceae members (Atropa belladonna, Solanum bulbocastanum, Nicotiana tabacum and Solanum lycopersicum), produced with the MultiPipMaker alignment tool. DNA losses are marked with roman numerals and red boxes. I, region within intergenic spacer (IGS) [rps16–trnQ (UUG)]; II, region within IGS [trnE (UUC)–trnT (GGU)]; III, region within IGS (ycf4–cemA); IV, intron [IRb: trnI (GAU)]; V, intron [IRa: trnI (GAU)].
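
To illustrate what a percentage identity plot summarizes, the following is a minimal sketch that computes percent identity in fixed windows along a reference sequence, given two rows of an existing pairwise alignment. It is not the MultiPipMaker algorithm, which performs its own local alignments; the function name `windowed_identity`, the window size and the toy sequences are assumptions made for illustration.

```python
def windowed_identity(ref_aln, other_aln, window=100):
    """Percent identity per non-overlapping window of reference positions.

    Both arguments are rows of the same pairwise alignment ('-' is a gap).
    Columns where the reference has a gap are skipped, so windows follow
    reference coordinates; a gap in the other row counts as a mismatch.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    results = []
    matches = 0
    ref_positions = 0
    for r, o in zip(ref_aln.upper(), other_aln.upper()):
        if r == "-":
            continue  # insertion relative to the reference: no coordinate
        ref_positions += 1
        if r == o:
            matches += 1
        if ref_positions % window == 0:
            results.append(100.0 * matches / window)
            matches = 0
    remainder = ref_positions % window
    if remainder:  # final partial window
        results.append(100.0 * matches / remainder)
    return results


# Toy example: the second sequence lacks the last four reference bases.
print(windowed_identity("ACGTACGTAC", "ACGTAC----", window=5))
# [100.0, 20.0]
```

In such a profile, a large deletion in the compared genome appears as a run of low-identity windows, which is how regions of DNA loss like those boxed in the figure stand out.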

Figure 3

Maximum parsimony tree based on 61 chloroplast protein-coding genes (data are available at http://www.biosci.utexas.edu/IB/faculty/jansen/lab/research/datafiles/index.htm). The single most parsimonious phylogram has a length of 61 797, a consistency index of 0.41 (excluding uninformative characters) and a retention index of 0.58. Numbers above and below the nodes indicate the number of nucleotide substitutions and bootstrap support values, respectively.

Figure 4

Maximum likelihood tree based on 61 chloroplast protein-coding genes. The single maximum likelihood phylogram has a maximum likelihood value of −ln L = 348 679.23765. Numbers at the nodes indicate the bootstrap support values and the branch length scale is shown at the base of the tree.
