Phytozome: a comparative platform for green plant genomics

Nucleic Acids Res. 2012 Jan;40(Database issue):D1178-86.

doi: 10.1093/nar/gkr944. Epub 2011 Nov 22.

David M Goodstein et al. Nucleic Acids Res. 2012 Jan.

Abstract

The number of sequenced plant genomes and associated genomic resources is growing rapidly, driven by both an increased focus on plant genomics from funding agencies and the application of inexpensive next-generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while also providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

Figures

Figure 1.

Default view of the Gene Family page for a 17-member core eudicot family. Members are listed according to their order in the tree on the Phytozome home page. The membership class of each gene is indicated in the leftmost column (Supplementary Methods S2). For each member, Gene Page and GBrowse links are provided, as well as links to external databases (if these exist), aliases, symbols and deflines. The synteny view in the right column shows the five upstream and five downstream neighbors of each family member (the family members themselves are rendered as gray icons in the middle of each synteny row). Each syntenic segment is oriented to render family members in the same orientation (consistent with their presumed descent from a common ancestor). Gene icons sharing the same (non-white) color are all members of the same gene family at this node; this can provide syntenic support for the hypothesis of a common ancestor for family members.
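The neighborhood logic described in this caption can be sketched in a few lines. The following is a minimal Python illustration, not Phytozome's actual implementation: the `Gene` record, the function names `synteny_row` and `shared_families`, and the representation of a chromosome as an ordered list of genes are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record used only for this sketch."""
    name: str
    strand: str                  # "+" or "-"
    family: Optional[str] = None  # family id at the node of interest, if any


def synteny_row(chromosome: List[Gene], index: int, flank: int = 5) -> List[Gene]:
    """Return the focal gene with up to `flank` neighbors on each side.

    The row is flipped when the focal gene lies on the "-" strand, so that
    every focal gene is drawn in the same orientation.
    """
    start = max(0, index - flank)
    window = chromosome[start:index + flank + 1]
    if chromosome[index].strand == "-":
        flip = {"+": "-", "-": "+"}
        window = [Gene(g.name, flip[g.strand], g.family) for g in reversed(window)]
    return window


def shared_families(rows: List[List[Gene]], focal_names: Set[str]) -> Set[str]:
    """Families found among the neighbors of at least two rows.

    These correspond to the icons that would share a non-white color.
    """
    counts: Counter = Counter()
    for row in rows:
        families = {g.family for g in row
                    if g.family is not None and g.name not in focal_names}
        counts.update(families)
    return {family for family, n in counts.items() if n >= 2}


if __name__ == "__main__":
    # Toy chromosome: twelve genes, focal gene at position 6 on the minus strand.
    chrom = [Gene(f"g{i}", "-" if i == 6 else "+") for i in range(12)]
    for gene in synteny_row(chrom, 6):
        print(gene.name, gene.strand)
```

Neighbor families shared across several rows are the kind of evidence the caption calls syntenic support; a real implementation would also need to handle scaffold boundaries and tandem duplicates, which this sketch ignores.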

Figure 2.

Family History view of the gene family in Figure 1. All the descendants and ancestors of this core eudicot family (which is highlighted) are visible in the history view. The strict nesting of families is observed, though one needs to remember that one of the Eucalyptus genes in this core eudicot family is an incomplete pledge (Supplementary Methods S2), and is not present in the deeper Embryophyte and Viridiplantae ancestors.

Figure 3.

GBrowse view of the local genomic context of the poplar gene from the family in Figure 1. Primary and alternative transcripts (if present), assembled EST data and related plant peptides are shown aligned against the genome. Not shown are tracks of repetitive regions, GC content and the alignment of ESTs from related species. Interspecies whole genome alignments, displayed in the VISTA tracks, reveal strong genomic sequence conservation in coding regions (which are under selective pressure); this conservation weakens as one considers more distantly related species (e.g. rice-poplar versus the more closely related eucalyptus-poplar VISTA alignments). Displayed gene models are hyperlinked to their respective gene pages.

Figure 4.

Default view of the Gene Page for the Arabidopsis thaliana gene in the family of Figure 1, showing primary transcript information, functional annotations and simplified genomic context. This locus has an alternative transcript (which appears to differ primarily in its 5′-UTR). Note the strong splicing support provided by the BLATX-aligned Arabidopsis lyrata peptide (which is in fact also a member of this family).
