Intraspecies sequence comparisons for annotating genomes
- Dario Boffelli1,2,
- Claire V. Weer1,2,
- Li Weng1,2,
- Keith D. Lewis1,2,
- Malak I. Shoukry1,2,
- Lior Pachter2,3,
- David N. Keys1,2, and
- Edward M. Rubin1,2,4
- 1 US Dept. of Energy Joint Genome Institute, Walnut Creek, California 94598, USA
- 2 Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- 3 Department of Mathematics, University of California, Berkeley, Berkeley, California 94720, USA
Abstract
Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. We chose the sea squirt, Ciona intestinalis, to explore intraspecies sequence comparisons for genome annotation because of its high rate of allelic polymorphism and its ease of genetic manipulation. A large number of C. intestinalis specimens were collected from four continents, and a set of genomic intervals was amplified, resequenced, and analyzed to determine the mutation rate at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these included the locations of coding sequences as well as a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for annotating the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom. It also raises the possibility that resequencing a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species.
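The core idea of the abstract, scoring each nucleotide by how much it varies among individuals and then looking for stretches with little variation, can be illustrated with a minimal Python sketch. This is not the authors' method: the variability score (fraction of non-majority alleles per alignment column), the function names `site_variability` and `low_variation_windows`, and the `window` and `threshold` parameters are all assumptions chosen for illustration, and the input is assumed to be pre-aligned, uppercase sequences of equal length.

```python
from collections import Counter


def site_variability(alignment):
    """Per-column fraction of non-majority alleles, ignoring gaps and N.

    `alignment` is a list of equal-length, uppercase, pre-aligned sequences
    (an illustrative stand-in for resequenced alleles of one interval).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    scores = []
    for column in zip(*alignment):
        bases = [b for b in column if b in "ACGT"]
        if not bases:
            # No informative bases in this column; treat as invariant.
            scores.append(0.0)
            continue
        majority_count = Counter(bases).most_common(1)[0][1]
        scores.append(1.0 - majority_count / len(bases))
    return scores


def low_variation_windows(scores, window=3, threshold=0.1):
    """Return half-open (start, end) windows whose mean score <= threshold."""
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean <= threshold:
            hits.append((start, start + window))
    return hits


# Toy example: four hypothetical alleles of a six-base interval.
toy_alignment = ["ACGTAC", "ACGTTC", "ACGAAC", "ACGTAC"]
toy_scores = site_variability(toy_alignment)
candidate_windows = low_variation_windows(toy_scores, window=3, threshold=0.1)
```

In this toy input the first three columns are invariant, so the windows that cover mostly those columns are the ones reported as candidates for constraint. A real analysis would need many more individuals and a statistical model of the expected mutation rate rather than a fixed threshold.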
Footnotes
[Supplemental material is available online at www.genome.org. The sequence data from this study were submitted to GenBank under accession nos. AY667278–AY667407. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: S. Fujiwara, A. Gittenberger, K. Heasman, H. Huelvan, D. Jiang, S. Kano, A. Phillippi, A. Sexton, and S. Shimeld.]
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3199704. Article published online ahead of print in November 2004.
4 Corresponding author. E-mail emrubin{at}lbl.gov; fax (510) 486-4229.
- Accepted October 5, 2004.
- Received May 18, 2004.
Cold Spring Harbor Laboratory Press