A Biologist's View of the Drosophila Genome Annotation Assessment Project

Michael Ashburner
Department of Genetics, University of Cambridge, Cambridge, England; European Molecular Biology Laboratory (EMBL) European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK

Large-scale genomic sequencing projects are acts of faith: the faith that, given a sequence of many millions of As, Ts, Gs, and Cs, we have the analytical tools to make sense of it. This faith can only be justified if the results of sequence analysis are tested continually against reality. The Genome Annotation Assessment Project (GASP) experiment, which took place in May 1999, was one such test (see, in this issue, Birney and Durbin 2000; Gaasterland et al. 2000; Henikoff and Henikoff 2000; Krogh 2000; Ohler 2000; Parra et al. 2000; Reese et al. 2000a,b; Salamov and Solovyev 2000). GASP compared an interpretation of a 2.9-Mb sequence made by a mix of computation and human analysis over a period of 2 years (Ashburner et al. 1999) with interpretations made by wholly computational procedures over a period of 6 weeks. Interestingly, GASP was an experiment within an experiment. The overt experiment was to assess the performance of a number of different analytical tools; the covert experiment was to assess how such an assessment could be done.

Computational Analysis in Conjunction with Experimental Analysis

The 2.9-Mb test sequence, determined by the Berkeley _Drosophila_ Genome Project (BDGP), is known as the _Adh_ region of the genome of _Drosophila melanogaster_, named after the gene encoding the enzyme alcohol dehydrogenase about which the region is centered. Ashburner and colleagues (e.g., Woodruff and Ashburner 1979a,b) had begun studying the _Adh_ gene of _D. melanogaster_ in the late 1970s; at the time it was the only gene in _Drosophila_ for which one could select both loss-of-function mutations (by their survival on unsaturated alcohols) and reversions to wild type (by their survival on ethanol). The work on the _Adh_ gene gradually recruited genes in its neighborhood and, over a period of …