Annotation transfer for genomics: measuring functional divergence in multi-domain proteins - PubMed
Distribution of multi-domain combinations amongst the genomes. The figure shows how often multi-domain fold combinations occur in a number of genomes, and indicates that this occurrence varies greatly. Each row indicates a particular combination of scop fold pairs (using scop 1.39), where a fold pair is defined as two distinct folds occurring in tandem in a protein. Each column represents a different genome, using the four-letter codes in the PartsList system (Qian et al. 2001): Aaeo, Aquifex aeolicus; Aful, Archaeoglobus fulgidus; Bbur, Borrelia burgdorferi; Bsub, Bacillus subtilis; Cele, Caenorhabditis elegans; Cpne, Chlamydia pneumoniae; Ctra, Chlamydia trachomatis; Ecol, Escherichia coli; Hinf, Haemophilus influenzae Rd; Hpyl, Helicobacter pylori; Mthe, Methanobacterium thermoautotrophicum; Mjan, Methanococcus jannaschii; Mtub, Mycobacterium tuberculosis; Mgen, Mycoplasma genitalium; Mpne, Mycoplasma pneumoniae; Phor, Pyrococcus horikoshii; Rpro, Rickettsia prowazekii; Scer, Saccharomyces cerevisiae; Syne, Synechocystis sp.; Tpal, Treponema pallidum. The number in each intersection cell indicates how many times the fold pair occurs in that genome. Only the 20 most common fold pair combinations are shown here; the remainder are shown on the Web site. If the value in a cell is greater than 6, the cell is shaded black; between 3 and 6, gray; and below 3, white. Blank cells show instances in which one of the folds in the pair does not occur in the organism at all (indicated by a value of -1 in the data table on the Web site). The fold assignments are done in a fashion consistent with those in PartsList and associated systems (Gerstein 1997; Lin et al. 2000; Drawid et al. 2001; Harrison et al. 2001; Qian et al. 2001).
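The shading rule in the caption can be sketched as a small lookup. This is only an illustration, not the authors' code: the function name `shade_for_count` and the sample counts are hypothetical, and the sketch assumes that "between 3 and 6" includes both endpoints.

```python
from typing import Optional


def shade_for_count(count: int) -> Optional[str]:
    """Map a fold-pair count to a cell shade; None means a blank cell."""
    if count == -1:
        # -1 marks a blank cell in the data table
        return None
    if count < 0:
        raise ValueError(f"unexpected count: {count}")
    if count > 6:
        return "black"
    if count >= 3:
        # assumption: the 3 to 6 range is inclusive at both ends
        return "gray"
    return "white"


# Made-up counts for one fold pair across four genomes (not data from the figure).
row = {"Ecol": 9, "Scer": 4, "Mgen": 0, "Mjan": -1}
shades = {genome: shade_for_count(n) for genome, n in row.items()}
```

Returning `None` for the -1 sentinel keeps a blank cell distinct from a white one, which corresponds to a count below 3, including 0.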