The Repeat Pattern Toolkit (RPT): analyzing the structure and evolution of the C. elegans genome
- PMID: 7584377
P Agarwal et al. Proc Int Conf Intell Syst Mol Biol. 1994.
Abstract
Over 3.6 million bases of DNA sequence from chromosome III of C. elegans have been determined. The availability of this extended region of contiguous sequence has allowed us to analyze the nature and prevalence of repetitive sequences in the genome of a eukaryotic organism with a high gene density. We have assembled a Repeat Pattern Toolkit (RPT) to analyze the patterns of repeats occurring in DNA. The tools include identifying significant local alignments (utilizing both two-way and three-way alignments), dividing the set of alignments into connected components (signifying repeat families), computing evolutionary distance between repeat family members, constructing minimum spanning trees from the connected components, and visualizing the evolution of the repeat families. Over 7000 families of repetitive sequences were identified. The size of the families ranged from isolated pairs to over 1600 segments of similar sequence. Approximately 12.3% of the analyzed sequence participates in a repeat element.
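The graph steps named in the abstract (connected components as repeat families, then a minimum spanning tree per family) can be illustrated with a minimal Python sketch. This is not the RPT implementation: the alignment step is skipped entirely, the segment names and distances are hypothetical toy values, and the function names `repeat_families` and `minimum_spanning_forest` are made up for illustration. Union-find and Kruskal's algorithm are used here only as one standard way to compute these structures.

```python
from collections import defaultdict


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def repeat_families(alignments):
    """Group segments into families (connected components).

    alignments: iterable of (segment_a, segment_b, distance) tuples,
    one per significant pairwise alignment (hypothetical input format).
    """
    parent = {}
    for a, b, _ in alignments:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb  # merge the two components
    families = defaultdict(set)
    for seg in parent:
        families[find(parent, seg)].add(seg)
    # Sort for a deterministic result.
    return sorted((sorted(f) for f in families.values()), key=lambda f: f[0])


def minimum_spanning_forest(alignments):
    """Kruskal's algorithm: yields one minimum spanning tree per family."""
    parent = {}
    tree_edges = []
    for a, b, d in sorted(alignments, key=lambda e: e[2]):
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:  # edge joins two separate trees, so keep it
            parent[ra] = rb
            tree_edges.append((a, b, d))
    return tree_edges


# Hypothetical toy data: segment pairs with made-up distances.
alignments = [
    ("s1", "s2", 0.10),
    ("s2", "s3", 0.25),
    ("s1", "s3", 0.40),
    ("s4", "s5", 0.05),
]
families = repeat_families(alignments)
mst = minimum_spanning_forest(alignments)
print(families)
print(mst)
```

In this toy input, the segments fall into two families, and the spanning forest drops the most distant `s1`/`s3` link because `s1` and `s3` are already connected through `s2`.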
Similar articles
- Identifying repeat domains in large genomes.
  Zhi D, Raphael BJ, Price AL, Tang H, Pevzner PA. Genome Biol. 2006;7(1):R7. doi: 10.1186/gb-2006-7-1-r7. Epub 2006 Jan 31. PMID: 16507140. Free PMC article.
- Repetitive-DNA elements are similarly distributed on Caenorhabditis elegans autosomes.
  Surzycki SA, Belknap WR. Proc Natl Acad Sci U S A. 2000 Jan 4;97(1):245-9. doi: 10.1073/pnas.97.1.245. PMID: 10618403. Free PMC article.
- 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans.
  Wilson R, Ainscough R, Anderson K, Baynes C, Berks M, Bonfield J, Burton J, Connell M, Copsey T, Cooper J, et al. Nature. 1994 Mar 3;368(6466):32-8. doi: 10.1038/368032a0. PMID: 7906398.
- High evolutionary turnover of satellite families in Caenorhabditis.
  Subirana JA, Albà MM, Messeguer X. BMC Evol Biol. 2015 Oct 5;15:218. doi: 10.1186/s12862-015-0495-x. PMID: 26438045. Free PMC article.
- Sequence structure of hidden 10.4-base repeat in the nucleosomes of C. elegans.
  Salih F, Salih B, Trifonov EN. J Biomol Struct Dyn. 2008 Dec;26(3):273-82. doi: 10.1080/07391102.2008.10531241. PMID: 18808193.
Cited by
- Automated de novo identification of repeat sequence families in sequenced genomes.
  Bao Z, Eddy SR. Genome Res. 2002 Aug;12(8):1269-76. doi: 10.1101/gr.88502. PMID: 12176934. Free PMC article.
- Identification of transposable elements using multiple alignments of related genomes.
  Caspi A, Pachter L. Genome Res. 2006 Feb;16(2):260-70. doi: 10.1101/gr.4361206. Epub 2005 Dec 14. PMID: 16354754. Free PMC article.
- A clustering method for repeat analysis in DNA sequences.
  Volfovsky N, Haas BJ, Salzberg SL. Genome Biol. 2001;2(8):RESEARCH0027. doi: 10.1186/gb-2001-2-8-research0027. Epub 2001 Aug 1. PMID: 11532211. Free PMC article.
- Retrotransposons in Plant Genomes: Structure, Identification, and Classification through Bioinformatics and Machine Learning.
  Orozco-Arias S, Isaza G, Guyot R. Int J Mol Sci. 2019 Aug 6;20(15):3837. doi: 10.3390/ijms20153837. PMID: 31390781. Free PMC article. Review.
- De novo repeat classification and fragment assembly.
  Pevzner PA, Tang H, Tesler G. Genome Res. 2004 Sep;14(9):1786-96. doi: 10.1101/gr.2395204. PMID: 15342561. Free PMC article.