Comparison of DNA sequences with protein sequences
Genomics. 1997 Nov 15;46(1):24-36.
doi: 10.1006/geno.1997.4995.
- PMID: 9403055
W R Pearson et al. Genomics. 1997.
Abstract
The FASTA package of sequence comparison programs has been expanded to include FASTX and FASTY, which compare a DNA sequence to a protein sequence database, translating the DNA sequence in three frames and aligning the translated DNA sequence to each sequence in the protein database, allowing gaps and frameshifts. Also new are TFASTX and TFASTY, which compare a protein sequence to a DNA sequence database, translating each sequence in the DNA database in six frames and scoring alignments with gaps and frameshifts. FASTX and TFASTX allow only frameshifts between codons, while FASTY and TFASTY allow substitutions or frameshifts within a codon. We examined the performance of FASTX and FASTY using different gap-opening, gap-extension, frameshift, and nucleotide substitution penalties. In general, FASTX and FASTY perform equivalently when query sequences contain 0-10% errors. We also evaluated the statistical estimates reported by FASTX and FASTY. These estimates are quite accurate, except when an out-of-frame translation produces a low-complexity protein sequence. We used FASTX to scan the Mycoplasma genitalium, Haemophilus influenzae, and Methanococcus jannaschii genomes for unidentified or misidentified protein-coding genes. We found at least 9 new protein-coding genes in the three genomes and at least 35 genes with potentially incorrect boundaries.
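The translation step the abstract describes (three forward frames for a DNA query, six frames for each DNA database entry) can be sketched in a few lines of Python. This is an illustrative sketch only, not the FASTX/FASTY implementation: the function names (`translate`, `three_frames`, `six_frames`, `reverse_complement`) and the example sequence are assumptions introduced here, the standard genetic code is assumed, and no alignment or scoring is performed. The last few lines show why frameshift-tolerant alignment matters: a single deleted base moves the downstream protein signal into a different reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str, frame: int = 0) -> str:
    """Translate one reading frame (offset 0, 1, or 2); unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def three_frames(dna: str) -> list[str]:
    """Forward-strand translations, as used for a DNA query."""
    return [translate(dna, f) for f in range(3)]


def six_frames(dna: str) -> list[str]:
    """Forward and reverse-complement translations, as used for a DNA database entry."""
    return three_frames(dna) + three_frames(reverse_complement(dna))


if __name__ == "__main__":
    # Hypothetical example sequence (not taken from the paper).
    dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    print(three_frames(dna))

    # Simulate a sequencing error: delete the base at index 12.
    # The protein continues in frame 2 instead of frame 0.
    mutated = dna[:12] + dna[13:]
    print(translate(mutated, 0))
    print(translate(mutated, 2))
```

An aligner that only compares each frame independently would see the protein split across two frames after the deletion; allowing frameshifts in the alignment, as the abstract describes, lets one alignment follow the protein across that break.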