AL2CO: calculation of positional conservation in a protein sequence alignment
Comparative Study
J Pei et al. Bioinformatics. 2001 Aug.
Abstract
Motivation: Amino acid sequence alignments are widely used in the analysis of protein structure, function and evolutionary relationships. Proteins within a superfamily usually share the same fold and possess related functions. These structural and functional constraints are reflected in the alignment conservation patterns. Positions of functional and/or structural importance tend to be more conserved. Conserved positions are usually clustered in distinct motifs surrounded by sequence segments of low conservation. Poorly conserved regions might also arise from imperfections in multiple alignment algorithms and thus indicate possible alignment errors. Quantifying conservation by assigning a conservation index to each aligned position makes motif detection more convenient. Mapping these conservation indices onto a protein spatial structure helps to visualize spatial conservation features of the molecule and to predict functionally and/or structurally important sites. Analysis of conservation indices could be a useful tool for detecting potentially misaligned regions and could aid in improving multiple alignments.
Results: We developed a program that calculates a conservation index at each position in a multiple sequence alignment using several methods. Specifically, amino acid frequencies at each position are estimated and the conservation index is calculated from these frequencies. We use both unweighted frequencies and frequencies weighted using two different strategies. Three conceptually different approaches (entropy-based, variance-based and matrix score-based) are implemented in the algorithm to define the conservation index. By calculating conservation indices for 35,522 positions in 284 alignments from the SMART database, we demonstrate that different methods yield highly correlated conservation indices (correlation coefficient greater than 0.85). Conservation indices show statistically significant correlation between sequentially adjacent positions i and i + j, where j < 13, and averaging the indices over a window of three positions is optimal for motif detection. Positions with gaps are substantially less conserved. We compare the conservation properties of the SMART alignments and FSSP structural alignments to those of the ClustalW alignments. The results suggest that conservation indices should be a valuable tool for alignment quality assessment and might be used as an objective function for refinement of multiple alignments.
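The entropy-based approach with unweighted frequencies, followed by the three-position window averaging, can be illustrated with a minimal Python sketch. This is not the AL2CO C code. The function names, the choice of sum of f ln f as the index, the handling of gaps (ignored when counting) and the score given to all-gap columns are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column):
    """Unweighted amino acid frequencies in one column; gaps are ignored (assumption)."""
    residues = [c for c in column.upper() if c in AMINO_ACIDS]
    if not residues:
        return {}
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


def entropy_conservation(freqs):
    """Assumed index: sum of f * ln(f). Zero for an invariant column,
    increasingly negative as the column becomes more variable."""
    if not freqs:
        # Assumption: an all-gap column gets the lowest possible score,
        # that of 20 equally frequent amino acids.
        return -math.log(len(AMINO_ACIDS))
    return sum(f * math.log(f) for f in freqs.values())


def conservation_indices(alignment):
    """One conservation index per column of an alignment given as equal-length strings."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [
        entropy_conservation(column_frequencies("".join(col)))
        for col in zip(*alignment)
    ]


def window_average(values, window=3):
    """Average each index over a centred window, truncated at the alignment ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "AGF", "A-G"]  # hypothetical toy input
    raw = conservation_indices(toy_alignment)
    print(raw)
    print(window_average(raw, window=3))
```

In this sketch a higher (less negative) value means a more conserved position. Replacing `column_frequencies` with a weighted estimate, or `entropy_conservation` with a variance-based or matrix score-based function, would leave the rest of the pipeline unchanged.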
Availability: The C code of the AL2CO program and its pre-compiled versions for several platforms as well as the details of the analysis are freely available at ftp://iole.swmed.edu/pub/al2co/.