Andrzej K Konopka - Profile on Academia.edu

Papers by Andrzej K Konopka

Grand metaphors of biology in the genome era

Computers and Chemistry, 2002

A MAXIMUM ENTROPY PRINCIPLE FOR THE DISTRIBUTION OF LOCAL COMPLEXITY IN NATURALLY OCCURRING NUCLEOTIDE SEQUENCES

Computers and Chemistry, 1992

A maximum entropy principle (MEP) governing the distribution of complexity of short oligonucleotides from large collections of functionally equivalent sequences is presented. The principle is seen to work well in both translated regions (exons and bacterial genes) and introns from various genomes. It also works in cases of sample sequences from various genomes and even a representative sample of the entire GenBank. This suggests that all naturally occurring DNA sequences are likely to follow the MEP described in this report. The linear trend of surprisal as a function of complexity is systematically characterized by remarkably different slope values for introns and translated regions of genes from all eukaryotic genomes studied (primates, rodents, other mammals, other vertebrates, invertebrates, organella, plants and viruses). This fact may be used as a criterion for discriminant analysis.

Theory of degenerate coding and informational parameters of protein coding genes.

Biochimie, 1985

The theory of degenerate coding is presented in a way enabling further application to molecular biology. There are two kinds of redundancy of a degenerate code. The first is due to the excess in codon length and the second to the code degeneracy. If the code is asymmetrically degenerate, the second kind of redundancy can be profitable for control of error rate. This control can be performed just by selective synonymous codon usage. Utilisation of the genetic code is partially influenced by this theoretical possibility. In particular, the degree of error protectivity is well correlated with deviation from equiprobability in synonymous codon usage. The biological significance of this fact is discussed.

DISTAN - A program which detects significant distances between short oligonucleotides

CABIOS, 1987

We present an algorithm to detect distances between oligonucleotides in large collections of nucleic acid sequences. The ratios of actual frequencies of occurrence of short oligonucleotides at a given distance to the corresponding expected frequencies were analyzed in four categories of DNA sequences (eukaryotic exons, bacterial genes, introns and non-Alu repeated DNAs). Three-base periodic occurrences (independent of the reading frame) of all combinations of mononucleotides and repeats of all dinucleotides were characteristic of protein coding regions. This was also the case with the majority of trinucleotides (including translational stop signals) in these regions. Mirror-symmetric trinucleotides (except GCG and CGC) displayed a strong tendency to be repeated with a two-base period in introns. Some two- and three-base periodic motifs were also observed in repeated DNAs. The possible biological implications of outstanding three-base periodicities in bacterial genes and eukaryotic exons are discussed.
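The observed-to-expected ratio mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the DISTAN program itself: the function names are hypothetical, and the expected frequency is assumed here to be the product of the two oligonucleotides' overall frequencies (an independence model), which the abstract does not specify.

```python
def count_starts(seq, oligo):
    """Number of positions in seq at which oligo starts (overlaps allowed)."""
    return sum(1 for i in range(len(seq) - len(oligo) + 1) if seq.startswith(oligo, i))


def distance_ratio(seqs, a, b, d):
    """Observed/expected ratio for oligo `a` at i and oligo `b` at i + d (d >= 1).

    Assumption: expected counts come from an independence model built on
    the overall frequencies of `a` and `b` in the same collection.
    """
    observed = 0
    n_a = n_b = 0
    slots_a = slots_b = slots_pair = 0
    for s in seqs:
        n = len(s)
        n_a += count_starts(s, a)
        n_b += count_starts(s, b)
        slots_a += max(n - len(a) + 1, 0)
        slots_b += max(n - len(b) + 1, 0)
        # Last start of `a` for which `b` at distance d still fits in the sequence.
        last = min(n - len(a), n - d - len(b))
        for i in range(last + 1):
            if s.startswith(a, i) and s.startswith(b, i + d):
                observed += 1
        slots_pair += max(last + 1, 0)
    if not (n_a and n_b and slots_pair):
        return None  # ratio undefined when either oligo never occurs
    expected = (n_a / slots_a) * (n_b / slots_b) * slots_pair
    return observed / expected
```

Under these assumptions, a ratio well above 1 at distances that are multiples of three would correspond to the three-base periodicity the abstract reports for coding regions.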

A dictionary of programs for sequence analysis (Appendix 2)

The 3′-orf protein of human immunodeficiency virus shows structural homology with the phosphorylation domain of human interleukin-2 receptor and the ATP-binding site of the protein kinase family

FEBS Letters, 1987

The primary amino acid sequence within a stretch of 25 residues (positions 91-116) of the middle portion of the 3'-orf protein (p27(3')-orf) of the human immunodeficiency virus (HIV) shares structural homology with a highly charged region within the intracytoplasmic phosphorylation domain of human interleukin-2 receptor (IL-2R) and the ATP-binding site of the catalytic subunit of cAMP-dependent protein kinase (cAMP-PK) and other members of the protein kinase family. Comparison of the predicted secondary structure within this region of p27(3')-orf with the phosphorylation domain of human IL-2R and the ATP-binding region of the phospho-kinase family of proteins suggests that the 3'-orf protein could serve homologous function(s).

Unusual frequencies of certain alternating purine-pyrimidine runs in natural DNA sequences: relation to Z-DNA

This is Biology: The Science of the Living World (BOOK REVIEW)

Computers and Chemistry 26 (2002) 543–545, 2002

With a good probability every student of biology already knows that biology has very little (if anything at all) in common with physics. Yet it is not easy to explain what is the exact nature of their differences. Mayr’s book is a well-researched monograph devoted to the methodological identity of the science of biology and to a demonstration that methods of biology differ from methods of physics.

All we need is truth

Computational Biology and Chemistry, 2004

Editors’ note concerning an EST analysis pipeline by Zhu et al. (2006)

Computational Biology and Chemistry, 2008

Measuring quality of research: what do they mean and why they mean so?

Computational Biology and Chemistry, 2004

A fixed-point alignment technique for detection of recurrent and common sequence motifs associated with biological features

Bioinformatics, 1988

A fixed-point alignment analysis technique is presented which is designed to locate common sequence motifs in collections of proteins or nucleic acids. Initially a program aligns a collection of sequences by a common sequence pattern or known biological feature. The common pattern or feature (fixed-point) may be a user-specified sequence string or a known sequence position like mRNA start site, which may be taken directly from the annotated feature table of GenBank. Once all alignment markers are located, the sequences are scanned for occurrences of given oligomers within a specified span both upstream and downstream of the fixed-point. The occurrences may then be plotted as a function of the position relative to the fixed-point, displayed as an actual sequence alignment or selectively summarized via various program options. Applications of the technique are discussed.
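The scanning step described in this abstract can be sketched in a few lines of Python. This is an illustration only, not the published program: the function name is hypothetical, the fixed-point is assumed to be the first occurrence of a user-specified marker string, and sequences lacking the marker are simply skipped.

```python
from collections import Counter


def fixed_point_profile(seqs, marker, oligo, span):
    """Tally starts of `oligo` by position relative to the fixed-point.

    Assumption: the fixed-point of each sequence is the first occurrence
    of `marker`; only starts within `span` bases on either side are counted.
    """
    profile = Counter()
    for s in seqs:
        anchor = s.find(marker)
        if anchor < 0:
            continue  # no alignment marker in this sequence
        lo = max(anchor - span, 0)
        hi = min(anchor + span, len(s) - len(oligo))
        for i in range(lo, hi + 1):
            if s.startswith(oligo, i):
                profile[i - anchor] += 1  # negative offsets are upstream
    return profile
```

The resulting counts per relative position are the kind of data that could then be plotted against distance from the fixed-point.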

DISTAN — A program which detects significant distances between short oligonucleotides

Bioinformatics, 1987

The missense errors in protein can be controlled by selective synonymous codon usage at the level of transcription

Biochimie, 1985

In the cases of the 6-fold degenerate residues and the stop signal, selective codon usage at the level of transcription can account for a 10-20% variation in their mistranslation rate. For all other residues, the mistranslation rate is dependent upon the degree of degeneracy only, but not upon the pattern of synonymous codon usage.

Computational Molecular Biology: From Sequence Research to Software Development

Computational Biology and Chemistry / Computers & Chemistry, 1993

Sequences and Codes: Fundamentals of Biomolecular Cryptology

The chapter reviews classical methods of cryptanalysis and their application to nucleic acids and protein sequence analysis.

Theory of degenerate coding and informational parameters of protein coding genes.

The theory of degenerate coding is presented in a way enabling further application to molecular biology. There are two kinds of redundancy of a degenerate code. The first is due to the excess in codon length and the second to the code degeneracy. If the code is asymmetrically degenerate, the second kind of redundancy can be profitable for control of error rate. This control can be performed just by selective synonymous codon usage. Utilization of the genetic code is partially influenced by this theoretical possibility. In particular, the degree of error protectivity is well correlated with deviation from equiprobability in synonymous codon usage. The biological significance of this fact is discussed.

Surrogacy theory and models of convoluted organic systems

Proteomics 2007, 7, 846–856, 2007

The theory of surrogacy is briefly outlined as one of the conceptual foundations of systems biology that has been developed for the last 30 years in the context of Hertz-Rosen modeling relationship. Conceptual foundations of modeling convoluted (biologically complex) systems are briefly reviewed and discussed in terms of current and future research in systems biology. New as well as older results that pertain to the concepts of modeling relationship, sequence of surrogacies, cascade of representations, complementarity, analogy, metaphor, and epistemic time are presented together with a classification of models in a cascade. Examples of anticipated future applications of surrogacy theory in life sciences are briefly discussed.

A Maximum Entropy Principle for the distribution of Local Complexity in naturally occurring nucleotide sequences

A maximum entropy principle (MEP) governing the distribution of complexity of short oligonucleotides from large collections of functionally equivalent sequences is presented. The principle is seen to work well in both translated regions (exons and bacterial genes) and introns from various genomes.

Selected dreams and nightmares about computational biology

Plausible Classification Codes and local Compositional Complexity of nucleotide sequences

OSTI.GOV Conference, 31 December 1993; OSTI ID: 37524

Genomic DNA fragments are initially represented by sequences of symbols from an elementary alphabet (such as the one that contains A, C, G, T, and N symbols for nucleotides). Although in theory scientists could consider alphabets of symbols that represent secondary and tertiary structures of biopolymers, the sequences of monomers are the only solid data they have thus far. Even in the presence of sequencing errors, the reliability of sequence representation is far greater than that of the higher-order structure representations available thus far. For this reason the author will focus on sequence representations only. Sequences can be classified according to a description of their biological function ("extrasequential" factual data). The results of such classification are collections of Functionally Equivalent Sequences (abbreviated as FESs in plural and as FES in singular). In order for FESs to be suitable for statistical analyses, they should not contain sequences that are identical or almost identical to each other. As far as computational biology is concerned, functions are represented by FESs. The challenge is to find patterns that could serve as indicators of a given sequence belonging to a given FES and not to other FESs. Before the search for function-associated patterns can be performed, patterns have to be defined and their (not necessarily statistical) significance has to be evaluated.

SYSTEMS BIOLOGY: aspects related to genomics

Cooper DN, ed. Nature Encyclopedia of the Human Genome. Vol. 5. London: Nature Publishing Group Reference, 2003:459-465.

Systems biology is concerned with functionally complex systems (such as an organism, immune system or ecosystem) that can be approximately described by several complementary models but cannot be adequately represented by a single model. It addresses foundational issues of the whole of biology (such as system modelling, fractionation, integration, and emergence). On the other hand, it is also devoted to specific techniques for data integration and interpretation. This brings systems biology right into the middle of the old controversy between reductionism and holism.

Sequences and Codes: Fundamentals of Biomolecular Cryptology

Biocomputing: Informatics and genome projects, 1994