Konstantin Kozlov | Saint Petersburg State Polytechnical University (SPBSPU)

Papers by Konstantin Kozlov

Dynamical Modeling of the Core Gene Network Controlling Flowering Suggests Cumulative Activation From the FLOWERING LOCUS T Gene Homologs in Chickpea

Frontiers in Genetics, Nov 20, 2018

Methodology for Building of Complex Workflows with Prostak Package and Isimbios

Impact of Negative Feedbacks on De Novo Pyrimidines Biosynthesis in Escherichia coli

International Journal of Molecular Sciences, Mar 2, 2023

Quality Control of Human Pluripotent Stem Cell Colonies by Computational Image Analysis Using Convolutional Neural Networks

International Journal of Molecular Sciences

Human pluripotent stem cells are promising for a wide range of research and therapeutic purposes. Their maintenance in culture requires the deep control of their pluripotent and clonal status. A non-invasive method for such control involves day-to-day observation of the morphological changes, along with imaging colonies, with the subsequent automatic assessment of colony phenotype using image analysis by machine learning methods. We developed a classifier using a convolutional neural network and applied it to discriminate between images of human embryonic stem cell (hESC) colonies with “good” and “bad” morphological phenotypes associated with a high and low potential for pluripotency and clonality maintenance, respectively. The training dataset included the phase-contrast images of hESC line H9, in which the morphological phenotype of each colony was assessed through visual analysis. The classifier showed a high level of accuracy (89%) in phenotype prediction. By training the classi...

Quantitative variation and evolution of spatially explicit morphogen expression in Drosophila

bioRxiv (Cold Spring Harbor Laboratory), Aug 13, 2017

Robustness in development allows for the accumulation of neutral genetically based variation in expression, and here will be termed 'genetic stochasticity'. This largely neutral variation is potentially important for both evolution and complex disease phenotypes. However, it has generally only been investigated as variation exhibited in the response to large genetic perturbations. In addition, work on variation in gene expression has similarly generally been limited to being spatial, or quantitative, but because of technical restrictions not both. Here we bridge these gaps by investigating replicated quantitative spatial gene expression using rigorous statistical models, in different genotypes, sexes, and species (Drosophila melanogaster and D. simulans). Using this type of quantitative approach with developmental data allows for effective comparison among conditions, including health versus disease. We apply this approach to the morphogenetic furrow, a wave of differentiation that sweeps across the developing eye disc. Within the morphogenetic furrow, we focus on four conserved morphogens, hairy, atonal, hedgehog, and Delta. Hybridization chain reaction quantitatively measures spatial gene expression, co-staining for all four genes simultaneously and with minimal effort. We find considerable variation in the spatial expression pattern of these genes in the eye between species, genotypes, and sexes. We also find that there has been evolution of the regulatory relationship between these genes. Lastly, we show that the spatial interrelationships of these genes evolved between species in the morphogenetic furrow. This is essentially the first 'population genetics of development' as we are able to evaluate wild type differences in spatial and quantitative gene expression at the level of genotype, species and sex.

Modeling of Flowering Time in Vigna radiata with Artificial Image Objects, Convolutional Neural Network and Random Forest

Plants

Flowering time is an important target for breeders in developing new varieties adapted to changing conditions. In this work, a new approach is proposed in which the SNP markers influencing time to flowering in mung bean are selected as important features in a random forest model. The genotypic and weather data are encoded in artificial image objects, and a model for flowering time prediction is constructed as a convolutional neural network. The model uses weather data for only a limited time period of 5 days before and 20 days after planting and is capable of predicting the time to flowering with high accuracy. The most important factors for model solution were identified using saliency maps and a Score-CAM method. Our approach can help breeding programs harness genotypic and phenotypic diversity to more effectively produce varieties with a desired flowering time.

Solution of Mixed-Integer Optimization Problems in Bioinformatics with Differential Evolution Method

Non-linear regression models for time to flowering in wild chickpea combine genetic and climatic factors

Background: Accurate prediction of crop flowering time is required for reaching maximal farm efficiency. Several models developed to accomplish this goal are based on deep knowledge of plant phenology, requiring large investment for every individual crop or new variety. Mathematical modeling can be used to make better use of more shallow data and to extract information from it with higher efficiency. Cultivars of chickpea, Cicer arietinum, are currently being improved by introgressing wild C. reticulatum biodiversity with very different flowering time requirements. More understanding is required for how flowering time will depend on environmental conditions in these cultivars developed by introgression of wild alleles. Results: We built a novel model for flowering time of wild chickpeas collected at 21 different sites in Turkey and grown in 4 distinct environmental conditions over several different years and seasons. We propose a general approach, in which the analytic forms of depend...

Additional file 1 of Non-linear regression models for time to flowering in wild chickpea combine genetic and climatic factors

Additional file 1 contains information on SNP-based groups, climatic data for these groups, and details on the grammatical evolution method. (PDF 634 kb)

Differential Evolution Approach to Detect Recent Admixture

The genetic structure of human populations is extraordinarily complex and of fundamental importance to studies of anthropology, evolution, and medicine. As increasingly many individuals are of mixed origin, there is an unmet need for tools that can infer multiple origins. Misclassification of such individuals can lead to incorrect and costly misinterpretations of genomic data, primarily in disease studies and drug trials. We present an advanced tool to infer ancestry that can identify the biogeographic origins of highly mixed individuals. reAdmix can incorporate an individual's knowledge of ancestors (e.g. having some ancestors from Turkey or a Scottish grandmother). reAdmix is an online tool available at

Combined Optimization Technique for Biological Data Fitting

Motivation: Modern molecular biology has massive amounts of quantitative data already at its disposal. A crucially important problem in gaining closer insight into the mechanisms of development is to reduce the complexity of finding the parameters of mathematical models by fitting to experimental data. Results: The new Combined Optimization Technique (COT) showed high accuracy in reconstructing the phenomenological parameters of equations and saved about 30% of the most time-consuming operations in computation, which makes COT an attractive instrument for processing large amounts of experimental data of various kinds. Availability: available on request from the authors.

Simulation Model for Time to Flowering with Climatic and Genetic Inputs for Wild Chickpea

Agronomy

Accurate prediction of flowering time helps breeders to develop new varieties that can achieve maximal efficiency in a changing climate. A methodology was developed for the construction of a simulation model for flowering time in which a function for daily progression of the plant from one phenological phase to the next is obtained in analytic form by stochastic minimization. The resulting model demonstrated high accuracy on the recently assembled data set of wild chickpeas. The inclusion of genotype-by-climatic-factor interactions accounted for 77% of accuracy in terms of root mean square error. It was found that the impact of minimal temperature is positively correlated with the longitude at primary collection sites, while the impact of day length is negatively correlated. This was interpreted as adaptation of accessions from highlands to lower temperatures and those from lower-elevation river valleys to shorter days. We used bootstrap resampling to construct an ensemble of models, ...

Forecasting the Timing of Floral Initiation in Wild Chickpeas under Climate Change

Biophysics

Precise prediction of the timing of floral initiation helps breeders create new varieties that can achieve maximum efficiency under the influence of a changing climate. A previously constructed model was used to compare the impact of daily weather parameters on the flowering time of wild varieties of chickpeas that were collected in different geographic locations in Turkey. We found that plants from the high altitude areas, unlike plant samples from lower altitudes, can adapt to lower temperatures and longer days. Forecasts of changes in time to flowering in the studied wild chickpea varieties were made with the model and climate change predictions, using MarkSim software to generate daily weather data for Ankara. The mean thresholds for the sowing-to-flowering period for the 2020–2039, 2040–2059, and 2060–2080 time periods shifted for 21 combinations of the scenarios of plant growth and development and plant collecting sites, accounting for approximately half of the 40 cases, thereby suggesting a moderate effect of climate change on flowering time in the studied varieties.

A dual role for DNA-binding by Runt in activation and repression of sloppy paired transcription

Molecular Biology of the Cell

This work investigates the role of DNA-binding by Runt in regulating the sloppy-paired-1 (slp1) gene, and in particular two distinct cis-regulatory elements that mediate regulation by Runt and other pair-rule transcription factors during Drosophila segmentation. We find that a DNA-binding defective form of Runt is ineffective at repressing both the distal (DESE) and proximal (PESE) early stripe elements of slp1 and is also compromised for DESE-dependent activation. The function of Runt-binding sites in DESE is further investigated using site-specific transgenesis and quantitative imaging techniques. When DESE is tested as an autonomous enhancer, mutagenesis of the Runt sites results in a clear loss of Runt-dependent repression but has little to no effect on Runt-dependent activation. Notably, mutagenesis of these same sites in the context of a reporter gene construct that also contains the PESE enhancer results in a significant reduction of DESE-dependent activation as well as the ...

The differential evolution entirely parallel method for model adaptation in systems biology

Biophysics

The differential evolution entirely parallel method has been developed to enable the identification of unknown parameters of mathematical models by minimization of the deviation of the solution from experimental data. The method is implemented in free open-source software that is downloadable from the Internet. The results of processing of test functions showed that the accuracy of the method is comparable to that of the three best algorithms from CEC-2014. The method has been successfully used in a number of real biological problems.
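As a rough illustration of the idea of fitting model parameters by minimizing the deviation from data, the sketch below runs a plain, serial differential evolution (the textbook rand/1/bin variant) on a toy linear model. It is not the entirely parallel method described in the abstract and is not taken from the authors' software; the function name `differential_evolution`, the control settings, and the data points are all hypothetical choices made for this example.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.7, cr=0.9,
                           generations=200, seed=0):
    """Minimize `objective` inside box `bounds` with serial rand/1/bin DE.

    Illustrative only: names and default settings are assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    dim = len(bounds)
    # Start from a population spread uniformly over the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for k in range(dim):
                if rng.random() < cr or k == forced:
                    value = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    lo, hi = bounds[k]
                    value = min(max(value, lo), hi)  # clip to the box
                else:
                    value = pop[i][k]
                trial.append(value)
            trial_cost = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


# Hypothetical "experimental" data generated from y = 2 * t + 1.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [1.0, 3.0, 5.0, 7.0, 9.0]


def deviation(params):
    """Sum of squared differences between the model and the data."""
    slope, intercept = params
    return sum((slope * t + intercept - y) ** 2
               for t, y in zip(times, observed))


best_params, best_cost = differential_evolution(
    deviation, [(-10.0, 10.0), (-10.0, 10.0)])
print(best_params, best_cost)
```

In a real setting, `deviation` would wrap the numerical solution of the model of interest, and the evaluations of trial vectors within a generation are the natural place to introduce parallelism.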

Combined Sequenced-Based Model of the Drosophila Gap Gene Network

Advanced Techniques in Biology & Medicine, 2015

Modeling of Flowering Time in Vigna radiata with Approximate Bayesian Computation

Agronomy

Flowering time is an important target for breeders in developing new varieties adapted to changing conditions. A new approach is proposed that uses Approximate Bayesian Computation with Differential Evolution to construct a pool of models for flowering time. The functions for daily progression of the plant from planting to flowering are obtained in analytic form and depend on daily values of climatic factors and genetic information. The resulting pool of models demonstrated high accuracy on the dataset. Day length, solar radiation and temperature had a large impact on the model accuracy, while the impact of precipitation was comparatively small and the impact of maximal temperature showed the largest variation. The model pool was used to investigate the behavior of accessions from the dataset in case of temperature increase by 0.05–6.00°. The time to flowering changed differently for different accessions. The Pearson correlation coefficient between the SNP value and the change in time ...

Dynamical climatic model for time to flowering in Vigna radiata

BMC Plant Biology

Background: Phenology data collected recently for about 300 accessions of Vigna radiata (mungbean) is an invaluable resource for investigation of impacts of climatic factors on plant development. Results: We developed a new mathematical model that describes the dynamic control of time to flowering by daily values of maximal and minimal temperature, precipitation, day length and solar radiation. We obtained model parameters by adaptation to the available experimental data. The models were validated by cross-validation and used to demonstrate that the phenology of adaptive traits, like flowering time, is strongly predicted not only by local environmental factors but also by plant geographic origin and genotype. Conclusions: Of the local environmental factors, maximal temperature appeared to be the most critical factor determining how faithfully the model describes the data. The models were applied to forecast time to flowering of accessions grown in Taiwan in future years 2020–2030.

Simulation of Soybean Phenology with the Use of Artificial Neural Networks
