Elena Czeizler | Aalto University
Papers by Elena Czeizler
Int. J. Unconv. Comput., 2018
The (abstract) Tile Assembly Model (aTAM) is a mathematical paradigm for the study and algorithmic design of DNA self-assembly systems. It employs so-called DNA tiles, which are abstractions of experimentally achievable DNA nanostructure complexes with similar inter-matching behaviours. To this day, there are about half a dozen different experimental implementations of DNA tiles and their subsequent algorithmic assembly into larger complexes, see e.g. Reif et al. 2012. In order to provide further insight into the assembly process, the aTAM model has been extended to a kinetic counterpart (kTAM). Although there is an abundance of different variants of the abstract model, e.g., stage, step, hierarchical, temperature-k, signal-passing, etc. (see e.g. Patitz 2012), numerical simulations of the kinetic counterpart have been performed only for a few types of these systems. This might be due to the fact that the numerical models and simulations of kTAM were almost exclusivel...
Theoretical Computer Science, 2017
The (abstract) Tile Assembly Model (aTAM) is a mathematical paradigm for the study and algorithmic design of DNA self-assembly systems. It employs so-called DNA tiles, which are abstractions of experimentally achievable DNA nanostructure complexes with similar inter-matching behaviours. To this day, there are about half a dozen different experimental implementations of DNA tiles and their subsequent algorithmic assembly into larger complexes, see e.g. Reif et al. 2012. In order to provide further insight into the assembly process, the aTAM model has been extended to a kinetic counterpart (kTAM). Although there is an abundance of different variants of the abstract model, e.g., stage, step, hierarchical, temperature-k, signal-passing, etc. (see e.g. Patitz 2012), numerical simulations of the kinetic counterpart have been performed only for a few types of these systems. This might be due to the fact that the numerical models and simulations of kTAM were almost exclusively implemented using classical stochastic simulation algorithm frameworks, which are not designed for capturing models with a theoretically unbounded number of species. In this paper we introduce an agent- and rule-based modelling approach for kTAM, and its implementation on NFsim, one of the available platforms for this type of modelling. We show not only how the modelling of kTAM can be implemented, but we also explore the advantages of this modelling framework for kinetic simulations of kTAM and the ease with which such models can be updated and modified. We present numerical comparisons both with classical numerical simulations of kTAM and between four different kinetic variants of the TAM model, all implemented in NFsim as stand-alone rule-based models.
1. Introduction. Recent advances in DNA-based nano-technology have opened the way towards the systematic engineering of inexpensive, nucleic-acid based nano-scale...
Physica Medica, 2020
In this study we trained a deep neural network model for female pelvis organ segmentation using data from several sites without any personal data sharing. The goal was to assess its prediction power compared with the model trained in a centralized manner. Methods: Varian Learning Portal (VLP) is a distributed machine learning (ML) infrastructure enabling privacy-preserving research across hospitals from different regions or countries, within the framework of a trusted consortium. Such a framework is relevant when there is a high level of trust among the participating sites, but legal restrictions do not allow actual data sharing between them. We trained an organ segmentation model for the female pelvic region using the synchronous data distributed framework provided by the VLP. Results: The prediction performance of the model trained using the federated framework offered by VLP was on the same level as the performance of the model trained in a centralized manner where all training data was pulled together in one centre. Conclusions: VLP infrastructure can be used for GPU-based training of a deep neural network for organ segmentation for the female pelvic region. This organ segmentation instance is particularly difficult due to the high variation in the organs' shape and size. Being able to train the model using data from several clinics can help, for instance, by exposing the model to a larger range of data variations. The VLP framework enables such a distributed training approach without sharing protected health information.
Lecture Notes in Computer Science, 2009
Elevated temperatures cause proteins in living cells to misfold. They start forming larger and larger aggregates that can eventually lead to the cell’s death. The heat shock response is an evolutionarily well-conserved cellular response to massive protein misfolding and it is driven by the need to keep the level of misfolded proteins under control. We consider in this paper
One approach to modelling complex biological systems is to start from an abstract representation of the biological process and then to incorporate more details regarding its reactions or reactants through an iterative refinement process. The refinement should be done so as to ensure the preservation of the numerical properties of the model, such as its numerical fit and validation. Such approaches are well established in software engineering: starting from a formal specification of the system, one refines it step-by-step towards an implementation that is guaranteed to satisfy a number of logical properties. We introduce here the concepts of (quantitative) data refinement and process refinement of a biomolecular, reaction-based model. We choose as a case study a recently proposed model for the heat shock response and refine it to include some details of its acetylation-induced control. Although the refinement process produces a substantial increase in the number of kinetic parameters and variables, the methodology we propose preserves all the numerical properties of the model with a minimal computational effort.
Theoretical Computer Science, 2010
When representing DNA molecules as words, it is necessary to take into account the fact that a word u encodes basically the same information as its Watson-Crick complement θ(u), where θ denotes the Watson-Crick complementarity function. Thus, an expression which involves only a word u and its complement can still be considered as a repeating sequence. In this context, we define and investigate the properties of a special class of primitive words, called pseudo-primitive words relative to θ or simply θ-primitive words, which cannot be expressed as such repeating sequences. For instance, we prove the existence of a unique θ-primitive root of a given word, and we give some constraints forcing two distinct words to share their θ-primitive root. Also, we present an extension of the well-known Fine and Wilf theorem, for which we give an optimal bound.
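As an informal illustration of the θ-primitivity notion described in this abstract, the following Python sketch searches for the shortest word t such that a given word can be written as t followed by copies of t or θ(t). It rests on assumptions that are not stated in the abstract: θ is modelled as the reverse complement over the alphabet {A, C, G, T}, and a "repeating sequence" is taken to be a word in t{t, θ(t)}*. The function names are hypothetical and this is not the authors' algorithm.

```python
# Assumption: θ is modelled as the reverse complement on the DNA alphabet.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def theta(word: str) -> str:
    """Reverse complement of a DNA word (assumed model of θ)."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def theta_primitive_root(word: str) -> str:
    """Return the shortest t such that word lies in t{t, θ(t)}*.

    Assumption: this brute-force search over block lengths is only a
    sketch of the definition, not an optimized procedure.
    """
    if not word:
        raise ValueError("word must be nonempty")
    n = len(word)
    for k in range(1, n + 1):
        if n % k:
            continue  # block length must divide the word length
        t = word[:k]
        t_theta = theta(t)
        blocks = (word[i:i + k] for i in range(0, n, k))
        if all(b == t or b == t_theta for b in blocks):
            return t
    return word  # not reached: k == n always succeeds


def is_theta_primitive(word: str) -> bool:
    """A word is θ-primitive when it is its own θ-primitive root."""
    return theta_primitive_root(word) == word
```

Under these assumptions, the word ACGCGT is primitive in the classical sense (it is not a power of a shorter word), yet it is not θ-primitive, since it equals ACG followed by θ(ACG) = CGT.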
From Logic Systems to Smart Sensors and Actuators, 2012
PLoS ONE, 2014
DNA microarray technologies are used extensively to profile the expression levels of thousands of genes under various conditions, yielding extremely large data-matrices. Thus, analyzing this information and extracting biologically relevant knowledge becomes a considerable challenge. A classical approach for tackling this challenge is to use clustering (also known as one-way clustering) methods where genes (or respectively samples) are grouped together based on the similarity of their expression profiles across the set of all samples (or respectively genes). An alternative approach is to develop biclustering methods to identify local patterns in the data. These methods extract subgroups of genes that are co-expressed across only a subset of samples and may feature important biological or medical implications. In this study we evaluate 13 biclustering and 2 clustering (k-means and hierarchical) methods. We use several approaches to compare their performance on two real gene expression data sets. For this purpose we apply four evaluation measures in our analysis: (1) we examine how well the considered (bi)clustering methods differentiate various sample types; (2) we evaluate how well the groups of genes discovered by the (bi)clustering methods are annotated with similar Gene Ontology categories; (3) we evaluate the capability of the methods to differentiate genes that are known to be specific to the particular sample types we study and (4) we compare the running time of the algorithms. In the end, we conclude that as long as the samples are well defined and annotated, the contamination of the samples is limited, and the samples are well replicated, biclustering methods such as Plaid and SAMBA are useful for discovering relevant subsets of genes and samples.
Theoretical Computer Science, 2006
Parallel communicating Watson-Crick automata systems were introduced in [E. Czeizler, E. Czeizler, Parallel communicating Watson-Crick automata systems, in: Z. Ésik, Z. Fülöp (Eds.), Proc. Automata and Formal Languages, Dobogókő, Hungary, 2005, pp. 83-96] as possible models of DNA computations. This combination of Watson-Crick automata and parallel communicating systems comes as a natural extension due to the new developments in DNA manipulation techniques. It is already known, see [D. Kuske, P. Weigel, The ...], that for Watson-Crick finite automata, the complementarity relation plays no active role. However, this is not the case when considering parallel communicating Watson-Crick automata systems. In this paper we prove that non-injective complementarity relations increase the accepting power of these systems. We also prove that although Watson-Crick automata are equivalent to two-head finite automata, this equivalence is not preserved when comparing parallel communicating Watson-Crick automata systems and multi-head finite automata.
Theoretical Computer Science, 2005
Although Makanin proved the problem of satisfiability of word equations to be decidable, the general structure of solutions is difficult to describe. In particular, Hmelevskii proved that the set of solutions of xyz = zvx cannot be described using only finitely many parameters, contrary to the case of equations in three unknowns. In this paper we give a short, elementary proof of Hmelevskii's result.
Theoretical Computer Science, 2008
In this paper we investigate the maximal size of chains of equations on three or four words such that every time we add a new equation the set of solutions strictly decreases. We also investigate how large systems of pairwise independent or pairwise non-equivalent equations exist accepting purely non-periodic solutions.
Theoretical Computer Science, 2009
In this paper, we investigate the open question, formulated in 1983 by Culik II and Karhumäki, asking whether there exist independent systems of three word equations over three unknowns admitting non-periodic solutions. In particular, we answer the above-mentioned question negatively for systems in which one of the unknowns occurs at most six times. That is, we show that such systems admit only periodic solutions or they are not independent.
Theoretical Computer Science, 2009
Watson-Crick automata are finite state automata working on double-stranded tapes, introduced to investigate the potential of DNA molecules for computing. In this paper, we continue the investigation of descriptional complexity of Watson-Crick automata initiated by Păun et al. [A. Păun, M. Păun, State and transition complexity of Watson-Crick finite automata, in: G. Ciobanu, G. Paun (Eds.), Fundamentals of Computation Theory, FCT'99, in: LNCS, vol. 1684, ...]. In particular, we show that any finite language, as well as any unary regular language, can be recognized by a Watson-Crick automaton with only two, and respectively three, states. Also, we formally define the notion of determinism for these systems. Contrary to the case of non-deterministic Watson-Crick automata, we show that, for deterministic ones, the complementarity relation plays a major role in the acceptance power of these systems.