Mihaela Breaban | Universitatea Alexandru Ioan Cuza Iasi

Papers by Mihaela Breaban

A New Scheme of Using Inference Inside Evolutionary Computation Techniques to Solve CSPs

2006 Eighth International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, 2006

There are two major ways to solve constraint satisfaction problems (CSPs): inference approaches and search algorithms [4]. Inference methods derive and record new information in order to make the problem easier to solve. Search algorithms seek a solution in the space of ...

Incorporating Inference into Evolutionary Algorithms for Max-CSP

Lecture Notes in Computer Science, 2006

There are two major ways to solve constraint satisfaction problems (CSPs): inference approaches and search algorithms [1]. Inference approaches derive and record new information in order to make the problem easier to solve. Search algorithms seek a solution in the space of ...

Evolutionary Computation in Constraint Satisfaction

The chapter presents some of the techniques based on Evolutionary Computation paradigms for solving constraint satisfaction problems. Two hybrid approaches are detailed, both based on the idea of using the heuristics extracted from an inference algorithm inside evolutionary computation paradigms. The effect of combining inference with randomized search was studied by exploiting the adaptable inference levels offered by the Mini-Bucket Elimination algorithm. Tests conducted on binary CSPs against a Branch and Bound algorithm show that systematic search benefits more from inference than the randomized search performed by evolutionary computation paradigms. However, on hard CSP instances the Branch and Bound algorithm requires higher levels of inference, which imply a much greater computational cost, in order to compete with evolutionary computation methods.

Performance Evaluation of Ant Colony Systems for the Single-Depot Multiple Traveling Salesman Problem

Lecture Notes in Computer Science, 2015

Derived from the well-known Traveling Salesman problem (TSP), the multiple-Traveling Salesman problem (multiple-TSP) with single depot is a straightforward generalization: several salesmen located in a given city (the depot) need to visit a set of interconnected cities, such that each city is visited exactly once (by a single salesman) while the total cost of their tours is minimized. Designed for shortest path problems and with proven efficiency for TSP, Ant Colony Systems (ACS) are a natural choice for multiple-TSP as well. Although several variations of ant algorithms for multiple-TSP are reported in the literature, there is no clear evidence on their comparative performance. The contribution of this paper is twofold: it provides a benchmark for single-depot-multiple-TSP with reported optima and performs a thorough experimental evaluation of several variations of the ACS on this problem.
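
As a minimal sketch of the problem definition in the abstract above (not the paper's benchmark or any of its ACS variants), the following Python evaluates a candidate single-depot multiple-TSP solution: every tour starts and ends at the depot, each other city must appear exactly once, and the objective is the summed tour length. The coordinates, the function names and the Euclidean cost are illustrative assumptions.

```python
import math

def tour_length(tour, coords, depot=0):
    """Length of depot -> tour cities -> depot under Euclidean distance."""
    path = [depot] + list(tour) + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def total_cost(tours, coords, depot=0):
    """Total cost of a multiple-TSP solution; raises if a city is missed or repeated."""
    visited = sorted(c for tour in tours for c in tour)
    expected = [c for c in range(len(coords)) if c != depot]
    if visited != expected:
        raise ValueError("each non-depot city must be visited exactly once")
    return sum(tour_length(t, coords, depot) for t in tours)

# Hypothetical instance: depot at the origin, four cities on the axes.
coords = [(0, 0), (3, 0), (0, 4), (-3, 0), (0, -4)]
solution = [[1, 2], [3, 4]]  # one list of cities per salesman
cost = total_cost(solution, coords)
```

Each of the two tours here is a 3-4-5 triangle through the depot, so the validity check and the cost can be verified by hand.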

Shaping Up Clusters with PSO

2008 10th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, 2008

This paper presents a method for enhancing the performance of current clustering algorithms; the method is based on Particle Swarm Optimization techniques. Namely, a pre-processing step aims at bringing "closer" objects which are likely to belong to the same cluster, while ...

Nonlinear Feature Extraction in a Logarithmic Space with Evolutionary Algorithms

Annals of West University of Timisoara - Mathematics, 2013

The current paper presents a method to deliver nonlinear projections of a data set that discriminate between existing labeled groups of data items. Inspired by traditional linear Projection Pursuit and Linear Discriminant Analysis, the new method seeks nonlinear combinations of attributes as polynomials that maximize Fisher's criterion. The search for the monomials in a polynomial is conducted in a logarithmic space in order to reduce computational complexity. The selection of monomials and the optimization of the weights that lead to the nonlinear projection are performed with a multi-modal Genetic Algorithm hybridized with Differential Evolution. By alleviating the drawbacks that stem from the linearity assumptions in traditional Projection Pursuit, the new method could gain wide applicability in both unsupervised and supervised data analysis.
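
A small sketch may help to see why a logarithmic space is convenient here: the logarithm of a monomial x1^a1 * x2^a2 is the linear form a1*log(x1) + a2*log(x2), so exponents behave like ordinary linear weights. The code below is an illustration under stated assumptions, not the paper's method: it uses the common two-class form of Fisher's criterion (squared difference of means over summed variances), hypothetical positive-valued data, and it simply compares two fixed exponent vectors instead of running a Genetic Algorithm or Differential Evolution.

```python
import math
from statistics import mean, pvariance

def monomial_log(x, exponents):
    """log of prod(x_i ** a_i), computed as a linear form in log-space (needs x_i > 0)."""
    return sum(a * math.log(xi) for xi, a in zip(x, exponents))

def fisher_criterion(proj_a, proj_b):
    """Two-class Fisher criterion for 1-D projections: between-class over within-class scatter."""
    between = (mean(proj_a) - mean(proj_b)) ** 2
    within = pvariance(proj_a) + pvariance(proj_b)
    return between / within

# Hypothetical data: class A lies near x*y = 1, class B near x*y = 4.
class_a = [(0.5, 2.0), (1.0, 1.1), (2.0, 0.5), (4.0, 0.26)]
class_b = [(0.5, 8.0), (1.0, 4.2), (2.0, 2.0), (4.0, 0.98)]

def score(exponents):
    """Fisher criterion of the monomial with the given exponents on the two classes."""
    return fisher_criterion([monomial_log(p, exponents) for p in class_a],
                            [monomial_log(p, exponents) for p in class_b])

single = score((1, 0))   # monomial x: the classes share the same x values
product = score((1, 1))  # monomial x*y: the classes separate clearly
```

On this toy data the monomial x alone cannot discriminate the classes, while x*y scores far higher; a search procedure would explore exponent vectors like these.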

PSO aided k-means clustering

Proceedings of the 13th annual conference on Genetic and evolutionary computation - GECCO '11, 2011

Clustering is a fundamental and hence widely studied problem in data analysis. In a multi-objective perspective, this paper combines principles from two different clustering paradigms: the connectivity principle from density-based methods is integrated into the partitional clustering approach. The standard k-Means algorithm is hybridized with Particle Swarm Optimization. The new method (PSO-kMeans) benefits from both a local and a global

Optimized Ensembles for Clustering Noisy Data

Lecture Notes in Computer Science, 2010

Clustering analysis is an important step towards getting insight into new data. Ensemble procedures have been designed in order to obtain improved partitions of a data set. Previous work in the domain, mostly empirical, shows that accuracy and a limited diversity are manda- ...

Unsupervised feature weighting with multi niche crowding genetic algorithms

Proceedings of the 11th Annual conference on Genetic and evolutionary computation - GECCO '09, 2009

Mihaela Breaban, Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi, Romania, pmihaela@infoiasi.ro; Henri Luchian, Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi, Romania, hluchian@infoiasi.ro ...

A Genetic Clustering Algorithm by Monomial Projection Pursuit

2012 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, 2012

This paper proposes a new method to identify interesting structures in data based on the projection pursuit methodology. Past work reported in the literature uses projection pursuit methods as a means to visualize high-dimensional data, or to identify linear combinations of attributes that reveal grouping tendencies or outliers. The framework of projection pursuit is generally formulated as an optimization problem aiming at finding projection axes that minimize/maximize a projection index. With regard to identifying interesting structure, the existing approaches suffer from obvious limitations: linear models are not able to capture more general structures in data like circular/curved clusters or any structure that is the result of a polynomial/nonlinear generative model. This paper extends linear projection pursuit to nonlinear projections while preserving the general methodology employed in the search for projections. In addition, an algorithmic framework based on multi-modal genetic algorithms is proposed in order to deal with the large search space and to allow for the use of non-differentiable projection indices. Experiments conducted on synthetic data demonstrate the ability of the new approach to identify clusters of various shapes that are otherwise undetectable with linear projection pursuit or popular clustering methods like k-Means.
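
The limitation of linear projections on circular clusters mentioned in the abstract above can be illustrated with a toy example. This is a sketch under assumptions, not the paper's algorithm or data: two hypothetical concentric rings are projected once onto the x axis (a linear projection) and once with the polynomial x^2 + y^2 (a sum of two monomials), and the value ranges of the two groups are compared.

```python
import math

def ring(radius, n):
    """n points evenly spaced on a circle of the given radius, centred at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

def ranges_overlap(a, b):
    """True if the value ranges of two 1-D samples intersect."""
    return min(a) <= max(b) and min(b) <= max(a)

def poly(p):
    """Polynomial projection x^2 + y^2, i.e. the squared distance from the origin."""
    return p[0] ** 2 + p[1] ** 2

inner, outer = ring(1.0, 12), ring(3.0, 12)

# Linear projection onto the x axis: the inner ring falls inside the outer ring's range.
linear_overlap = ranges_overlap([x for x, _ in inner], [x for x, _ in outer])

# Polynomial projection: inner points map near 1, outer points near 9.
poly_overlap = ranges_overlap([poly(p) for p in inner], [poly(p) for p in outer])
```

The two rings overlap along the x axis but occupy disjoint ranges under the polynomial projection, which is the kind of structure a nonlinear projection index can expose.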

Guiding users within trust networks using swarm algorithms

2009 IEEE Congress on Evolutionary Computation, 2009

This paper is concerned with a problem in information organization and retrieval within Web communities. Most work in this domain is focused on reputation-based systems which exploit the experience gathered by previous users in order to evaluate resources at the community level. The current research focuses on a slightly different approach: a personalized evaluation system whose goal is to build a flexible and easy way to manage resources in a personalized manner. The functionality of such a model comes from local ...

Evolutionary Computation in Constraint Satisfaction

New Achievements in Evolutionary Computation, 2010

Source: New Achievements in Evolutionary Computation, Book edited by: Peter Korosec, ISBN 978-953-307-053-7, pp. 318, February 2010, INTECH, Croatia, downloaded from SCIYO.COM

A unifying criterion for unsupervised clustering and feature selection

Pattern Recognition, 2011

Exploratory data analysis methods are essential for getting insight into data. Identifying the most important variables and detecting quasi-homogeneous groups of data are problems of interest in this context. Solving such problems is a difficult task, mainly due to the unsupervised nature of the underlying learning process. Unsupervised feature selection and unsupervised clustering can be successfully approached as optimization problems

Using support vector regression to estimate sonic log distributions: A case study from the Anadarko Basin, Oklahoma

Journal of Petroleum Science and Engineering, 2013

In the petroleum industry, the compressional acoustic or sonic log (DT) is commonly used as a predictor because it responds to changes in porosity or compaction which, in turn, are further used to estimate formation (sonic) porosity, to map abnormal pore-fluid pressure, or to carry out petrophysical studies. Despite its intrinsic capabilities, the sonic log is not routinely recorded during well logging. We propose using a method belonging to the class of supervised machine learning algorithms, Support Vector Regression (SVR), to synthesize missing compressional acoustic or sonic (DT) logs when only common logs (e.g., natural gamma ray, GR, or deep resistivity, REID) are available. Our approach involves three steps: (1) supervised training of the model; (2) confirmation and validation of the model by blind-testing the results in wells containing both the predictor (GR, REID) and the target (DT) values used in the supervised training; and (3) application of the predicted model to wells containing the predictor data and obtaining the synthetic (simulated) DT log. SVR methodology offers two advantages over traditional deterministic methods: strong nonlinear approximation capabilities and good generalization effectiveness. These result from the use of kernel functions and from the structural risk minimization principle behind SVR. Unlike linear regression techniques, SVR does not overpredict mean values and thereby preserves original data variability. SVR also copes well with the uncertainty associated with the data, the immense size of the data and the diversity of the data types. A case study from the Anadarko Basin, Oklahoma, on estimating the presence of abnormally pressurized pore-fluid zones by using synthesized DT values, is presented. The results are promising and encouraging.

Tackling the Bi-criteria Facet of Multiple Traveling Salesman Problem with Ant Colony Systems

The single-depot multiple TSP (SD-MTSP) is a simple extension of the standard TSP, in which more than one salesman is allowed to visit the set of interconnected cities, such that each city is visited exactly once (by a single salesman) and the total cost of the traveled subtours is minimized. Although Ant Colony Systems (ACSs) are a natural choice for shortest-path problems, with TSP at their core, the application of ACS to this straightforward extension has not been properly explored. The reasons may lie in the bi-criteria nature of the problem (shortest cost versus balanced subtours) and the lack of dedicated benchmarks exposing optimal solutions. This paper proposes and evaluates, from a bi-criteria perspective, several multiobjective ACSs to tackle SD-MTSP when two objectives need to be simultaneously optimized: minimizing the total cost of traveled subtours while achieving balanced subtours. Experiments are conducted to investigate the efficiency of the algorithms in a multi-objective setting.
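
The bi-criteria view described in the abstract above can be sketched as follows. This is an illustration under assumptions rather than the paper's formulation: the abstract does not say how balance is measured, so imbalance is taken here as the difference between the longest and the shortest subtour, and the candidate subtour costs are hypothetical. The code scores each candidate on both objectives and keeps the Pareto-optimal ones.

```python
def objectives(subtour_costs):
    """Return (total cost, imbalance); imbalance = longest minus shortest subtour (an assumption)."""
    return sum(subtour_costs), max(subtour_costs) - min(subtour_costs)

def dominates(f, g):
    """Pareto dominance for minimisation: f is no worse everywhere and better somewhere."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

def pareto_front(points):
    """Keep the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical subtour costs for four candidate solutions with three salesmen each.
candidates = [[10, 10, 10], [4, 4, 18], [9, 9, 14], [12, 12, 12]]
scores = [objectives(c) for c in candidates]
front = pareto_front(scores)
```

The cheapest candidate is also the least balanced, so it shares the front with the perfectly balanced but costlier one; neither can be preferred without weighting the two objectives.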

PSO Aided k-Means Clustering: Introducing Connectivity in k-Means

Clustering is a fundamental and hence widely studied problem in data analysis. In a multi-objective perspective, this paper combines principles from two different clustering paradigms: the connectivity principle from density-based methods is integrated into the partitional clustering approach. The standard k-Means algorithm is hybridized with Particle Swarm Optimization. The new method (PSO-kMeans) benefits from both a local and a global view on data and alleviates some drawbacks of the k-Means algorithm; thus, it is able to spot types of clusters which are otherwise difficult to obtain (elongated shapes, non-similar volumes). Our experimental results show that PSO-kMeans improves the performance of standard k-Means in all test cases and performs at least comparably to state-of-the-art methods in the worst case. PSO-kMeans is robust to outliers. This comes at a cost: a preprocessing step for finding the nearest neighbors of each data item is required, which increases the initial linear complexity of k-Means to quadratic complexity.
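
The quadratic cost of the nearest-neighbor preprocessing mentioned in the abstract above can be made concrete with a brute-force search. The abstract does not describe how the neighbors are actually found, so the function below is an assumed, simplest-possible version on hypothetical points; it also counts distance evaluations to show the n(n-1) growth.

```python
import math

def nearest_neighbors(points, k):
    """Brute-force k nearest neighbours of every point; also counts distance evaluations."""
    evaluations = 0
    neighbors = []
    for i, p in enumerate(points):
        dists = []
        for j, q in enumerate(points):
            if i != j:
                dists.append((math.dist(p, q), j))
                evaluations += 1
        # Sort by distance (ties broken by index) and keep the k closest indices.
        neighbors.append([j for _, j in sorted(dists)[:k]])
    return neighbors, evaluations

# Hypothetical 2-D data: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (9, 0)]
nbrs, evals = nearest_neighbors(points, k=1)
```

With five points the search performs 5 * 4 = 20 distance evaluations; doubling the number of points roughly quadruples that count.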

Evolving Ensembles of Feature Subsets towards Optimal Feature Selection for Unsupervised and Semi-supervised Clustering

The work in unsupervised learning centered on clustering has been extended with new paradigms to address the demands raised by real-world problems. In this regard, unsupervised feature selection has been proposed to remove noisy attributes that could mislead the clustering procedures. Additionally, semi-supervision has been integrated within existing paradigms because some background information usually exists in the form of a reduced number of similarity/dissimilarity constraints. In this context, the current paper investigates a method to perform feature selection and clustering simultaneously. The benefits of a semi-supervised approach making use of reduced external information are highlighted against an unsupervised approach. The method makes use of an ensemble of near-optimal feature subsets delivered by a multi-modal genetic algorithm in order to quantify the relative importance of each feature to clustering.
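
One simple way to turn an ensemble of feature subsets into per-feature importance scores is to count how often each feature is selected. The abstract above does not state how the ensemble is aggregated, so this frequency rule, the function name and the example subsets are all assumptions made for illustration.

```python
from collections import Counter

def feature_importance(subsets, n_features):
    """Fraction of ensemble members (feature subsets) that include each feature."""
    counts = Counter(f for subset in subsets for f in set(subset))
    return [counts[f] / len(subsets) for f in range(n_features)]

# Hypothetical ensemble: four near-optimal subsets over five features (indices 0..4).
ensemble = [{0, 1}, {0, 1, 3}, {0, 2}, {0, 1}]
importance = feature_importance(ensemble, n_features=5)
```

Feature 0 appears in every subset and feature 4 in none, so they receive the highest and lowest scores respectively; in practice the subsets would come from the genetic algorithm's final population.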

Evolutionary Computation in Constraint Satisfaction, book chapter in New Achievements in Evolutionary Computation
