Carlos Cobos | University of Cauca

Papers by Carlos Cobos

Covering Arrays to Support the Process of Feature Selection in the Random Forest Classifier

Lecture Notes in Computer Science, 2019

The Random Forest (RF) algorithm consists of an ensemble of base decision trees, each constructed from a Bootstrap subset of the original dataset. Each subset is a sample of instances (rows) combined with a random subset of features (variables or columns) of the original dataset to be classified. In RF, base trees are generated without pruning, and when a new record is classified, each tree casts a vote; the selected class is the one with the most votes. Because the state of the art reports that random feature selection for constructing the Bootstrap subsets decreases the quality of the results achieved with RF, this work proposes integrating covering arrays (CA) into RF to address this problem, in an algorithm called RFCA. In RFCA, the number N of rows of the CA defines the minimum number of base trees that must be generated in RF, and each row of the CA defines the features that each Bootstrap subset will use to build its tree. To evaluate the proposal, 32 datasets available in the UCI repository are used, and the results are compared with the RF implementation available in Weka. The experiments show that using a CA of strength 2 to 7 obtains promising results in terms of accuracy.
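
The role of the covering array can be illustrated with a minimal sketch. It assumes a binary CA in which a 1 in column j means "feature j is used by this tree"; that encoding, the 5-row array, and the feature names are hypothetical choices for illustration and are not taken from the paper. The check below confirms the strength-2 property (every pair of columns shows all four 0/1 combinations) and then derives one feature subset per row, so N rows yield N base trees.

```python
from itertools import combinations, product


def is_covering_array(rows, strength):
    """Return True if every choice of `strength` columns shows all binary combinations."""
    n_cols = len(rows[0])
    needed = set(product((0, 1), repeat=strength))
    for cols in combinations(range(n_cols), strength):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if seen != needed:
            return False
    return True


def feature_subsets(rows, feature_names):
    """Map each CA row to the features whose column holds a 1 (assumed encoding)."""
    return [
        [name for bit, name in zip(row, feature_names) if bit == 1]
        for row in rows
    ]


# Hypothetical binary covering array: N = 5 rows, 4 columns, strength 2.
CA = [
    (1, 1, 0, 0),
    (1, 0, 1, 1),
    (0, 1, 1, 1),
    (0, 0, 0, 1),
    (0, 0, 1, 0),
]
FEATURES = ["f0", "f1", "f2", "f3"]  # hypothetical feature names

subsets = feature_subsets(CA, FEATURES)
for tree_index, subset in enumerate(subsets):
    # Each subset would restrict the columns of one Bootstrap sample.
    print(f"tree {tree_index}: features {subset}")
```

A higher strength forces more feature combinations to appear across the trees, at the cost of more rows and therefore more trees.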

Automatic Generation of Multi-document Summaries Based on the Global-Best Harmony Search Metaheuristic and the LexRank Graph-Based Algorithm

Lecture Notes in Computer Science, 2018

Recently, metaheuristic-based algorithms have shown good results in the automatic generation of multi-document summaries. This paper proposes two algorithms, called LexGbhs and GbhsLex, that hybridize the Global-Best Harmony Search metaheuristic with the LexRank graph-based algorithm. The objective function to be optimized combines two features, coverage and diversity. Coverage measures the similarity between each sentence of the candidate summary and the centroid of the sentences in the document collection, while diversity measures how different the sentences of a candidate summary are from one another. The two proposed hybrid algorithms were compared with state-of-the-art algorithms using the ROUGE-1, ROUGE-2 and ROUGE-SU4 measures on the DUC2005 and DUC2006 data sets. In a unified ranking, the proposed LexGbhs algorithm placed third, showing that the hybridization of metaheuristics with graphs for generating extractive multi-document summaries is a promising line of research.
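
The two features of the objective function can be sketched as follows. This is only an illustration under stated assumptions: sentences are represented as raw term-count vectors, similarity is cosine similarity, coverage is averaged over the summary sentences, diversity is one minus the mean pairwise similarity, and the two are combined with a hypothetical `weight` parameter. None of these specifics come from the abstract, and all function names are invented for the example.

```python
import math
from collections import Counter


def vectorize(sentence):
    """Bag-of-words term counts for one sentence (assumed representation)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Average term vector of a list of sentence vectors."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return Counter({term: count / len(vectors) for term, count in total.items()})


def coverage(summary, collection):
    """Mean similarity between each summary sentence and the collection centroid."""
    center = centroid([vectorize(s) for s in collection])
    return sum(cosine(vectorize(s), center) for s in summary) / len(summary)


def diversity(summary):
    """One minus the mean pairwise similarity of the summary sentences."""
    vectors = [vectorize(s) for s in summary]
    pairs = [(i, j) for i in range(len(vectors)) for j in range(i + 1, len(vectors))]
    if not pairs:
        return 1.0
    mean_similarity = sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)
    return 1.0 - mean_similarity


def objective(summary, collection, weight=0.5):
    """Hypothetical weighted combination of coverage and diversity."""
    return weight * coverage(summary, collection) + (1 - weight) * diversity(summary)


collection = ["cats chase mice", "dogs chase cats", "mice eat cheese"]
redundant = ["cats chase mice", "cats chase mice"]
varied = ["cats chase mice", "mice eat cheese"]
print(objective(redundant, collection), objective(varied, collection))
```

A metaheuristic such as Global-Best Harmony Search would then search over candidate sentence selections to maximize a function of this kind.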

Finding Optimal Farming Practices to Increase Crop Yield Through Global-Best Harmony Search and Predictive Models, a Data-Driven Approach

Lecture Notes in Computer Science, 2018

Acercamiento a las buenas prácticas para el desarrollo de software basado en DevOps y SCRUM utilizadas en empresas muy pequeñas (An Approach to Good Practices for DevOps- and SCRUM-Based Software Development Used in Very Small Companies)

Revista Facultad de Ingeniería

Very small software development companies have at most 25 employees and have limited cash flow and time to implement process improvements that would make them more competitive. This is one of the reasons these companies turn to agile frameworks such as SCRUM to manage the software development process. But when they begin adoption, they find that the documentation only suggests the changes that can be made, not how to make them, which turns the task of discovering which techniques, events, and artifacts to implement into a costly and, in some cases, unviable trial-and-error approach. The same happens with other frameworks that can complement SCRUM, such as DevOps, which proposes bringing the development and operations areas closer together, automating as many tasks as possible and increasing quality controls to obtain better products. This article presents three good...

Predicción del rendimiento de cultivos de café: un mapeo sistemático (Coffee Crop Yield Prediction: A Systematic Mapping)

Ingeniería y Competitividad, Sep 19, 2023

Grouping of business processes models based on an incremental clustering algorithm using fuzzy similarity and multimodal search

Expert Systems With Applications, 2017

A model for searching business processes, based on a multimodal approach that integrates textual and structural information. A clustering mechanism that uses a similarity function based on fuzzy logic for grouping search results. Evaluation of the search method using internal quality assessment and external assessment based on human criteria. Nowadays, many companies standardize their operations through Business Processes (BP), which are stored in repositories and reused when new functionalities are required. However, finding specific processes may become a cumbersome task due to the large size of these repositories. This paper presents MulTimodalGroup, a model for grouping and searching business processes. The grouping mechanism is built upon a clustering algorithm that uses a similarity function based on fuzzy logic; the grouping is performed on the results of each user request. The search, for its part, is based on a multimodal representation that integrates textual and structural information of BP. The proposed model was assessed in two phases: 1) internal quality assessment of the groups and 2) external assessment of the created groups compared with an ideal set of groups. The assessment used a closed BP collection designed collaboratively by 59 experts. The experimental results in each phase are promising and support the validity of the proposed model.

Multiobjective Memetic GRASP to Solve Vehicle Routing Problems with Time Windows Size

International Journal on Advanced Science, Engineering and Information Technology, Aug 31, 2022

Teaching Guide for Beginnings in DevOps and Continuous Delivery in AWS Focused on the Society 5.0 Skillset

Revista Iberoamericana De Tecnologías Del Aprendizaje, Nov 1, 2022

Documenting and implementing DevOps good practices with test automation and continuous deployment tools through software refinement

Periodicals of Engineering and Natural Sciences (PEN), Nov 3, 2021

TrazasBP: A Framework for Business Process Models Discovery Based on Execution Cases

A Multi-Objective Approach for the Calibration of Microscopic Traffic Flow Simulation Models

Vegetation Index Based on Genetic Programming for Bare Ground Detection in the Amazon

Lecture Notes in Computer Science, 2018

Memetic Algorithm Based on Global-Best Harmony Search and Hill Climbing for Part of Speech Tagging

Lecture Notes in Computer Science, 2017

The task of assigning tags to the words of a sentence has many applications today in natural language processing (NLP) and therefore requires a fast and accurate algorithm. This paper presents a Part-of-Speech Tagger based on Global-Best Harmony Search (GBHS) that applies local optimization to the best harmony after each improvisation (iteration); the local optimization is based on the Hill Climbing algorithm and uses knowledge of the problem to define the neighborhood. In the proposed algorithm, a candidate solution (harmony) is represented as a vector whose size is the number of words in the sentence, while the fitness function considers the cumulative probability of tagging each word and its relation to its predecessor and successor words. The proposed algorithm obtained a precision of 95.2% and improved on the results obtained by other taggers. The experimental results were analyzed with Friedman non-parametric statistical tests, with a level of significance of 90%. The proposed Part-of-Speech Tagger was found to perform with quality and efficiency in the tagging problem, in contrast to the comparison algorithms. The experiments used the Brown corpus divided into 5 folds, which allowed cross-validation to be applied.

Optimization of Neural Network Training with ELM Based on the Iterative Hybridization of Differential Evolution with Local Search and Restarts

Lecture Notes in Computer Science, 2019

An Extreme Learning Machine (ELM) trains a single-layer feedforward neural network (SLFN) in less time than the back-propagation algorithm. An ELM sets the input weights and biases of the hidden layer to random values and then calculates the output weights analytically. The use of random values causes SLFN performance to decrease significantly. The present work adapts three high-dimensional continuous optimization algorithms (IHDELS, DECC-G and MOS) and compares their performance with each other and with the state-of-the-art method, a memetic algorithm based on differential evolution called M-ELM. The comparison shows that IHDELS, using a validation model based on retention (Training/Testing), obtains the best results, followed by DECC-G and MOS. All three algorithms obtain better results than M-ELM. The experimentation was carried out on 38 classification problems recognized by the scientific community, and Friedman and Wilcoxon nonparametric statistical tests support the results.
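
The basic ELM training step described above (random hidden layer, analytic output weights) can be sketched in plain Python. This is an illustrative baseline only, not the IHDELS, DECC-G, MOS, or M-ELM methods from the paper. The output weights of an ELM are commonly obtained with a Moore-Penrose pseudoinverse; to stay dependency-free, this sketch swaps in ridge-regularized normal equations solved by Gaussian elimination. The sigmoid activation, the `ridge` value, the seed, and the XOR toy data are all assumptions made for the example.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b for a square matrix by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def hidden_layer(inputs, weights, biases):
    """Sigmoid activations of the hidden layer for every input row."""
    return [
        [
            1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(w_row, row)) + bias)))
            for w_row, bias in zip(weights, biases)
        ]
        for row in inputs
    ]


def train_elm(inputs, targets, n_hidden, ridge=1e-3, seed=0):
    """Draw random input weights and biases, then solve for the output weights."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = hidden_layer(inputs, weights, biases)
    # Normal equations: (H^T H + ridge * I) beta = H^T t
    gram = [
        [sum(r[i] * r[j] for r in h) + (ridge if i == j else 0.0) for j in range(n_hidden)]
        for i in range(n_hidden)
    ]
    rhs = [sum(r[i] * t for r, t in zip(h, targets)) for i in range(n_hidden)]
    beta = solve_linear_system(gram, rhs)
    return weights, biases, beta


def predict(inputs, weights, biases, beta):
    """Linear readout of the hidden activations."""
    return [sum(a * b for a, b in zip(row, beta)) for row in hidden_layer(inputs, weights, biases)]


# Toy XOR data (hypothetical example, not one of the paper's 38 problems).
XOR_INPUTS = [[0, 0], [0, 1], [1, 0], [1, 1]]
XOR_TARGETS = [0.0, 1.0, 1.0, 0.0]
w, b, beta = train_elm(XOR_INPUTS, XOR_TARGETS, n_hidden=8)
print(predict(XOR_INPUTS, w, b, beta))
```

Because `weights` and `biases` are drawn at random and never tuned, the quality of the fit depends on the draw; the optimization algorithms compared in the paper search over those values instead of leaving them random.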

Strengthening Competencies for Building Software, Through a Community of Practice

The present work describes the use of a virtual community of practice to strengthen capacities for software development; the study was carried out with students of informatics and related areas at higher education institutions in the southwest of Colombia. The study was conducted in the following stages: initial approach, diagnostic, preparation, implementation, and follow-up. The results show that the virtual community positively influences the acquisition of knowledge, capabilities, and attitudes by the members of the community, which increases their possibilities of entering the market.

Multiobjective Memetic GRASP to Solve Vehicle Routing Problems with Time Windows Size

International Journal on Advanced Science, Engineering and Information Technology

Dimensional Modeling of ICT Competencies

Modelamiento Dimensional De Competencias en TIC (Dimensional Modeling of ICT Competencies)

Revista EIA, Jan 12, 2013

Web document clustering based on a new niching Memetic Algorithm, Term-Document Matrix and Bayesian Information Criterion

This paper introduces a new description-centric algorithm for web document clustering based on Memetic Algorithms with Niching Methods, Term-Document Matrix and Bayesian Information Criterion. The algorithm defines the number of clusters automatically. The Memetic Algorithm provides a combined global and local strategy for searching the solution space, while the Niching methods (based on restricted competition replacement and restrictive mating) promote diversity in the population and prevent it from converging too quickly. The Memetic Algorithm uses the K-means algorithm to find the optimum value in a local search space. Bayesian Information Criterion is used as the fitness function, while FP-Growth is used to reduce the high dimensionality of the vocabulary. The resulting algorithm, called WDC-NMA, was tested with data sets based on Reuters-21578 and DMOZ, obtaining promising results (better precision than a Singular Value Decomposition algorithm). It was also initially evaluated by a group of users.

Framework for the Training of Deep Neural Networks in TensorFlow Using Metaheuristics

Lecture Notes in Computer Science, 2018

Artificial neural networks (ANNs) are again playing a leading role in machine learning, especially in classification and regression, due to the emergence of deep learning (ANNs with more than four hidden layers), which allows them to encode increasingly complex features. The increase in the number of hidden layers in ANNs has posed important challenges for their training. Variations (e.g. RMSProp) of classical algorithms such as backpropagation with stochastic gradient descent are the state of the art for training deep ANNs. However, other research has shown that the advantages of metaheuristics need more detailed study in this area. We summarize the design and use of a framework to optimize the learning of deep neural networks in TensorFlow using metaheuristics. The framework is implemented in Python, allows the networks to be trained on CPU or GPU depending on the TensorFlow configuration, and allows easy integration of diverse classification and regression problems solved with different neural network architectures (conventional, convolutional and recurrent) as well as new metaheuristics. The framework initially includes Particle Swarm Optimization, Global-best Harmony Search, and Differential Evolution. It further enables metaheuristics to be converted into memetic algorithms by adding exploitation processes that use the algorithms available in TensorFlow: RMSProp, Adam, Adadelta, Momentum, and Adagrad.
