Luis Zarate - Academia.edu
Papers by Luis Zarate
Expert Systems with Applications, 2013
Due to their ability to handle nonlinear problems, artificial neural networks are applied in several areas of science. However, humans are unable to assimilate the knowledge kept in those networks, since such knowledge is represented implicitly by their connections and the respective numerical weights. Recently, formal concept analysis, through the FCANN method, has proved to be a powerful methodology for extracting knowledge from neural networks. However, depending on the settings used or on the number of neural network variables, the number of formal concepts, and consequently of rules extracted from the network, can make the knowledge extraction and learning process impracticable. Thus, this paper addresses the application of the JBOS approach to extract reduced knowledge from the formal contexts produced by FCANN from the neural network, providing a small number of formal concepts and rules to the final user without losing the ability to understand the process learned by the network.
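As a rough illustration of the kind of structure FCANN extracts (before a reduction step such as JBOS prunes it), the sketch below enumerates the formal concepts of a tiny binary context. The context, the attribute names and the naive enumeration are illustrative assumptions, not the paper's data or algorithm.

```python
from itertools import combinations

# Toy formal context: objects (e.g. network operating points) x binary attributes.
# Attribute names are illustrative, not those used by FCANN/JBOS.
context = {
    "op1": {"high_load", "high_temp"},
    "op2": {"high_load"},
    "op3": {"high_temp", "low_speed"},
    "op4": {"high_load", "high_temp", "low_speed"},
}
attributes = set().union(*context.values())

def extent(attrs):
    """Objects that share every attribute in attrs (derivation operator)."""
    return {g for g, a in context.items() if attrs <= a}

def intent(objs):
    """Attributes shared by every object in objs."""
    if not objs:
        return set(attributes)
    return set.intersection(*(context[g] for g in objs))

# Enumerate formal concepts naively: pairs (A, B) with A = extent(B) and B = intent(A).
concepts = set()
for r in range(len(attributes) + 1):
    for attrs in combinations(sorted(attributes), r):
        ext = extent(set(attrs))
        concepts.add((frozenset(ext), frozenset(intent(ext))))

for ext, inte in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), "->", sorted(inte))
```

Each printed pair is an (extent, intent) concept; on contexts derived from realistic networks the number of such pairs explodes, which is what motivates the JBOS reduction.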
Proceedings of the 11th International Conference on Computer Supported Education
This article presents the application of three rule induction algorithms, OneR, RIPPER and PART, to an educational data set, aiming to explain the main factors that lead students to succeed or fail in an online course. The dataset was extracted from the activity log of engineering students enrolled in a 20-week Algorithms course offered online through the Moodle Learning Management System. The dataset was preprocessed and the algorithms were then applied to it. As a result, it was observed that students who begin assignments earlier improve their probability of success.
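For readers unfamiliar with OneR, the sketch below shows the rule-induction idea on a toy table; the feature names and records are made up and do not come from the course log used in the article.

```python
from collections import Counter, defaultdict

# Toy preprocessed log records: each row is (features_dict, outcome).
# Feature names ("first_submission_week", "forum_posts") are illustrative only.
data = [
    ({"first_submission_week": "early", "forum_posts": "many"}, "pass"),
    ({"first_submission_week": "early", "forum_posts": "few"}, "pass"),
    ({"first_submission_week": "late", "forum_posts": "many"}, "fail"),
    ({"first_submission_week": "late", "forum_posts": "few"}, "fail"),
    ({"first_submission_week": "early", "forum_posts": "few"}, "fail"),
]

def one_r(rows):
    """OneR: for each attribute, build one rule per value (majority class),
    then keep the attribute whose rule set makes the fewest training errors."""
    best = None
    for attr in rows[0][0]:
        counts = defaultdict(Counter)              # value -> class counts
        for features, label in rows:
            counts[features[attr]][label] += 1
        rules = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(sum(c.values()) - c[rules[v]] for v, c in counts.items())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best

attr, rules, errors = one_r(data)
print(f"rule on '{attr}' ({errors} training errors): {rules}")
```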
Efficiency measurements of solar collectors are normally made by means of experiments that use solar irradiance, flow rate, and input and output water temperatures as operational parameters. The efficiency also depends on other aspects, such as the collector's structural characteristics, the material of its components, thermal insulation and position. For new operating conditions, new experiments are necessary to calculate efficiency. Linear regression has been proposed by several authors as a way of modeling solar collectors, but it may introduce significant errors when used for that purpose, since it works well only with linearly correlated values. Due to their facility for solving nonlinear problems, ANN (Artificial Neural Networks) are presented here as an alternative to represent these solar collectors, with several advantages over other modeling techniques such as linear regression. Correctly trained ANN do not require new experiments or linearly correlated values to obt...
Ontologies, from their formal definition in the 1990s up to their effective usage with the advent of methods and processes from Ontology Engineering, have received important attention in projects that demand the formalization of knowledge shared among applications and users. In this context, it is important to popularize them and to enable people to create and use them. This article presents a method to assist the ontology capture process, providing apparatus to conceptualize and identify the treated domain through an ontological study. The proposed method provides metrics and guidelines so that an ontology engineer can identify and organize the elements of a domain, finding fundamental ontological relations among them. A descriptive algorithm is shown to formalize the desired process, and some examples are given to better illustrate the use of the proposed method.
The use of sensors in environments that require constant monitoring has been increasing in recent years. The main goal is to guarantee the effectiveness, safety, and smooth functioning of the system. To identify the occurrence of abnormal events, we propose a methodology that aims to detect patterns that can lead to abrupt changes in the behavior of the sensor signals. To achieve this objective, we provide a strategy to characterize the time series, and we use a clustering technique to analyze the temporal evolution of the sensor system. To validate our methodology, we propose a clusters' stability index computed by windowing. We have also developed a parameterizable time series generator, which allows us to represent different operational scenarios for a sensor system in which extreme anomalies may arise.
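A hedged sketch of the general approach (characterize windows, cluster, compare consecutive labelings) follows. It uses the adjusted Rand index as a generic stand-in for the stability index proposed in the paper, and synthetic signals rather than real sensor data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Synthetic signals from 6 sensors; sensor 5 drifts abruptly halfway through.
t = np.linspace(0, 20, 2000)
signals = np.array([np.sin(t + p) + 0.1 * rng.standard_normal(t.size) for p in range(6)])
signals[5, 1000:] += 3.0

def window_features(x, size=200):
    """Characterize each window of a signal by simple statistics."""
    wins = x[: x.size // size * size].reshape(-1, size)
    return np.column_stack([wins.mean(axis=1), wins.std(axis=1), np.ptp(wins, axis=1)])

# Cluster the sensors window by window and compare consecutive labelings.
n_windows = signals.shape[1] // 200
prev_labels = None
for w in range(n_windows):
    feats = np.array([window_features(s)[w] for s in signals])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
    if prev_labels is not None:
        # Agreement between consecutive windows; a drop hints at a regime change.
        ari = adjusted_rand_score(prev_labels, labels)
        print(f"window {w}: agreement with previous window = {ari:.2f}")
    prev_labels = labels
```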
Learning about real-world processes is not an easy task. Real objects, especially those related to industrial processes, are difficult to model and understand. This work aims to reduce the gap between modelling an industrial process and understanding it, relying on Formal Concept Analysis to obtain rules from previously trained neural networks. In this paper a tool, named Sophiann, is presented. Its main idea is to help users learn about complex industrial processes.
Advances in Business Information Systems and Analytics
Feature selection is a data preprocessing task in business intelligence (BI), analytics, and data mining that calls for new methods able to handle high dimensionality. One alternative that has been researched to deal with the curse of dimensionality is causal feature selection, which is based not on correlation but on the causality relationships among variables. The main goal of this chapter is to present, based on the issues identified in other methods, a new strategy that considers attributes beyond those that compose the Markov blanket of a node and calculates the causal effect to ensure the causality relationship.
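Since the strategy starts from the Markov blanket of a node, the small sketch below shows how that blanket (parents, children and spouses) is read off a causal DAG. The DAG and variable names are illustrative assumptions; the chapter's causal-effect calculation is not shown.

```python
# Toy causal DAG given as node -> list of parents. Variable names are illustrative.
dag_parents = {
    "sales": ["price", "marketing"],
    "price": ["cost"],
    "marketing": [],
    "cost": [],
    "churn": ["price", "support_quality"],
    "support_quality": [],
}

def markov_blanket(node, parents):
    """Markov blanket = parents + children + other parents of the children (spouses)."""
    children = [c for c, ps in parents.items() if node in ps]
    spouses = {p for c in children for p in parents[c] if p != node}
    return set(parents[node]) | set(children) | spouses

# For "price": parent {cost}, children {sales, churn}, spouses {marketing, support_quality}.
print(markov_blanket("price", dag_parents))
```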
Proceedings of the 35th Annual ACM Symposium on Applied Computing
Triadic Concept Analysis (TCA) is an applied mathematical technique for data analysis in which the relations between objects, attributes and conditions are identified. However, the volume of information to be processed can make TCA impracticable. For example, with the increasing use of social networks for personal (Facebook) and professional (LinkedIn) purposes, more and more applications of data analysis in high-dimensional environments (Big Data) have been discussed in the literature. The objective of this paper is to evaluate the behavior of the TRIAS algorithm when extracting triadic concepts from high-dimensional contexts. The experiments show that our approach performs better (up to 33% faster) than the original algorithm.
Information
Formal concept analysis (FCA) is widely applied in different areas. However, in some FCA applications the volume of information that needs to be processed can become unfeasible. Thus, the demand for new approaches and algorithms that enable processing large amounts of information is increasing substantially. This article presents a new algorithm for extracting proper implications from high-dimensional contexts. The proposed algorithm, called ImplicPBDD, was based on the PropIm algorithm and uses a data structure called binary decision diagram (BDD) to simplify the representation of the formal context and enhance the extraction of proper implications. In order to analyze the performance of the ImplicPBDD algorithm, we performed tests using synthetic contexts, varying the number of objects, attributes and context density. The experiments show that ImplicPBDD performs better (up to 80% faster) than its original algorithm, regardless of the number of attributes, objects and densities.
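The BDD machinery itself is not reproduced here, but the derivation (closure) step at the heart of proper-implication extraction can be sketched over a compactly encoded context. The bitmask encoding below is only a rough analogue of the compression a BDD provides, applied to a made-up context.

```python
# Synthetic formal context: each object is a set of attribute indices,
# encoded as an integer bitmask so set operations become bitwise operations.
attributes = ["a0", "a1", "a2", "a3"]
objects = [{0, 1}, {0, 1, 2}, {1, 2, 3}, {0, 3}]
rows = [sum(1 << i for i in obj) for obj in objects]
ALL = (1 << len(attributes)) - 1

def closure(attr_mask):
    """B'': attributes common to all objects that contain every attribute in attr_mask."""
    common, hit = ALL, False
    for row in rows:
        if row & attr_mask == attr_mask:   # object has all attributes of the premise
            common &= row
            hit = True
    return common if hit else ALL

premise = 1 << 0                           # the attribute set {a0}
closed = closure(premise)
print([attributes[i] for i in range(len(attributes)) if closed >> i & 1])
```

Implication extraction repeatedly compares a premise with its closure; the compact encoding keeps that step cheap when contexts are large, which is the role the BDD plays in ImplicPBDD.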
Proceedings of the 20th International Conference on Enterprise Information Systems, 2018
Formal concept analysis (FCA) is currently used in a large number of applications in different areas. However, in some applications the volume of information that needs to be processed may become infeasible. Thus, demand for new approaches and algorithms to enable the processing of large amounts of information is increasing substantially. This paper presents a new algorithm for extracting proper implications from high-dimensional contexts. The proposed algorithm, ProperImplicBDD, was based on the PropIm algorithm. Using a data structure called binary decision diagram (BDD), it is possible to simplify the representation of the formal context and to improve the performance of extracting proper implications. In order to analyze the performance of the ProperImplicBDD algorithm, we performed tests using synthetic contexts, varying the number of attributes and the context density. The experiments show that ProperImplicBDD performs better (up to 8 times faster) than the original algorithm, regardless of the number of attributes, objects and densities.
IEEE Latin America Transactions
This work involves the construction of a hybrid approach to solving SAT and UNSAT problems. The solution combines techniques found in the Stalmarck and DPLL algorithms for solving propositional satisfiability problems. We extend the Stalmarck rules in order to reduce the computational cost of the deduction phase. We applied our solution to SAT and UNSAT instances. The results show that this approach performs well on UNSAT industrial instances; in some cases, it obtained gains of 45% to 60% in terms of efficiency when compared to existing approaches. We compared our solution against the best-known SAT solvers, such as ZChaff and RSat. The techniques and methods involved are described in the article.
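For context, a minimal DPLL procedure (unit propagation plus splitting, without the Stalmarck rules the paper adds) can be written in a few lines; the clauses in the example are a toy formula, not one of the benchmark instances.

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL: unit propagation + splitting. Clauses are lists of ints (DIMACS style)."""
    assignment = dict(assignment or {})
    # Unit propagation.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                           # clause already satisfied
            unassigned = [l for l in clause if abs(l) not in assignment]
            if not unassigned:
                return None                        # conflict: clause falsified
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0     # forced assignment
                changed = True
    # All clauses satisfied?
    if all(any(assignment.get(abs(l)) == (l > 0) for l in clause) for clause in clauses):
        return assignment
    # Split on the first unassigned variable.
    var = next(abs(l) for clause in clauses for l in clause if abs(l) not in assignment)
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))
```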
Journal of the Brazilian Society of Mechanical Sciences and Engineering
The governing equation of a single-stand rolling mill is a non-linear function of several parameters (input thickness, front and back tensions, yield stress and friction coefficient, among others). Any alteration in one of them will cause alterations in the rolling load and, consequently, in the outgoing thickness. This paper presents a method to determine the appropriate adjustment for thickness control considering three possible control parameters: roll gap, front tension and back tension. The method uses a predictive model based on the sensitivity equation of the process, where the sensitivity factors are obtained by differentiating a previously trained neural network. The method considers the best control action to be the one that demands the smallest adjustment. One of the central issues in controller design for rolling systems is the difficulty of measuring the final thickness without time delays. The time delay is a consequence of the location of the outgoing thickness sensor, which is always placed at some distance from the front of the roll gap. The proposed control system calculates the necessary adjustment based on a predictive model for the output thickness. This model makes it possible to overcome the time delay that exists in such processes and can eliminate the thickness sensor, usually based on X-rays. Simulation results show the viability of the proposed technique.
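The paper obtains sensitivity factors analytically by differentiating the trained network; the sketch below approximates them with central finite differences around an operating point, using a stand-in regressor and made-up variable names rather than the rolling-mill model.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Stand-in training data: inputs [roll_gap, front_tension, back_tension, input_thickness],
# output = outgoing thickness from a made-up smooth relation (not the rolling-mill equation).
X = rng.uniform(0.0, 1.0, size=(2000, 4))
y = 0.6 * X[:, 3] + 0.3 * X[:, 0] - 0.05 * X[:, 1] - 0.04 * X[:, 2] + 0.02 * rng.standard_normal(2000)

net = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=3000, random_state=0).fit(X, y)

def sensitivities(model, x0, eps=1e-3):
    """Central finite differences of the trained network around operating point x0."""
    grads = []
    for i in range(x0.size):
        up, down = x0.copy(), x0.copy()
        up[i] += eps
        down[i] -= eps
        grads.append((model.predict(up[None]) - model.predict(down[None]))[0] / (2 * eps))
    return np.array(grads)

x0 = np.array([0.5, 0.5, 0.5, 0.5])
s = sensitivities(net, x0)
# A larger |sensitivity| means a smaller adjustment of that parameter compensates
# a given thickness deviation, which is how the "smallest adjustment" criterion reads.
print(dict(zip(["roll_gap", "front_tension", "back_tension", "input_thickness"], s.round(3))))
```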
Anais do 9. Congresso Brasileiro de Redes Neurais
In this work, a hybrid structure to simulate the thermal behavior of pools is proposed. The structure uses a neural representation to model the climatic data and parametric estimation to determine the variation in volume due to human activity. The new structure allows adapting the theoretical dynamic models, with variations over time, to specific regional weather conditions. As a case study, data from the state of Minas Gerais (MG), Brazil, were used.
Anais do 7. Congresso Brasileiro de Redes Neurais
Missing data in databases are today considered one of the biggest problems faced in the application of Data Mining. When treating these data, the characteristics of the database must be preserved, that is, no information should be lost or added without careful analysis. The goal of this work is to show how Artificial Neural Networks, together with the tacit knowledge of the domain expert, can help recover information for the missing attributes. In this work, these two elements are combined to recover missing data in a marketing database.
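A minimal sketch of the imputation idea follows, assuming a toy marketing-style table and an MLP regressor trained on the complete rows; the expert-knowledge step described in the paper is only hinted at in a comment.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Toy marketing-style table: columns [age, income, spend]; some 'spend' values are missing (NaN).
n = 500
age = rng.uniform(18, 70, n)
income = rng.uniform(1, 10, n)
spend = 0.4 * income + 0.01 * age + 0.1 * rng.standard_normal(n)
spend[rng.choice(n, 60, replace=False)] = np.nan
table = np.column_stack([age, income, spend])

# Train on complete rows to predict the attribute that is missing elsewhere.
complete = ~np.isnan(table[:, 2])
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(table[complete, :2], table[complete, 2])

# Fill the gaps with the network's estimates (a domain expert would then review them).
table[~complete, 2] = model.predict(table[~complete, :2])
print(f"imputed {np.count_nonzero(~complete)} missing values")
```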
BMC Bioinformatics, 2017
Background: Correct identification of protein coding regions is an important and long-standing problem in molecular biology. The problem is challenging because of the lack of deep knowledge about biological systems and about the conserved characteristics of the messenger RNA (mRNA). Therefore, it is fundamental to investigate computational methods that help discover patterns for the identification of Translation Initiation Sites (TIS). In Bioinformatics, machine learning methods based on inductive inference, such as the Inductive Support Vector Machine (ISVM), have been widely applied. On the other hand, much less attention has been given to machine learning methods based on transductive inference, such as the Transductive Support Vector Machine (TSVM). Transductive inference performs well for problems in which the amount of unlabeled sequences is considerably greater than the labeled ones. The problem of predicting the TIS may likewise take advantage of transductive methods, because the number of new sequences grows rapidly as Genome Projects allow the study of new organisms. Consequently, this work investigates transductive learning for TIS identification and compares the results with those obtained with the inductive method. Results: Transductive inference presents better results, both in F-measure and in sensitivity, than the inductive method for predicting the TIS. Additionally, it presents the lowest failure rate for identifying the TIS, with a smaller number of False Negatives (FN) than the ISVM. The ISVM and TSVM methods were validated with molecules from the most representative organisms contained in the RefSeq database: Rattus norvegicus, Mus musculus, Homo sapiens, Drosophila melanogaster and Arabidopsis thaliana. The transductive method presented F-measure and sensitivity higher than 90%, also higher than the results obtained with the ISVM. The ISVM and TSVM approaches were implemented in the TransduTIS tool, as TransduTIS-I and TransduTIS-T respectively, available through a web interface. These approaches were compared with the TISHunter, TIS Miner and NetStart tools, presenting satisfactory results. Conclusions: In terms of precision, the results are similar for the ISVM and TSVM classifiers. However, the results show that the TSVM approach ensured an improvement, especially in F-measure and sensitivity. Moreover, it was possible to identify a potential niche for the application of TSVM: organisms in the initial phase of study, with few identified sequences in the databases.
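scikit-learn does not ship a TSVM, so the sketch below contrasts a purely inductive SVM with a self-training wrapper as a loose semi-supervised analogue of the "few labeled, many unlabeled" setting; the data are synthetic, not TIS sequences, and the numbers it prints say nothing about the paper's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

# Synthetic stand-in for encoded mRNA windows (real TIS data would replace this).
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Hide most training labels to mimic few labeled and many unlabeled sequences.
y_partial = y_train.copy()
unlabeled = np.random.default_rng(0).random(y_partial.size) < 0.9
y_partial[unlabeled] = -1                       # -1 marks unlabeled samples

inductive = SVC(probability=True, random_state=0).fit(X_train[~unlabeled], y_train[~unlabeled])
semi = SelfTrainingClassifier(SVC(probability=True, random_state=0)).fit(X_train, y_partial)

print("inductive F1:    ", round(f1_score(y_test, inductive.predict(X_test)), 3))
print("self-training F1:", round(f1_score(y_test, semi.predict(X_test)), 3))
```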
Artificial Intelligence Research, 2016
This paper addresses the problem of handling dense contexts with high dimensionality in the number of objects, which is still an open problem in formal concept analysis. The generation of the minimal implication basis in contexts with such characteristics is investigated, and the NextClosure algorithm is employed to obtain the rules. This work makes use of parallel computing as a means to reduce the prohibitive execution times observed in scenarios where the input context has high density and high dimensionality. Sequential and parallel versions of the NextClosure algorithm for generating implications are employed. The experiments show a reduction of approximately 75% in execution time for the contexts of greatest size and density, which attests to the viability of the strategy presented in this work.
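The sequential core of NextClosure, the lectic next-closure step used to enumerate closed attribute sets, can be sketched as follows; the parallel decomposition studied in the paper is not shown, and the context is a toy one.

```python
# Small formal context: objects as sets of attribute indices.
objects = [{0, 1}, {0, 2}, {1, 2}, {0, 1, 2}]
n_attrs = 3

def closure(attrs):
    """B'': attributes shared by all objects containing every attribute in attrs."""
    rows = [o for o in objects if attrs <= o]
    return set.intersection(*rows) if rows else set(range(n_attrs))

def next_closure(attrs):
    """Return the lectically next closed set after attrs, or None when all are enumerated."""
    for i in reversed(range(n_attrs)):
        if i in attrs:
            attrs = attrs - {i}
        else:
            candidate = closure(attrs | {i})
            # Valid only if the closure adds no attribute smaller than i.
            if not any(j < i and j not in attrs for j in candidate):
                return candidate
    return None

# Enumerate all intents in lectic order, starting from the closure of the empty set.
current = closure(set())
while current is not None:
    print(sorted(current))
    current = next_closure(current)
```

Implication generation runs this enumeration over so-called pseudo-closed sets as well, which is where the prohibitive running times on large dense contexts come from and why the paper parallelizes the procedure.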
2009 16th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems, 2009
In recent years, new and efficient symbolic model checking algorithms have been developed. One technique, bounded model checking (BMC), has been particularly promising. BMC models the system being verified as a Boolean formula whose satisfying assignments provide counterexamples for the properties verified. BMC unrolls the system over multiple iterations. Because of this, the structure of the formula representing the...
In this paper, two methods for extracting knowledge rules through Artificial Neural Networks with continuous activation functions are presented. The rules are extracted from previously trained neural networks and from the sensitivity factors obtained by differentiating a neural network. The rules can be used when analytic models of the physical processes lead to equations of difficult numerical and analytical solution. In the operation of industrial processes, the rules can help less experienced operators make decisions. The proposed methods are applied to obtain knowledge rules for the cold rolling process.
CAC-RD is a call admission control (CAC) scheme for UMTS (Universal Mobile Terrestrial System) 3G networks. It is based on resource reservation and network diagnosis, ensuring availability and reducing blocking. Simulations of large scenarios with thousands of users for testing CAC-RD were not possible due to computational resource limits; the largest feasible scenario had three antennas and 1100 users. To solve this problem, and to make possible network simulations with up to hundreds of antennas and tens of thousands of users, this work presents CAC-RD Neural: a representation of CAC-RD through a neural computational model. Results show gains as the network scenario and the simulation time grow.