Marek Reformat | University of Alberta
Papers by Marek Reformat
Electronics Letters, Nov 25, 2004
Fuzzy cognitive maps (FCM) are a powerful and convenient tool for describing and analysing dynamic systems. Their generic design is performed manually, exploits expert knowledge and is quite tedious, especially in the case of larger systems. This shortcoming is alleviated by completing the design of FCMs through learning carried out on experimental data. Comprehensive experiments reveal that this approach helps design models of required accuracy in an automated manner.
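As a rough illustration of the map dynamics being learned, the following minimal sketch simulates a synchronous FCM update step; the three-concept weight matrix, the sigmoid squashing function and the initial state are illustrative assumptions, not values from the paper:

```python
import numpy as np

def fcm_step(state, weights):
    """One synchronous FCM update: each concept aggregates the weighted
    influences of all concepts, squashed into [0, 1] by a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(weights @ state)))

# Illustrative 3-concept map; in the paper the weight matrix would come
# from learning on experimental data rather than being hand-picked.
W = np.array([[ 0.0, 0.6, -0.4],
              [ 0.3, 0.0,  0.5],
              [-0.2, 0.7,  0.0]])
x = np.array([0.4, 0.7, 0.1])
for _ in range(10):          # iterate toward a (near) fixed point
    x = fcm_step(x, W)
print(x)
```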
Soft Computing, 2008
This paper discusses some initial investigations into the application of genetic programming technology as a vehicle for re-examining some existing approaches within the software life-cycle. Specifically, it outlines a new direction in production techniques—software cloning from executable specifications or source code. It explores the possibility and advantages of producing a system from its external interactions. To allow this production to be automatic, the system assumes that it can view (and potentially manipulate) these external interactions of the original system; hence it assumes the existence of either an executable specification or the source code—an object to assist in the generation of the external interactions; i.e. the system is treated as a black box. Although the generation and application of software clones is relatively unexplored, it is believed that this is a fundamental technology that can have many different applications within a software engineering environment. For example, software clones could be used in complexity measurement, software testing and software fault tolerance. Clearly, for these clones to be usable, their production needs to be automated. An interesting approach to this automatic production problem is the application of evolutionary-based Genetic Programming (GP). Using the paradigms of best fit, selection, crossover and mutation, a number of clones satisfying specific requirements can be automatically generated. In general, GP is a flexible and powerful algorithm suitable for solving a variety of different problems. This paper presents the results of studies that have been conducted to answer questions related to the feasibility of using GP for clone generation: what features of GP are important? What works and what does not? How can the GP be “tuned” for the problem? The results have been used to draw a set of suggestions and conclusions that indicate the possible usability of a GP-based approach to the automatic generation of clones.
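To make the clone-generation idea concrete, here is a minimal, self-contained GP sketch that evolves arithmetic expression trees to reproduce the observed input/output behaviour of a black-box function; the representation, operators and parameters are simplifying assumptions rather than the paper's actual setup:

```python
import random, operator

OPS = [(operator.add, '+'), (operator.sub, '-'), (operator.mul, '*')]

def rand_tree(depth=3):
    """Random expression tree: nested (op, left, right) tuples or leaves."""
    if depth == 0 or random.random() < 0.3:
        return 'x' if random.random() < 0.5 else random.uniform(-2, 2)
    return (random.choice(OPS), rand_tree(depth - 1), rand_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, float):
        return tree
    (fn, _), left, right = tree
    return fn(evaluate(left, x), evaluate(right, x))

def fitness(tree, samples):
    """Squared error of a candidate clone against the black box's
    externally observed input/output pairs (lower is better)."""
    total = 0.0
    for x, y in samples:
        d = evaluate(tree, x) - y
        total += d * d                       # overflows to inf, never raises
    return total if total == total else float('inf')   # NaN -> worst

def mutate(tree, rate=0.1):
    """With small probability, replace a subtree by a fresh random one."""
    if random.random() < rate:
        return rand_tree(2)
    if isinstance(tree, tuple):
        return (tree[0], mutate(tree[1]), mutate(tree[2]))
    return tree

# The "original system" is visible only through its external interactions:
black_box = lambda x: x * x + 1.0
samples = [(float(x), black_box(float(x))) for x in range(-5, 6)]

pop = [rand_tree() for _ in range(200)]
for _ in range(50):
    pop.sort(key=lambda t: fitness(t, samples))
    pop = pop[:50] + [mutate(random.choice(pop[:50])) for _ in range(150)]
print(fitness(pop[0], samples))              # error of the best clone found
```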
Empirical Software Engineering, 2007
GUI systems are becoming increasingly popular thanks to their ease of use when compared against traditional systems. However, GUI systems are often challenging to test due to their complexity and special features. Traditional testing methodologies are not designed to deal with the complexity of GUI systems; using these methodologies can result in increased time and expense. In our proposed strategy, a GUI system is divided into two abstract tiers—the component tier and the system tier. On the component tier, a flow graph is created for each GUI component. Each flow graph represents a set of relationships between the pre-conditions, event sequences and post-conditions for the corresponding component. On the system tier, the components are integrated to build up a viewpoint of the entire system. Tests on the system tier interrogate the interactions between the components. This method for GUI testing is simple and practical; we show the effectiveness of this approach through two empirical experiments and a description of the results found.
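A minimal sketch of what a component-tier flow graph could look like as a data structure; the component, conditions and event names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class FlowGraph:
    """Component-tier model: each edge ties a pre-condition and an event
    sequence to the post-condition they are expected to produce."""
    component: str
    edges: list = field(default_factory=list)  # (pre, events, post) triples

    def add(self, pre, events, post):
        self.edges.append((pre, tuple(events), post))

    def test_cases(self):
        """Enumerate component-tier tests: drive the component into `pre`,
        replay `events`, then assert `post`."""
        for pre, events, post in self.edges:
            yield {"setup": pre, "actions": events, "expect": post}

login = FlowGraph("LoginDialog")
login.add("dialog visible, fields empty",
          ["type username", "type password", "click OK"],
          "dialog closed, session started")
for case in login.test_cases():
    print(case)
```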
Empirical Software Engineering, 2007
Software testing is an essential process in software development. It is also very costly, often consuming half the financial resources assigned to a project. The most laborious part of software testing is the generation of test-data, which is currently, in the main, a manual process. Hence, the automation of test-data generation can significantly cut the total cost of software testing and of the software development cycle in general. A number of automated test-data generation approaches have already been explored. This paper highlights the goal-oriented approach as a promising way to devise automated test-data generators. A range of optimization techniques can be used within these goal-oriented test-data generators, and their respective characteristics, when applied in this setting, remain relatively unexplored. Therefore, this paper conducts a comparative study of the effectiveness of the most commonly used optimization techniques.
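For intuition, a goal-oriented generator steers the search with an objective that measures how close an input is to reaching the goal. The sketch below uses an assumed branch-distance objective and contrasts two of the simpler optimization techniques such a study might include, random search and hill climbing; it is not the paper's experimental setup:

```python
import random

def branch_distance(x):
    """How far input x is from taking a target branch `x == 42` in some
    unit under test; 0 means the goal has been reached."""
    return abs(x - 42)

def random_search(budget=1000):
    best = random.randint(-1000, 1000)
    for _ in range(budget):
        cand = random.randint(-1000, 1000)
        if branch_distance(cand) < branch_distance(best):
            best = cand
    return best

def hill_climb(budget=1000):
    x = random.randint(-1000, 1000)
    for _ in range(budget):
        step = random.choice([-10, -1, 1, 10])  # local neighbourhood moves
        if branch_distance(x + step) <= branch_distance(x):
            x += step
        if branch_distance(x) == 0:
            break
    return x

print(random_search(), hill_climb())
```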
Information & Software Technology, 2006
Software testing is probably the most complex task in the software development cycle, and one of the most time-consuming and frustrating processes. The complexity of software systems has been increasing dramatically in the past decade, and software testing, as a labour-intensive component, is becoming more and more expensive. As software complexity grows, so does the cost of testing; automatic test-data generation can therefore reduce this cost dramatically. This paper uses program dependence analysis and genetic algorithms to generate test data automatically.
Research in evolvable hardware includes the application of evolutionary methods in the process of system design and on-line system adaptation. Building a hardware system capable of performing adaptation processes without external computing resources is a necessary step towards achieving a high level of autonomy. Such a system, called the Autonomous Genetic Machine (AGM), is presented in this paper. It represents a hardware realization of a genetic algorithm. The modular architecture of the AGM makes it easy to modify and suitable for different applications. A detailed description of design stages and implementation issues is included, with emphasis on performance-related topics. We elaborate on the role of crucial parameters of genetic optimization, such as population size and chromosome length, and discuss various possibilities in the formation of the fitness function. In particular, we show how to design the AGM for Altera's FPGAs.
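The generic loop the AGM realizes in hardware is the standard genetic algorithm; a software sketch of that loop follows, with the population size, chromosome length and one-max fitness chosen purely for illustration:

```python
import random

POP, BITS = 32, 16        # population size and chromosome length: the key
                          # sizing parameters discussed in the paper

def fitness(chrom):
    return sum(chrom)     # one-max stand-in for an application fitness

def tournament(pop):
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(a, b):
    cut = random.randrange(1, BITS)       # one-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.02):
    return [bit ^ (random.random() < rate) for bit in chrom]  # bit flips

pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for _ in range(50):
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(POP)]
print(max(fitness(c) for c in pop))
```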
The introduction of Web 2.0 and social software has made significant changes in users’ utilization of the web. User involvement in processes restricted so far to system designers and developers is more and more evident. One example of such involvement is tagging. Tagging is a process of labeling (annotating) digital items – called resources – by users. The labels – called tags – assigned to those resources reflect users’ ways of seeing, categorizing, and perceiving particular items. As a result, a network of interconnected resources and tags is created. Connections between resources and tags are weighted with numbers reflecting how many times a given tag has been used to label a resource. A network of resources and tags constitutes an environment suitable for building fuzzy representations of those resources, as well as of the tags. This simple concept is investigated here. The paper describes the principles of the concept and shows some examples of its utilization. A short discussion dedicated to the interrelations between tagging and fuzziness is included.
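One plausible way to turn tag-usage counts into a fuzzy representation is to normalize them into membership degrees; the counts and the max-normalization below are illustrative assumptions, not necessarily the paper's construction:

```python
# Hypothetical tag counts: how many users applied each tag to a resource.
tag_counts = {"photo.jpg": {"sunset": 12, "beach": 8, "vacation": 4}}

def fuzzy_representation(counts):
    """Normalize usage counts so the most frequent tag has membership 1;
    the result is a fuzzy set describing the resource through its tags."""
    peak = max(counts.values())
    return {tag: n / peak for tag, n in counts.items()}

print(fuzzy_representation(tag_counts["photo.jpg"]))
# e.g. {'sunset': 1.0, 'beach': 0.666..., 'vacation': 0.333...}
```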
International Journal on Semantic Web and Information Systems, 2008
The Internet holds a huge number of documents available to users. Effective utilization of this enormous repository requires systems supporting users in the process of finding related documents. An ontology defined in the framework of the Semantic Web (Berners, 2001) allows for the specification of concepts, their instances, and the relationships existing between concepts. A hierarchy of concepts (Yager, 2000) is a graph-like structure providing a means for representing human-like dependencies. The article proposes an ...
Soft Computing, 2008
A pervasive task in many forms of human activity is classification. Recent interest in the classification process has focused on ensemble classifier systems. These types of systems are based on a paradigm of combining the outputs of a number of individual classifiers. In this paper we propose a new approach for obtaining the final output of ensemble classifiers. The method presented here uses the Dempster–Shafer concept of belief functions to represent the confidence in the outputs of the individual classifiers. The combining of the outputs of the individual classifiers is based on an aggregation process which can be seen as a fusion of the Dempster rule of combination with a generalized form of the OWA operator. The use of the OWA operator provides an added degree of flexibility in expressing the way the aggregation of the individual classifiers is performed.
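A compact sketch of the two ingredients named in the abstract: Dempster's rule of combination applied to classifier outputs expressed as mass assignments, and an OWA operator. The frame of discernment and the mass values are made up for illustration, and the sketch does not reproduce the paper's actual fusion of the two:

```python
from itertools import product

THETA = frozenset({"c1", "c2", "c3"})       # frame: the candidate classes

def dempster(m1, m2):
    """Dempster's rule of combination for two basic mass assignments,
    given as {frozenset_of_classes: mass} dictionaries."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb               # mass lost to disagreement
    return {s: m / (1.0 - conflict) for s, m in combined.items()}

def owa(values, weights):
    """Ordered weighted averaging: weights apply to values sorted in
    descending order, not to particular classifiers."""
    return sum(w * v for w, v in zip(weights, sorted(values, reverse=True)))

# Two classifiers, each committing most mass to one class, the rest to THETA:
m1 = {frozenset({"c1"}): 0.7, THETA: 0.3}
m2 = {frozenset({"c2"}): 0.5, THETA: 0.5}
print(dempster(m1, m2))
print(owa([0.9, 0.6, 0.3], [0.5, 0.3, 0.2]))
```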
IEEE Transactions on Fuzzy Systems, 1997
We discuss a problem of rule-based fuzzy modeling of multiple-input single-output nonlinear relationships f: R^n → R. The model under investigation is viewed as a collection of conditional statements “if state Ω_i, then y = g_i(x, a_i)”, i = 1, 2, ..., N, with Ω_i being a fuzzy relation defined in the space of the input variables. In contrast to the commonly encountered identification approach, based exclusively upon discrete experimental data, the one proposed in this study is concerned with rule-based modeling exploiting the available ...
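A minimal sketch of this kind of rule-based model, with Gaussian memberships standing in for the fuzzy regions Ω_i and assumed local linear models g_i(x) = a_i·x + b_i; all numbers are illustrative:

```python
import numpy as np

def gauss(x, center, width):
    """Membership of input x in the fuzzy region Omega_i."""
    return np.exp(-np.sum((x - center) ** 2) / (2.0 * width ** 2))

def rule_model(x, rules):
    """y = sum_i mu_i(x) * g_i(x) / sum_i mu_i(x), with local models
    g_i(x) = a_i . x + b_i attached to fuzzy input regions."""
    mus = np.array([gauss(x, c, w) for c, w, _, _ in rules])
    ys  = np.array([a @ x + b for _, _, a, b in rules])
    return float(mus @ ys / mus.sum())

# Two illustrative rules over a 2-input space: (center, width, a_i, b_i)
rules = [(np.array([0.0, 0.0]), 1.0, np.array([1.0, -1.0]),  0.5),
         (np.array([3.0, 3.0]), 1.0, np.array([0.5,  0.5]), -1.0)]
print(rule_model(np.array([1.0, 2.0]), rules))
```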
Data & Knowledge Engineering, 2008
Associative classification is a promising classification method based on association-rule mining. A significant amount of work has already been dedicated to the process of building a classifier based on association rules. However, relatively little research has been performed on association-rule mining from multi-label data. In such data each example can belong, and thus should be classified, to more than one class. This paper addresses the computationally most demanding part of associative classification, which is the efficient generation of association rules. This task can be achieved using different frequent-pattern mining methods. In this paper, we propose a new method that is based on the state-of-the-art tree-projection-based frequent-pattern mining algorithm. This algorithm is modified to improve its efficiency and extended to accommodate multi-label recurrent-item associative-classification rule generation. The proposed algorithm is tested and compared with an Apriori-based associative-classification rule generator on two large datasets.
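The rule-generation step rests on frequent-pattern mining. As background only, here is a minimal level-wise (Apriori-style) frequent-itemset sketch of the kind the paper compares against; the paper's own method is tree-projection based, and the transactions below are toy data:

```python
from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
                {"b", "c"}, {"a", "b", "c"}]
MIN_SUP = 3   # absolute minimum support

def frequent_itemsets(db, min_sup):
    """Level-wise search: only items appearing in frequent itemsets at the
    previous level seed the candidates of the next level."""
    items = sorted({i for t in db for i in t})
    level, result = [frozenset([i]) for i in items], {}
    while level:
        counts = {s: sum(s <= t for t in db) for s in level}
        frequent = {s: c for s, c in counts.items() if c >= min_sup}
        result.update(frequent)
        seeds = sorted({i for s in frequent for i in s})
        size = len(next(iter(level))) + 1
        level = [frozenset(c) for c in combinations(seeds, size)]
    return result

print(frequent_itemsets(transactions, MIN_SUP))
```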
Fuzzy Sets and Systems, 2004
Software quality is one of the most important practical features of software development. Project managers and developers look for methods and tools supporting software development processes and ensuring a required level of quality. To make such tools relevant, they should provide the designer/manager with some quantitative input useful for interpreting the results. Knowledge provided by the tools leads to a better understanding of the investigated phenomena. In this paper, we propose a comprehensive development methodology for logic-based models represented by fuzzy neural networks. The process of model development is performed in two stages: structural and parametric optimization. The structural optimization framework utilizes mechanisms of evolutionary computing, which become especially attractive in light of the structural optimization of the models. The parametric optimization is performed using a gradient-based method. The study comprises two detailed case studies dealing with existing software data. The first one deals with the quality assessment of software objects in an exploratory biomedical data analysis and visualization system. The second is concerned with a model of software development effort discussed in the setting of a medical information system.
Information & Software Technology, 2003
The quality of the individual objects composing a software system is one of the important factors that determine the quality of that system. Quality of objects, in turn, can be related to a number of attributes, such as extensibility, reusability, clarity and efficiency. These attributes do not have representations suitable for automatic processing. There is a need to find a way to support quality-related activities using data gathered during quality assurance processes, which involve humans.
Fuzzy cognitive maps (FCMs) form a convenient, simple, and powerful tool for the simulation and analysis of dynamic systems. The popularity of FCMs stems from their simplicity and transparency. While being successful in a variety of application domains, FCMs are hindered by the necessity of involving domain experts to develop the model. Since human experts are subjective and can handle only relatively simple networks (maps), there is an urgent need to develop methods for the automated generation of FCM models. This study proposes a novel evolutionary learning method that is able to generate FCM models from input historical data, without any human intervention. The proposed method is based on genetic algorithms and is carried out through supervised learning. The paper tests the method through a series of carefully selected experimental studies.
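The heart of such a learning scheme is the criterion by which a candidate map is scored. A minimal sketch of how a candidate weight matrix might be evaluated against recorded concept-state sequences, i.e. the error a genetic algorithm would minimize, follows; the history values are toy data and the one-step-ahead criterion is an assumption:

```python
import numpy as np

def fcm_step(state, weights):
    return 1.0 / (1.0 + np.exp(-(weights @ state)))

def candidate_error(weights, history):
    """Score a candidate FCM: one-step-ahead prediction error against a
    recorded sequence of concept states (the quantity to be minimized)."""
    err = 0.0
    for t in range(len(history) - 1):
        pred = fcm_step(history[t], weights)
        err += float(np.sum((pred - history[t + 1]) ** 2))
    return err

# Toy historical data: five time steps of three concept values in [0, 1].
history = np.array([[0.2, 0.5, 0.8],
                    [0.4, 0.6, 0.7],
                    [0.5, 0.6, 0.6],
                    [0.6, 0.6, 0.6],
                    [0.6, 0.6, 0.6]])
W = np.zeros((3, 3))              # a (poor) candidate a GA would evolve
print(candidate_error(W, history))
```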
IEEE Transactions on Neural Networks, 2006
In this paper, we are concerned with the concept of fuzzy logic networks and logic-based data analysis realized within this framework. The networks under discussion are homogeneous architectures comprising OR/AND neurons originally introduced by Hirota and Pedrycz. Treated here as generic processing units, OR/AND neurons are neurofuzzy constructs that exhibit well-defined logic characteristics, are endowed with a high level of parametric flexibility, and come with significant interpretation abilities. The composite logic nature of the logic neurons becomes instrumental in covering a broad spectrum of logic dependencies whose character spreads in between those captured by plain AND and OR logic descriptors (connectives). From the functional standpoint, the developed network realizes a logic approximation of multidimensional mappings between unit hypercubes, that is, transformations from [0,1]^n to [0,1]^m. The way in which the structure of the network is formed is highly modular and reflects a general concept of decomposition of logic expressions and Boolean functions (as commonly encountered in two-valued logic). In essence, given a collection of input variables, a subset of them is selected and transformed into a new composite variable, which in turn is used in the consecutive module of the network. These intermediate synthetic variables are the result of successive problem (mapping) decomposition. The development of the network is realized through genetic optimization. This helps address important issues of structural optimization (where we are concerned with the selection of a subset of variables and their allocation within the network) and of reaching a global minimum when carrying out extensive parametric optimization (adjustments of the connections of the neurons). The paper offers a comprehensive and user-interactive design procedure, including a simple pruning mechanism whose intention is to enhance the interpretability of the network while reducing its size. The experimental studies comprise three parts. First, we demonstrate the performance of the network on Boolean data (which leads to some useful comparative observations, considering the wealth of optimization tools available in two-valued logic and digital systems). Second, we discuss synthetic multivalued data that helps focus on the approximation abilities of the network. Finally, we show the generation of logic expressions describing selected data sets coming from the machine learning repository.
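A rough sketch of one common formulation of these neurons, using the product as the t-norm and the probabilistic sum as the s-norm; the exact parameterization used in the paper may differ:

```python
def s_norm(a, b):                 # probabilistic sum
    return a + b - a * b

def or_neuron(x, w):
    """OR neuron: s-norm over (x_i AND w_i), with product as the t-norm."""
    out = 0.0
    for xi, wi in zip(x, w):
        out = s_norm(out, xi * wi)
    return out

def and_neuron(x, w):
    """AND neuron: t-norm (product) over (x_i OR w_i)."""
    out = 1.0
    for xi, wi in zip(x, w):
        out *= s_norm(xi, wi)
    return out

def or_and_neuron(x, w_or, w_and, v):
    """OR/AND neuron: the outputs of an OR and an AND neuron are
    aggregated by a second-level OR neuron with connections v."""
    z = [or_neuron(x, w_or), and_neuron(x, w_and)]
    return or_neuron(z, v)

x = [0.8, 0.3, 0.6]
print(or_and_neuron(x, w_or=[0.9, 0.2, 0.5],
                    w_and=[0.1, 0.7, 0.4], v=[0.6, 0.4]))
```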
IEEE Engineering in Medicine and Biology Magazine, 2007
Canada. His research interests are in the areas of application of computational intelligence techniques, such as neurofuzzy systems and evolutionary computing, as well as probabilistic and evidence theories to intelligent data analysis and modeling leading to translating data into knowledge. He applies these methods to conduct research in the areas of software and knowledge engineering. Dr. Reformat has been a member of program committees of several conferences related to computational intelligence and evolutionary computing. He is a member of the IEEE and ACM.
Journal of Artificial Intelligence Research, 2009
In a significant minority of cases, certain pronouns, especially the pronoun it, can be used without referring to any specific entity. This phenomenon of pleonastic pronoun usage poses serious problems for systems aiming at even a shallow understanding of natural language texts. In this paper, a novel approach is proposed to identify such uses of it: the extrapositional cases are identified using a series of queries against the web, and the cleft cases are identified using a simple set of syntactic rules. The system is evaluated with four sets of news articles containing 679 extrapositional cases as well as 78 cleft constructs. The identification results are comparable to those obtained by human efforts.
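For illustration only, here is a crude surface-pattern approximation of the two constructions; the actual system relies on web-query evidence and proper syntactic rules rather than regular expressions:

```python
import re

# Rough patterns: "it is/was ADJ that/to ..." (extraposition) and
# "it is/was X who/that ..." (cleft). Real coverage needs parsing.
EXTRAPOSITION = re.compile(
    r"\bit\s+(?:is|was|seems?|appears?)\s+\w+\s+(?:that|to)\b", re.I)
CLEFT = re.compile(r"\bit\s+(?:is|was)\s+.+?\s+(?:who|that)\b", re.I)

def classify_it(sentence):
    if EXTRAPOSITION.search(sentence):
        return "extrapositional"
    if CLEFT.search(sentence):
        return "cleft"
    return "possibly referential"

print(classify_it("It is important to test early."))    # extrapositional
print(classify_it("It was John who broke the build."))  # cleft
print(classify_it("It crashed after the update."))      # possibly referential
```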
Effectiveness and clarity of software objects, their adherence to coding standards, and the programming habits of programmers are important features of the overall quality of software systems. This paper proposes an approach towards quantitative software quality assessment with respect to extensibility, reusability, clarity and efficiency. It exploits techniques of Computational Intelligence (CI), treated as a consortium of granular computing, neural networks and evolutionary techniques. In particular, we take advantage of self-organizing maps to gain a better insight into the data, and study genetic decision trees, a novel algorithmic framework for carrying out classification of software objects with respect to their quality. Genetic classifiers serve as a "quality filter" for software objects. Using these classifiers, a system manager can predict the quality of software objects and identify low-quality objects for review and possible revision. The approach is applied to an object-oriented visualization-based software system for biomedical data analysis.
Journal of Systems Architecture, 2007
A scheme for time- and power-efficient embedded system design, using hardware and software components, is presented. Our objective is to reduce the execution time and the power consumed by the system, leading to the simultaneous multiobjective minimization of time and power. The goal of suitably partitioning the system into hardware and software components is achieved using Genetic Algorithms (GA). Multiple tests were conducted to confirm the consistency of the results obtained and the versatile nature of the objective functions. An enhanced resource-constrained scheduling algorithm is used to determine the system performance. To emulate the characteristics of practical systems, the influence of inter-processor communication is examined. The suitability of introducing a reconfigurable hardware resource instead of pre-configured hardware is explored for the same objectives. The distinct difference in the task-to-resource mapping with the variation in design objective is studied. Further, a procedure to allocate the optimal number of resources based on the design objective is proposed. The implementation is constrained for power and time individually, with the GA being used to arrive at a resource count that suits the objective. The results obtained are compared by varying the time and power constraints. The test environment is developed using randomly generated task graphs. Exhaustive sets of tests are performed for the stated design objectives to validate the proposed solution.
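As a sketch of the partitioning idea, the chromosome below maps each task to software (0) or hardware (1), and a weighted sum scalarizes the time and power objectives; the task costs are hypothetical, and scheduling and inter-processor communication, which the paper models, are ignored:

```python
import random

# Hypothetical per-task costs: (time, power) in software vs in hardware.
SW = [(10, 2), (8, 2), (12, 3), (6, 1)]
HW = [(3, 6), (2, 5), (4, 8), (2, 4)]

def cost(mapping):
    """Aggregate execution time and power for a chromosome mapping each
    task to software (0) or hardware (1)."""
    time = sum((HW if m else SW)[i][0] for i, m in enumerate(mapping))
    power = sum((HW if m else SW)[i][1] for i, m in enumerate(mapping))
    return time, power

def fitness(mapping, w_time=0.5, w_power=0.5):
    """Scalarized two-objective fitness a GA would minimize; the weights
    encode the time/power trade-off."""
    t, p = cost(mapping)
    return w_time * t + w_power * p

pop = [[random.randint(0, 1) for _ in range(4)] for _ in range(20)]
best = min(pop, key=fitness)
print(best, cost(best))
```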
The ability to provide convenient access to scientific documents is becoming a difficult problem due to the large and constantly increasing number of incoming documents and the extensive manual work associated with their storage, description and classification. This calls for intelligent search and classification capabilities that let users find the required information. This is especially true for repositories of scientific medical articles, given their extensive use, large size, high rate of new documents, and well-maintained structure. This research aims to provide an automated method for the classification of articles into the structure of medical document repositories, supporting the extensive manual work currently performed. The proposed method classifies articles from the largest medical repository, MEDLINE, using state-of-the-art data mining technology. The method is based on a novel associative-classification technique which considers recurrent items and, most importantly, the multi-label characteristic of the MEDLINE data. Based on large-scale experiments that utilize 350,000 documents, several different classification algorithms have been compared, including both recurrent and non-recurrent associative classification. The algorithms are capable of assigning each medical document to several classes (multi-label classification) and are characterized by relatively high accuracy. We also investigate different measures of classification quality and point out the pros and cons of each. Based on the experimental results, we show that recurrent-item-based associative classification demonstrates superior performance, and we propose three alternative setups that allow the user to obtain different desired classification qualities.