A new mathematical programming approach to multi-group classification problems
Related papers
A Combining Mathematical Programming Method For Multi-Group Data Classification
2011
Despite the abundance of articles on mathematical programming models for the two-group classification problem, very few have addressed the multi-group classification problem using mathematical programming. This study presents a new multi-group data classification method based on mathematical programming. The proposed model combines the strong properties of the mathematical programming models previously suggested in the literature for multi-group classification problems. The efficiency of the proposed approach is tested on the well-known IRIS data set. The results show that the proposed method is usable and efficient for multi-group classification problems.
Determining the Efficacy of Mathematical Programming Approaches for Multi-Group Classification
2009
Managers have been grappling with the problem of extracting patterns from the vast databases generated by their systems. The advent of powerful information systems in organizations and the consequent accumulation of vast pools of data since the mid-1980s have created renewed interest in the usefulness of discriminant analysis (DA). Expert systems have come to the aid of managers in their day-to-day decision making, with many successful applications in financial planning, sales management, and other areas of business operations (Erenguc and Koehler 1990). Currently, no comprehensive research study exists that tests the robustness of multi-group classification analysis. Our research aims to bridge the gaps in the existing works and take a step further by extending our study to four-group classification problems. The main purpose of this research is to determine the efficacy of mathematical programming classification models, more specifically LP methods, vis-à-vis statistical approaches such as discriminant analysis (Mahalanobis) and logistic regression, an artificial intelligence (AI) technique such as a neural network, and a non-parametric technique such as k-nearest neighbor (k-NN) for four-group classification problems. This research also proposes an integrated (hybrid) model that combines a non-parametric classification technique and an LP approach to enhance the overall classification performance. Furthermore, the study extends an existing two-group LP model (Bal et al. 2006), based on the work of Lam and Moy (1996b), and applies it to four-group classification problems. These models are tested through robust computational experiments under varying data conditions using a financial product example. The characteristics of a real dataset are used to simulate (by the Monte Carlo method) multiple sample runs for four-group classification problems with three continuous independent variables.
The experimental results show that LP approaches in general, and the proposed integrated method in particular, consistently have lower misclassification rates for most data characteristics. Furthermore, the integrated method utilizes the strengths of both methods, k-NN and linear programming, thereby considerably improving the classification accuracy.
Computers & Industrial Engineering, 2006
The aim of this article is to present one new linear programming model and two new goal programming models for two-group classification problems. When applied to real or simulated data, the proposed models perform well both in separating the groups and in predicting the group membership of new objects. In discriminant analysis, some linear programming models determine the attribute weights and the cut-off value in two steps, whereas our models determine all of these values simultaneously in one step. Moreover, the results of simulation experiments show that the proposed models significantly outperform existing linear programming and statistical approaches, attaining higher average hit-ratios.
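To make the terms in this abstract concrete, the following minimal sketch shows how a set of attribute weights and a cut-off value define a two-group classification rule, and how a hit-ratio is computed from it. The function names, the toy data, the weights, and the cut-off are all hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not solve for the weights as the paper's models do.

```python
from typing import List, Sequence, Tuple


def classify(x: Sequence[float], weights: Sequence[float], cutoff: float) -> int:
    """Assign group 1 if the weighted score is at or below the cut-off, else group 2."""
    score = sum(w * v for w, v in zip(weights, x))
    return 1 if score <= cutoff else 2


def hit_ratio(samples: List[Tuple[Sequence[float], int]],
              weights: Sequence[float], cutoff: float) -> float:
    """Fraction of samples whose predicted group matches the true group."""
    hits = sum(1 for x, group in samples if classify(x, weights, cutoff) == group)
    return hits / len(samples)


# Hypothetical toy data: (attribute vector, true group).
samples = [
    ((1.0, 2.0), 1),
    ((2.0, 1.0), 1),
    ((4.0, 5.0), 2),
    ((5.0, 4.0), 2),
    ((3.0, 3.5), 1),  # falls on the group-2 side of the assumed rule below
]
weights = (1.0, 1.0)  # assumed attribute weights, not estimated by any model
cutoff = 6.0          # assumed cut-off value

ratio = hit_ratio(samples, weights, cutoff)
print(f"hit ratio = {ratio:.2f}")
```

In a mathematical programming approach, the weights and cut-off would be decision variables of the optimization model instead of fixed inputs as they are here.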
Three Group Classification Problem Approach Based on Fuzzy Goal Programming
Politeknik Dergisi, 2019
In this study, a new model based on fuzzy logic and mathematical programming is proposed to solve the three-group classification problem. Determining the cut-off value, which corresponds to the discrimination axis in classification problems, is important. Cases in which the cut-off value is an asymmetric triangular fuzzy number, a trapezoidal fuzzy number, or a Gaussian fuzzy number were examined. On three-group data sets used frequently in the literature, the proposed approach performed better than Fisher's Linear Discriminant Function and some mathematical programming-based models.
Mathematical programming formulations for two-group classification with binary variables
Annals of Operations Research, 1997
In this paper, we introduce a nonparametric mathematical programming (MP) approach for solving the binary variable classification problem. In practice, there exists a substantial interest in the binary variable classification problem. For instance, medical diagnoses are often based on the presence or absence of relevant symptoms, and binary variable classification has long been used as a means to predict (diagnose) the nature of the medical condition of patients. Our research is motivated by the fact that none of the existing statistical methods for binary variable classification, parametric and nonparametric alike, are fully satisfactory. The general class of MP classification methods facilitates a geometric interpretation, and MP-based classification rules have intuitive appeal because of their potentially robust properties. These intuitive arguments appear to have merit, and a number of research studies have confirmed that MP methods can indeed yield effective classification rules under certain non-normal data conditions, for instance if the data set is outlier-contaminated or highly skewed. However, the MP-based approach in general lacks a probabilistic foundation, necessitating an ad hoc assessment of its classification performance. Our proposed nonparametric mixed integer programming (MIP) formulation for the binary variable classification problem not only has a geometric interpretation, but also is Bayes inspired. Therefore, our proposed formulation possesses a strong probabilistic foundation. We also introduce a linear programming (LP) formulation which parallels the concepts underlying the MIP formulation, but does not possess the decision theoretic justification. 
An additional advantage of both our LP and MIP formulations is that, due to the fact that the attribute variables are binary, the training sample observations can be partitioned into multinomial cells, allowing for a substantial reduction in the number of binary and deviational variables, so that our formulation can be used to analyze training samples of almost any size. We illustrate our formulations using an example problem, and use three real data sets to compare its classification performance with a variety of parametric and non-parametric statistical methods. For each of these data sets, our proposed formulation yields the minimum possible number of misclassifications, both using the resubstitution and the
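The cell partitioning mentioned in this abstract can be illustrated with a short sketch: observations that share the same binary attribute pattern are collapsed into a single cell, and only per-cell group counts are kept. The function name and the toy sample below are hypothetical, and the sketch covers only the grouping step, not the paper's MIP or LP formulations.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pattern = Tuple[int, ...]


def partition_into_cells(
    observations: Iterable[Tuple[Pattern, int]]
) -> Dict[Pattern, Counter]:
    """Group observations by binary attribute pattern and count group labels per cell."""
    cells: Dict[Pattern, Counter] = {}
    for pattern, group in observations:
        cells.setdefault(pattern, Counter())[group] += 1
    return cells


# Hypothetical training sample: (binary attribute pattern, group label).
observations = [
    ((1, 0, 1), 1),
    ((1, 0, 1), 1),
    ((1, 0, 1), 2),
    ((0, 1, 0), 2),
    ((0, 1, 0), 2),
    ((1, 1, 1), 1),
]

cells = partition_into_cells(observations)
for pattern, counts in sorted(cells.items()):
    print(pattern, dict(counts))
```

With p binary attributes there are at most 2^p distinct cells, so a model indexed by cells instead of by individual observations stays the same size however many observations fall into each cell.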
Discriminant Analysis is a method for determining group classifications for a set of similar units or observations. A number of new efficient mathematical programming approaches have been developed as an alternative to examining classification problems using statistical models. In this study two new mathematical programming approaches are developed for the minimization of the sum of the deviations and the concept of relative efficiency for Data Envelopment Analysis when solving the two-group classification problem. The efficiency and practicability of the suggested approaches are supported with a simulation study involving three different distributions and different cases for the units in the groups.
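For orientation, a generic two-group linear program that minimizes the sum of deviations can be written as below. This is a textbook-style sketch and not the formulation proposed in the study above. The symbols are assumptions introduced here: $x_{ij}$ is the value of attribute $j$ for observation $i$, $w_j$ are the attribute weights, $c$ is the cut-off value, $d_i$ is the deviation of observation $i$ from the correct side of the cut-off, $G_1$ and $G_2$ are the two groups, and $\varepsilon > 0$ is a fixed gap.

```latex
\begin{aligned}
\min_{w,\,c,\,d}\quad & \sum_{i \in G_1 \cup G_2} d_i \\
\text{s.t.}\quad
  & \sum_{j=1}^{p} w_j x_{ij} \le c + d_i,
    && i \in G_1, \\
  & \sum_{j=1}^{p} w_j x_{ij} \ge c + \varepsilon - d_i,
    && i \in G_2, \\
  & d_i \ge 0, && i \in G_1 \cup G_2, \\
  & w_j,\ c \ \text{unrestricted in sign}, && j = 1, \dots, p.
\end{aligned}
```

The fixed gap $\varepsilon$ is one simple way to keep the trivial solution $w = 0$ from attaining a zero objective; other normalizations are possible, and which one the study uses is not stated in the abstract.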
Mathematical Programming Approaches to Classification Problems
Discriminant Analysis (DA) is widely applied in many fields. Recent research has pointed out that standard DA assumptions, such as a normal distribution of the data and equality of the variance-covariance matrices, are not always satisfied. The Mathematical Programming (MP) approach has been frequently used in DA and can be considered a valuable alternative to the classical models of DA. The MP approach provides more flexibility for the process of analysis. The aim of this paper is to present a comparative study in which we analyze the performance of three statistical methods and some MP methods, using linear and nonlinear discriminant functions, in two-group classification problems. New classification procedures will be adapted to the context of nonlinear discriminant functions. Different applications are used to compare these methods, including the Support Vector Machine (SVM) based approach. The findings of this study will be useful in assisting decision-makers to choose the most appropriate model for their decision-making situation.
Multi-Group Classification Using Interval Linear Programming
Among the various statistical and data mining discriminant analysis methods proposed so far for group classification, linear programming discriminant analysis has recently attracted researchers' interest. This study evaluates multi-group discriminant linear programming (MDLP) for classification problems against well-known methods such as neural networks and support vector machines. MDLP is less complicated than the other methods and does not suffer from local optima. This study also proposes a fuzzy Delphi method to select and gather the required data when databases suffer from deficient data. In addition, to absorb the uncertainty introduced during data collection, interval MDLP (IMDLP) is developed. The results show that MDLP, and especially IMDLP, performs better than conventional classification methods with respect to correct classification, at least for small and medium-size datasets.
Linear goal programming in estimation of classification probability
European Journal of Operational Research, 1993
Recently, linear programming approaches to classification problems have been widely studied and have been shown to yield satisfactory results. However, those linear programming models do not incorporate discriminating information about objects within groups. In order to utilize within-group variation, an approach is presented which directly incorporates membership probabilities in a linear goal programming model.