Data Envelopment Analysis Approach to Two-Group Classification Problems and an Experimental Comparison with Some Classification Models

2000

Discriminant Analysis is a method for determining group classifications for a set of similar units or observations. A number of efficient mathematical programming approaches have been developed as alternatives to statistical models for classification problems. In this study, two new mathematical programming approaches are developed for the two-group classification problem, based on minimizing the sum of deviations and on the relative-efficiency concept of Data Envelopment Analysis. The efficiency and practicability of the suggested approaches are supported by a simulation study involving three different distributions and different cases for the units in the groups.

A Unified Mathematical Programming Formulation for the Discriminant Problem

1989

The purpose of classification analysis is to predict the group membership of individuals or observations based on limited information about the group characteristics. The resulting classification or discriminant rules provide a powerful methodology in decision analysis. In fact, classification analysis has been touted as one of the most significant tools to analyze scientific and behavioral data. Applications of discriminant analysis can be found in such diverse fields as predicting bank failures, artificial intelligence, medical diagnosis, psychology, biology and credit granting. The most widely used statistical techniques are based on the assumption of multivariate normality. Frequently, this assumption is violated and nonparametric techniques are appropriate. One such technique which was recently proposed uses mathematical programming formulations of the problem. This paper introduces a unified mathematical programming-based approach to the two-group discriminant problem which does not suffer from many of the theoretical inadequacies that have plagued previously proposed formulations. Moreover, the formulation appears to be simple, making it a promising contribution from both a theory and practice viewpoint.

Mathematical Programming Approaches to Classification Problems

Discriminant Analysis (DA) is widely applied in many fields. Recent research has shown that standard DA assumptions, such as a normal distribution of the data and equality of the variance-covariance matrices, are not always satisfied. The Mathematical Programming (MP) approach has been frequently used in DA and can be considered a valuable alternative to the classical DA models, since it provides more flexibility in the analysis. The aim of this paper is to present a comparative study that analyzes the performance of three statistical methods and several MP methods using linear and nonlinear discriminant functions in two-group classification problems. New classification procedures are adapted to the context of nonlinear discriminant functions. Different applications are used to compare these methods, including the approach based on Support Vector Machines (SVMs). The findings of this study will be useful in assisting decision-makers to choose the most appropriate model for their decision-making situation.

An Alternative Model to Fisher and Linear Programming Approaches in Two-Group Classification Problem: Minimizing Deviations from the Group Median (Review)

2006

In this study, new classification models are developed for two-group Discriminant Analysis problems. For problems of this type, Lam, Choo and Moy (1996) proposed a model that minimizes deviations from the group means. The hit ratio of that model deteriorates as the population distributions from which the samples are drawn depart from the normal distribution. For samples drawn from non-normal or skewed distributions, the median is a much more suitable descriptive statistic than the mean. The aim of the study is therefore to formulate two-group classification models that minimize the deviations from the group medians. When these proposed approaches are applied to real-life data or to simulated data drawn from different distributions, it is observed that the attained performance of classification is better than both some important classification approaches ...

EVALUATING ALTERNATIVE LINEAR PROGRAMMING MODELS TO SOLVE THE TWO-GROUP DISCRIMINANT PROBLEM

Decision Sciences, 1986

The two-group discriminant problem has applications in many areas, for example, differentiating between good credit risks and poor ones, between promising new firms and those likely to fail, or between patients with strong prospects for recovery and those highly at risk. To expand our tools for dealing with such problems, we propose a class of nonparametric discriminant procedures based on linear programming (LP). Although these procedures have attracted considerable attention recently, only a limited number of computational studies have examined the relative merits of alternative formulations. In this paper we provide a detailed study of three contrasting formulations for the two-group problem. The experimental design provides a variety of test conditions involving both normal and nonnormal populations. Our results establish the LP model which seeks to minimize the sum of deviations beyond the two-group boundary as a promising alternative to more conventional linear discriminant techniques.
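As an illustration of the kind of LP model this abstract refers to, the following is a minimal sketch of a minimize-sum-of-deviations (MSD) classifier. It is not taken from the paper: the function names `fit_msd` and `classify`, the use of SciPy's `linprog`, and the unit gap around the cut-off (added here to rule out the trivial solution with all weights zero) are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def fit_msd(X1, X2):
    """Fit weights w and cut-off c by minimizing the sum of deviations.

    Assumed formulation (illustrative, not the paper's exact model):
        group 1:  w.x_i - d_i <= c - 1
        group 2:  w.x_i + d_i >= c + 1
        d_i >= 0, w and c free
    The unit gap is one common normalization that excludes w = 0.
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    n1, p = X1.shape
    n2 = X2.shape[0]
    n = n1 + n2

    # Decision vector: [w (p entries), c (1 entry), d (n entries)].
    cost = np.concatenate([np.zeros(p + 1), np.ones(n)])

    # Group 1 rows:  w.x_i - c - d_i <= -1
    A1 = np.hstack([X1, -np.ones((n1, 1)), -np.eye(n1), np.zeros((n1, n2))])
    # Group 2 rows: -w.x_i + c - d_i <= -1
    A2 = np.hstack([-X2, np.ones((n2, 1)), np.zeros((n2, n1)), -np.eye(n2)])

    A_ub = np.vstack([A1, A2])
    b_ub = -np.ones(n)
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p], res.x[p], res.fun


def classify(X, w, c):
    """Assign group 1 when the score w.x is at or below c, else group 2."""
    scores = np.asarray(X, dtype=float) @ w
    return np.where(scores <= c, 1, 2)


if __name__ == "__main__":
    # Small made-up, linearly separable sample (not data from any paper).
    X1 = [[1, 1], [2, 1], [1, 2]]
    X2 = [[4, 4], [5, 4], [4, 5]]
    w, c, total_dev = fit_msd(X1, X2)
    print("weights:", w, "cut-off:", c, "sum of deviations:", total_dev)
    print("group 1 labels:", classify(X1, w, c))
    print("group 2 labels:", classify(X2, w, c))
```

On separable data the optimal sum of deviations is zero, and any misclassified training point would contribute a positive deviation. Other normalizations exist in the literature, and the choice can affect which solution is returned.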

An experimental comparison of the new goal programming and the linear programming approaches in the two-group discriminant problems

Computers & Industrial Engineering, 2006

The aim of this article is to consider a new linear programming model and two goal programming models for two-group classification problems. When these approaches are applied to real-life or simulated data, our proposed models perform well both in separating the groups and in predicting the group membership of new objects. In discriminant analysis, some linear programming models determine the attribute weights and the cut-off value in two steps, whereas our models determine all of these values simultaneously in one step. Moreover, the results of simulation experiments show that our proposed models significantly outperform existing linear programming and statistical approaches in attaining higher average hit ratios.

An Application of Linear Programming Discriminated Analysis for Classification

التجارة والتمويل

The goal of this study is to compare linear discriminant analysis with discriminant analysis based on linear programming (MMD) (Min. Sum of Deviation) in order to find the best model for classifying observations into their correct groups with the lowest possible classification error and the highest classification accuracy. According to the findings of the study, discriminant analysis using linear programming differs from linear discriminant analysis in data classification: it produces the lowest error rate and the highest classification accuracy rate, and it does not require the assumptions of linear discriminant analysis.

Determining the Efficacy of Mathematical Programming Approaches for Multi-Group Classification

2009

Managers have been grappling with the problem of extracting patterns out of the vast databases generated by their systems. The advent of powerful information systems in organizations and the consequent agglomeration of a vast pool of data since the mid-1980s have created renewed interest in the usefulness of discriminant analysis (DA). Expert systems have come to the aid of managers in their day-to-day decision making with many successful applications in financial planning, sales management, and other areas of business operations (Erenguc and Koehler 1990). Currently, no comprehensive research study exists that tests the robustness of multi-group classification analysis. Our research aims to bridge the gaps in the existing works and take a step further by extending our study to four-group classification problems. The main purpose of this research is to determine the efficacy of mathematical programming classification models, more specifically LP methods, vis-à-vis statistical approaches such as discriminant analysis (Mahalanobis) and logistic regression, an artificial intelligence (AI) technique such as a neural network, and a nonparametric technique such as k-nearest neighbor (k-NN) for four-group classification problems. This research also proposes an integrated (hybrid) model that combines a nonparametric classification technique and an LP approach to enhance the overall classification performance. Furthermore, the study extends an existing two-group LP model (Bal et al. 2006), based on the work of Lam and Moy (1996b), and applies it to four-group classification problems. These models are tested through robust computational experiments under varying data conditions using a financial product example. The characteristics of a real dataset are used to simulate (Monte Carlo method) multiple sample runs for four-group classification problems with three continuous independent variables.
The experimental results show that LP approaches in general and the proposed integrated method in particular consistently have lower misclassification rates for most data characteristics. Furthermore, the integrated method utilizes the strengths of both methods, k-NN and linear programming, thereby considerably improving the classification accuracy.

WORKING PAPER: A UNIFIED MATHEMATICAL PROGRAMMING FORMULATION FOR THE DISCRIMINANT PROBLEM

The purpose of classification analysis is to predict the group membership of individuals or observations based on limited information about the group characteristics. The resulting classification or discriminant rules provide a powerful methodology in decision analysis. In fact, classification analysis has been touted as one of the most significant tools to analyze scientific and behavioral data. Applications of discriminant analysis can be found in such diverse fields as predicting bank failures, artificial intelligence, medical diagnosis, psychology, biology and credit granting. The most widely used statistical techniques are based on the assumption of multivariate normality. Frequently, this assumption is violated and nonparametric techniques are appropriate. One such technique which was recently proposed uses mathematical programming formulations of the problem.

Linear Programming Approaches for Multiple-Class Discriminant and Classification Analysis

International Journal of Strategic Decision Sciences, 2010

New linear programming approaches are proposed as nonparametric procedures for multiple-class discriminant and classification analysis. A new MSD model minimizing the sum of the classification errors is formulated to construct discriminant functions. This model has desirable properties because it is versatile and is immune to the pathologies of some of the earlier mathematical programming models for two-class classification. It is also purely systematic and algorithmic, requiring no ad hoc or trial-and-error judgment from the user. Furthermore, it can be used as the basis to develop other models, such as a multiple-class support vector machine and a mixed integer programming model, for discrimination and classification. An MMD model minimizing the maximum of the classification errors, although of very limited use, is also studied. These models may also be considered as generalizations of mathematical programming formulations for two-class classification. By the same approach, other mathematical...
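For orientation, the two-class models that the MSD and MMD names refer to are commonly written as below. This is a sketch of the textbook two-group forms, not the multiple-class formulation of this paper. The symbols are assumptions introduced here: $x_i$ is the attribute vector of observation $i$, $G_1$ and $G_2$ are the two groups, $w$ is the weight vector, $c$ is the cut-off, and $d_i$ or $d$ are deviations.

```latex
% MSD: minimize the sum of individual deviations
\begin{aligned}
\min_{w,\,c,\,d}\quad & \sum_{i \in G_1 \cup G_2} d_i \\
\text{s.t.}\quad & w^{\top} x_i - d_i \le c, && i \in G_1, \\
                 & w^{\top} x_i + d_i \ge c, && i \in G_2, \\
                 & d_i \ge 0, && i \in G_1 \cup G_2 .
\end{aligned}

% MMD: minimize a single maximum deviation shared by all observations
\begin{aligned}
\min_{w,\,c,\,d}\quad & d \\
\text{s.t.}\quad & w^{\top} x_i - d \le c, && i \in G_1, \\
                 & w^{\top} x_i + d \ge c, && i \in G_2, \\
                 & d \ge 0 .
\end{aligned}
```

As written, both programs admit the trivial solution $w = 0$, $c = 0$ with zero deviations, so practical versions add a normalization constraint. This is one example of the pathologies of earlier two-class models that the abstract mentions.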