A Novel Hybrid Linear/Nonlinear Classifier for Two-Class Classification: Theory, Algorithm, and Applications

Linear vs. quadratic discriminant analysis classifier: a tutorial

The aim of this paper is to collect in one place the basic background needed to understand the discriminant analysis (DA) classifier, so that readers at all levels can gain a better understanding of DA and learn how to apply this classifier in different applications. The paper starts with basic mathematical definitions of the DA steps, accompanied by visual explanations. In a step-by-step approach, a number of numerical examples illustrate how to calculate the discriminant functions and decision boundaries, both when the covariance matrices of all classes are common and when they are not. The singularity problem of DA is explained, and some state-of-the-art solutions to this problem are highlighted with numerical illustrations. An experiment is conducted to compare the linear and quadratic classifiers and to show how to solve the singularity problem when high-dimensional datasets are used. Reference to this paper should be made as follows: Tharwat, A. (2016) 'Linear vs. quadratic discriminant analysis classifier: a tutorial', Int.
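The LDA/QDA distinction the tutorial describes comes down to whether a single pooled covariance matrix is shared by all classes (giving a linear decision boundary) or each class keeps its own (giving a quadratic one). A minimal NumPy sketch of both Gaussian discriminant functions on synthetic two-class data; all names and data here are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two synthetic Gaussian classes with different covariance structure
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], 200)
X1 = rng.multivariate_normal([2, 2], [[0.5, -0.2], [-0.2, 0.5]], 200)

def fit_da(Xs, pooled):
    """Estimate per-class means, priors, and covariance(s).
    pooled=True gives LDA (shared covariance); False gives QDA."""
    means = [X.mean(axis=0) for X in Xs]
    covs = [np.cov(X, rowvar=False) for X in Xs]
    if pooled:
        n = [len(X) for X in Xs]
        S = sum((ni - 1) * Si for ni, Si in zip(n, covs)) / (sum(n) - len(Xs))
        covs = [S] * len(Xs)
    priors = np.array([len(X) for X in Xs], dtype=float)
    priors /= priors.sum()
    return means, covs, priors

def predict(x, means, covs, priors):
    """Assign x to the class with the largest Gaussian discriminant score."""
    scores = []
    for m, S, p in zip(means, covs, priors):
        d = x - m
        _, logdet = np.linalg.slogdet(S)
        scores.append(-0.5 * logdet - 0.5 * d @ np.linalg.solve(S, d) + np.log(p))
    return int(np.argmax(scores))

means, covs, priors = fit_da([X0, X1], pooled=True)   # LDA
print(predict(np.array([0.1, -0.2]), means, covs, priors))
```

Setting `pooled=False` switches the same code to QDA: each class's own covariance (and log-determinant) then enters the discriminant, which is what bends the decision boundary into a quadratic.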

Linear discriminant analysis and support vector machines for classifying breast cancer

IAES International Journal of Artificial Intelligence, 2021

Breast cancer is an abnormal cell growth in the breast that keeps changing uncontrollably and forms a tumor. The tumor can be benign or malignant: a benign tumor is not dangerous to health and not cancerous, while a malignant tumor is potentially dangerous to health and cancerous. A specialist doctor diagnoses the patient and gives treatment based on whether the diagnosis is benign or malignant. Machine learning offers time efficiency in identifying cancer cells: the machine learns patterns from the information in the dataset. Support vector machines and linear discriminant analysis are common methods for cancer classification. In this study, linear discriminant analysis and support vector machines are compared in terms of accuracy, sensitivity, specificity, and F1-score, to determine which method is better at classifying a breast cancer dataset. The results show that the support vector machine performs better than linear discriminant analysis, as can be seen from its accuracy of 98.77%.
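The four evaluation metrics this abstract compares (accuracy, sensitivity, specificity, F1-score) all derive from the binary confusion matrix. A small illustrative helper, assuming the convention 1 = malignant (positive) and 0 = benign (negative); the labels below are made up, not the paper's data:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 for binary labels
    (1 = malignant/positive, 0 = benign/negative)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # malignant caught
    tn = np.sum((y_true == 0) & (y_pred == 0))  # benign correctly cleared
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false alarm
    fn = np.sum((y_true == 1) & (y_pred == 0))  # missed malignancy
    acc = (tp + tn) / len(y_true)
    sens = tp / (tp + fn)            # recall on malignant cases
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    return acc, sens, spec, f1

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc, sens, spec, f1 = binary_metrics(y_true, y_pred)
print(acc, sens, spec, f1)  # all 0.75 for this toy example
```

Reporting sensitivity and specificity alongside accuracy matters here because medical datasets are often class-imbalanced, and a high accuracy alone can hide missed malignant cases.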

Roc Curve Analysis of Different Hybrid Feature Descriptors Using Multi Classifiers

ASEAN Engineering Journal

The tremendous success of machine learning algorithms at pattern recognition has created interest in new inventions. A key advantage of machine learning in the era of big data is that significant hierarchical relationships within the data can be discovered algorithmically rather than through handcrafted features. In this study, a Convolutional Neural Network (CNN) is used as a feature descriptor in pulmonary malignancy prediction. Various feature descriptors, such as the Histogram of Oriented Gradients (HOG), Extended Histogram of Oriented Gradients (EXHOG), and Local Binary Pattern (LBP), are analyzed with classifiers such as Random Forest (RF), Decision Tree (DT), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM) on Computed Tomography (CT) images. The phenotype features of pulmonary nodules are important cues for identification; nodule solidity, in particular, is an important cue for white-blob area identification. The method is analyzed on the Lung Image Database Consortium (LIDC) dataset. Receiver Operating Characteristics...

Principal component analysis, classifier complexity, and robustness of sonographic breast lesion classification

Medical Imaging 2009: Computer-Aided Diagnosis, 2009

We investigated three classifiers for the task of distinguishing between benign and malignant breast lesions. Classification performance was measured in terms of area under the ROC curve (AUC value). We compared linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and a Bayesian neural net (BNN) with 5 hidden units. For each lesion, 46 image features were extracted, and principal component analysis (PCA) of these features was used as classifier input. For each classifier, the optimal number of principal components was determined by performing PCA within each step of a leave-one-case-out protocol for the training dataset (1125 lesions, 14% cancer prevalence) and determining which number of components maximized the AUC value. Subsequently, each classifier was trained on the training dataset and applied 'cold turkey' to an independent test set from a different population (341 lesions, 30% cancer prevalence). The optimal number of principal components for LDA was 24, accounting for 97% of the variance in the image features. For QDA and BNN, these numbers were 5 (70%) and 15 (93%), respectively. The LDA, QDA, and BNN obtained AUC values of 0.88, 0.85, and 0.91, respectively, in the leave-one-case-out analysis. In the independent test (AUCs of 0.88, 0.76, and 0.82), only LDA achieved performance identical to that for the training set (lower bound of 95% non-inferiority interval: -0.0067), while the others performed significantly worse (p-values << 0.05). While the more complex BNN classifier outperformed the others in the leave-one-case-out analysis of a large dataset, LDA was the robust best performer in the independent test.
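The component-selection step described above picks the smallest number of principal components that explains a target fraction of the feature variance. A plain SVD-based PCA sketch of that step; the data is synthetic, and only the 97% threshold mirrors the LDA setting reported in the abstract:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 46))   # 300 lesions x 46 image features (synthetic)
X[:, :5] *= 10                   # make a few directions dominate the variance

# PCA via SVD of the centered data matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)  # variance fraction per component

# Smallest number of components reaching 97% of the total variance
k = int(np.searchsorted(np.cumsum(explained), 0.97) + 1)
Z = Xc @ Vt[:k].T                # projected features used as classifier input
print(k, Z.shape)
```

Note that in the paper the selection is done inside each leave-one-case-out step, so the PCA (and the chosen k) never sees the held-out case; the sketch above shows only the variance-threshold mechanics.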

IRJET- Mammogram Images Classification using Linear Discriminant Analysis Technique

IRJET, 2020

Breast cancer represents a significant percentage of cancer deaths among women all over the world. Studies have shown that the possibility of curing breast cancer can increase by up to 40% if it is detected at an early stage. The purpose of this paper is to apply a machine learning approach to mammogram image classification, classifying each mammogram image as either a benign breast tumor or a malignant one, and thereby help doctors detect the disease in its early stages. This paper uses Linear Discriminant Analysis (the LDA classifier) with six statistical features extracted from the MIAS dataset; the dataset is split into two parts, training and testing. After the classifier is trained on the training set, it is tested on the test set and its accuracy is determined from the confusion matrix. The paper comprises five phases, starting with collecting images, then preprocessing, feature extraction, and classification, and ending with testing and evaluation. The proposed method empirically achieves an accuracy of 0.81 when 85% of the dataset is used for training and 15% for testing. Given the increasing application of ML methods in breast cancer research, we present here an applied study of the Linear Discriminant Analysis method for the classification of mammogram images. Consequently, this study can assist in building computer-aided diagnosis (CAD) systems for the early detection of breast cancer, contributing to an important field of biomedical research that may reduce the risk of late-detected breast cancer.

Computer aided diagnosis in digital mammography using combined support vector machine and linear discriminant analysis classification

2009

This paper presents a computer-aided diagnosis (CAD) system based on a combined support vector machine (SVM) and linear discriminant analysis (LDA) classifier for the detection and classification of breast cancer in digital mammograms. The proposed system has been implemented in four stages: (a) region of interest (ROI) selection of 32×32-pixel regions, which identifies suspicious areas; (b) a feature extraction stage that locally processes each ROI to compute the important features of each breast cancer candidate; (c) a feature selection stage using the forward stepwise linear regression (FSLR) method; and (d) a classification stage, which first classifies patterns as normal or abnormal and then classifies abnormal ones as benign or malignant. In the classification stage, a new method based on a combined SVM and LDA classifier (SVM/LDA) was used and compared with other classifiers such as SVM, LDA, and fuzzy C-means (FCM). The proposed system was shown to have large potential for breast cancer diagnosis in digital mammograms.

New statistical learning theory paradigms adapted to breast cancer diagnosis/classification using image and non-image clinical data

International Journal of Functional Informatics and Personalised Medicine, 2008

The automated decision paradigms presented in this work address the false positive (FP) biopsy occurrence in diagnostic mammography. An EP/ES stochastic hybrid and two kernelized Partial Least Squares (K-PLS) paradigms were investigated in the following studies:
• methodology performance comparisons
• automated diagnostic accuracy assessments with two data sets.
The findings showed that:
• the new hybrid produced comparable results more rapidly
• the new K-PLS paradigms train and operate essentially in real time for the data sets studied.

A fast kernel-based nonlinear discriminant analysis for multi-class problems

Pattern Recognition, 2006

Nonlinear discriminant analysis may be transformed into the form of kernel-based discriminant analysis, so the corresponding discriminant direction can be solved via linear equations. From the view of feature space, this nonlinear discriminant analysis is still a linear method, and it is provable that in feature space the method is equivalent to Fisher discriminant analysis. We consider that one linear combination of a subset of the training samples, called "significant nodes", can replace the total training samples to express the corresponding discriminant vector in feature space to some extent. In this paper, an efficient algorithm is proposed to determine the "significant nodes" one by one. The principle for determining "significant nodes" is simple and reasonable, and the resulting algorithm can be carried out at acceptable computational cost. Classification can then be implemented using only the kernel functions between test samples and the "significant nodes". The proposed method is called the fast kernel-based nonlinear method (FKNM). Notably, the number of "significant nodes" may be much smaller than the number of training samples. As a result, for two-class classification problems, the FKNM is much more efficient than the naive kernel-based nonlinear method (NKNM). The FKNM can also be applied to multi-class problems via two approaches: one-against-the-rest and one-against-one. Although there is a view that one-against-one is superior to one-against-the-rest in classification efficiency, it seems that for the FKNM one-against-the-rest is more efficient than one-against-one. Experiments on benchmark and real datasets illustrate that, for two-class and multi-class classification, the FKNM is effective, feasible, and highly efficient.
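The core idea above is that a discriminant in feature space can be expressed through kernel values between a test sample and a small set of training points, so classification cost scales with the number of nodes rather than the full training set. A toy NumPy sketch in that spirit; it uses a kernel ridge fit over an arbitrary node subset as a stand-in, and is not the paper's node-selection algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
# Two nonlinearly separated classes: inner cluster (-1) vs outer ring (+1)
r0 = rng.normal(0.5, 0.1, 100)
r1 = rng.normal(2.0, 0.1, 100)
t = rng.uniform(0, 2 * np.pi, 200)
r = np.concatenate([r0, r1])
X = np.c_[r * np.cos(t), r * np.sin(t)]
y = np.concatenate([-np.ones(100), np.ones(100)])

def rbf(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Use a small subset of training points as "nodes": each test sample then
# needs kernel values against only these 20 points, not all 200.
nodes = X[::10]
K = rbf(X, nodes)
alpha = np.linalg.solve(K.T @ K + 1e-3 * np.eye(len(nodes)), K.T @ y)

def score(x):
    """Discriminant value: a kernel expansion over the nodes only."""
    return (rbf(np.atleast_2d(x), nodes) @ alpha).item()

preds = np.sign([score(x) for x in X])
print("training accuracy:", np.mean(preds == y))
```

A linear boundary cannot separate a ring from its center, which is why the kernel expansion is needed here; the paper's contribution is choosing the nodes greedily so that far fewer than all training samples suffice.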