A polygonal line algorithm for constructing principal curves

Principal Curves for Nonlinear Feature Extraction and Classification

… Applications of Artificial Neural Networks in …, 1998

We present an improved unbiased algorithm for determining principal curves in high-dimensional spaces, and then propose two novel applications of principal curves to feature extraction and pattern classification: the Principal Curve Feature Extractor (PCFE) and the Principal ...

Learning the Similarity Preserving Principal Curves

2009

The theory of Similarity Preserving Principal Curves (also called Principal Curves with Feature Continuity) has been studied. In this paper, we propose a practical algorithm for learning Similarity Preserving Principal Curves from a general data set. Furthermore, we propose the concept of second-order Similarity Preserving Principal Curves, together with learning algorithms for them. The learning algorithms are used to extract efficient features for data representation tasks. Experimental results demonstrate the capability of the proposed learning model and algorithms.

A k-segments algorithm for finding principal curves

Pattern Recognition Letters, 2002

We propose an incremental method to find principal curves. Line segments are fitted and connected to form polygonal lines (PLs). New segments are inserted until a performance criterion is met. Experimental results illustrate the performance of the method compared to other existing approaches.
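As a rough illustration of the basic building block mentioned above (fitting a line segment to a group of points), the following Python sketch fits one segment along the first principal direction of a point cloud. It is not the paper's procedure: the function name `fit_segment` is hypothetical, and letting the segment span the full range of the projected points is an assumption made only for this example.

```python
import numpy as np

def fit_segment(points):
    """Fit one line segment to a point cloud (illustrative sketch only)."""
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    # The leading right singular vector of the centered data is the
    # first principal direction of the cloud.
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    direction = vt[0]
    # Assumption of this sketch: the segment spans the range of the
    # projections of the points onto that direction.
    t = (points - center) @ direction
    return center + t.min() * direction, center + t.max() * direction

# Usage: four collinear points give a segment joining the two extremes.
start, end = fit_segment([[0, 0], [1, 1], [2, 2], [3, 3]])
```

An incremental method would repeat a fit like this for each group of points and then connect the segments into a polygonal line; how segments are inserted and linked is specific to the paper and is not reproduced here.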

Construction Algorithm of Principal Curves in the Sense of Limit

Lecture Notes in Computer Science, 2005

Principal curves have been defined as self-consistent, smooth, one-dimensional curves which pass through the middle of a multidimensional data set. They are a nonlinear generalization of the first principal component. In this paper, we take a new approach by defining principal curves as continuous curves based on the local tangent space in the sense of limit. It is proved that these new principal curves not only satisfy the self-consistency property but also exist uniquely for any given open covering. Based on the new definition, a new practical algorithm for constructing principal curves is given, and its convergence properties are analyzed. The new construction algorithm is illustrated on some simulated data sets.
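For readers unfamiliar with the self-consistency property mentioned above, the classical definition due to Hastie and Stuetzle can be written as follows. The symbols are not taken from this abstract: f denotes the curve, X the random data vector, and t_f the projection index. The limit-based definition proposed in the paper is not reproduced here.

```latex
% Self-consistency: each curve point is the mean of the data projecting onto it.
\[
  t_f(x) = \sup\bigl\{\, t : \lVert x - f(t) \rVert = \inf_{\tau} \lVert x - f(\tau) \rVert \,\bigr\},
  \qquad
  f(t) = \mathbb{E}\bigl[\, X \mid t_f(X) = t \,\bigr].
\]
```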

An algorithm for generalized principal curves with adaptive topology in complex data sets

1997

Generalized principal curves are capable of representing complex data structures, as they may have branching points or may consist of disconnected parts. For their construction using an unsupervised learning algorithm, the templates need to be structurally adaptive. The present algorithm meets this goal by a combination of a competitive Hebbian learning scheme and a self-organizing map algorithm.

A Length Penalized Probabilistic Principal Curve Algorithm With Applications To Handwritten Digits And Pharmacologic Colon Imaging

2020

The classical Principal Curve algorithm was developed as a nonlinear version of principal component analysis to model curves. However, existing principal curve algorithms with classical penalties, such as smoothness or ridge penalties, lack the ability to deal with complex curve shapes. In this manuscript, we introduce a robust and stable length penalty which solves issues of unnecessary curve complexity, such as self-looping, that arise widely in principal curve algorithms. A novel probabilistic mixture regression model is formulated. A modified penalized EM (Expectation-Maximization) algorithm is applied to the model to obtain the penalized MLE. Two applications of the algorithm were performed. In the first, the algorithm was applied to the MNIST dataset of handwritten digits to find the centerline, not unlike defining a TrueType font. We demonstrate that the centerline can be recovered with this algorithm. In the second application, the algorithm was applied to construct a th...

An Effective Principal Curves Extraction Algorithm for Complex Distribution Dataset

Lecture Notes in Computer Science, 2010

This paper proposes a new method for finding principal curves in datasets with complex distributions. The work is motivated by the poor performance of existing methods on such datasets, which exhibit high curvature, high dispersion, and self-intersection, as in spiral-shaped curves. First, a rudimentary principal graph of the data set is created using a thinning algorithm, and contiguous vertices are then merged. Finally, the fitting-and-smoothing step introduced by Kégl is improved to optimize the principal graph, and Kégl's restructuring step is used to rectify imperfections in the principal graph. Experimental results indicate the effectiveness of the proposed method for finding principal curves in datasets with complex distributions.

Data clustering based on principal curves

Advances in Data Analysis and Classification, 2019

In this contribution we present a new method for data clustering based on principal curves. Principal curves are a nonlinear generalization of principal component analysis and may also be regarded as continuous versions of 1D self-organizing maps. The proposed method implements the k-segment algorithm for principal curve extraction. Then, the method divides the principal curve into two or more curves, according to the number of clusters defined by the user. Next, the distance between each data point and the generated curves is calculated, and each point is classified according to the smallest distance found. The method was applied to nine databases with different dimensionality and number of classes. The results were compared with three clustering algorithms: the k-means algorithm and the 1-D and 2-D self-organizing map algorithms. Experiments show that the method is suitable for clusters with elongated and spherical shapes and achieved significantly better results on some data sets than the other clustering algorithms used in this work.
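The final assignment step described above (label each point by its closest curve) might be sketched as follows, assuming each curve is already available as a polygonal line given by an ordered list of vertices. The function names are hypothetical, and the curve extraction and splitting steps of the method are not shown.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a and b."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    ab = b - a
    denom = ab @ ab
    # Clamp the projection parameter so the closest point stays on the segment.
    t = 0.0 if denom == 0.0 else np.clip((p - a) @ ab / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def point_polyline_distance(p, vertices):
    """Smallest distance from p to any segment of a polygonal line."""
    return min(
        point_segment_distance(p, vertices[i], vertices[i + 1])
        for i in range(len(vertices) - 1)
    )

def assign_clusters(points, curves):
    """Label each point with the index of its nearest polygonal curve."""
    return [
        int(np.argmin([point_polyline_distance(p, c) for c in curves]))
        for p in points
    ]

# Usage: two horizontal curves, three query points.
curves = [[(0, 0), (1, 0)], [(0, 5), (1, 5)]]
labels = assign_clusters([(0.5, 1), (0.5, 4), (2, 0)], curves)
```

Clamping the projection to the segment matters for points that lie beyond an endpoint, such as the third query point, whose nearest curve point is a vertex and not an interior point.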

Another Look at Principal Curves and Surfaces

Journal of Multivariate Analysis, 2001

Principal curves have been defined as smooth curves passing through the "middle" of a multidimensional data set. They are nonlinear generalizations of the first principal component, a characterization of which is the basis of the definition of principal curves. We establish a new characterization of the first principal component and base our new definition of a principal curve on this property. We introduce the notion of principal oriented points and we prove the existence of principal curves passing through these points. We extend the definition of principal curves to multivariate data sets and propose an algorithm to find them. The new notions lead us to generalize the definition of total variance. Successive principal curves are recursively defined from this generalization. The new methods are illustrated on simulated and real data sets.