Hypergraph based semi-supervised learning algorithms applied to speech recognition problem: a novel approach

Un-normalized hypergraph p-Laplacian based semi-supervised learning methods

2018

Most network-based machine learning methods assume that the labels of two adjacent samples in the network are likely to be the same. However, the pairwise relationship between samples alone is an incomplete description of the data: it misses the information that a group of samples showing very similar patterns tends to share the same label. A natural way to overcome this information loss is to represent the feature dataset of samples as a hypergraph. In this paper, we therefore present the un-normalized hypergraph p-Laplacian semi-supervised learning methods and apply them to the zoo dataset and the tiny version of the 20 newsgroups dataset. Experimental results show that the accuracy of these un-normalized hypergraph p-Laplacian based semi-supervised learning methods is significantly greater than that of the un-normalized hypergraph Laplacian based semi-supervised learning method (the current state of the art method h...
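To make the hypergraph representation described in the abstract above concrete, here is a minimal sketch in plain Python. It rests on assumptions that the abstract does not state: each (feature, value) pair of a categorical dataset becomes one hyperedge, all hyperedge weights are 1, and the un-normalized hypergraph Laplacian takes the commonly used form L = D_v - H W D_e^{-1} H^T. The function names and the toy samples are hypothetical, and the p-Laplacian step itself is not shown.

```python
# Illustrative sketch only: function names and toy data are hypothetical.

def incidence_from_categorical(rows):
    """Build a vertex-by-hyperedge incidence matrix H.

    Assumption: one hyperedge per (feature index, value) pair, so every
    sample belongs to exactly one hyperedge per feature.
    """
    edges = sorted({(j, v) for row in rows for j, v in enumerate(row)})
    col = {edge: k for k, edge in enumerate(edges)}
    H = [[0] * len(edges) for _ in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            H[i][col[(j, v)]] = 1
    return H, edges


def unnormalized_hypergraph_laplacian(H, weights=None):
    """Return L = D_v - H W D_e^{-1} H^T as a list of lists."""
    n, m = len(H), len(H[0])
    w = weights if weights is not None else [1.0] * m
    # Hyperedge degree: number of vertices in the hyperedge.
    edge_deg = [sum(H[v][e] for v in range(n)) for e in range(m)]
    # Vertex degree: total weight of the hyperedges containing the vertex.
    vert_deg = [sum(w[e] * H[v][e] for e in range(m)) for v in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            shared = sum(w[e] * H[u][e] * H[v][e] / edge_deg[e] for e in range(m))
            L[u][v] = (vert_deg[u] if u == v else 0.0) - shared
    return L


# Toy samples with three categorical features: (hair, eggs, legs).
samples = [(1, 0, 4), (1, 0, 4), (0, 1, 2), (0, 1, 0)]
H, edges = incidence_from_categorical(samples)
L = unnormalized_hypergraph_laplacian(H)
```

Under these assumptions every row of L sums to zero, and two samples that share all feature values, such as the first two, are coupled by the most negative off-diagonal entry.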

The Un-normalized Graph p-Laplacian based Semi-supervised Learning Method and Speech Recognition Problem

2017

Speech recognition is a classical problem in the pattern recognition research field. However, only a few graph based machine learning methods have been applied to it. In this paper, we propose the un-normalized graph p-Laplacian semi-supervised learning methods and apply them to the speech network constructed from the MFCC speech dataset to predict the labels of all speech samples in the network. These methods are based on the assumption that the labels of two adjacent speech samples in the network are likely to be the same. The experiments show that the un-normalized graph p-Laplacian semi-supervised learning methods are at least as good as the current state of the art method (the un-normalized graph Laplacian based semi-supervised learning method) and often achieve better classification sensitivity.
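The assumption that adjacent samples share labels can be illustrated with the baseline named in the abstract above, the un-normalized graph Laplacian method. The sketch below assumes a standard formulation that the abstract itself does not spell out: minimize f^T L f + gamma * ||f - y||^2 with L = D - W, which gives the linear system (L + gamma I) f = gamma y. It is solved here with Jacobi iterations in plain Python. The function name, the parameter gamma, and the toy graph are hypothetical, and the p-Laplacian variant is not shown.

```python
# Illustrative sketch only: a standard Laplacian-regularized formulation is assumed.

def laplacian_ssl(W, y, gamma=1.0, iters=200):
    """Approximate the solution of (L + gamma*I) f = gamma*y, with L = D - W.

    W is a symmetric weight matrix with a zero diagonal; y holds +1/-1 for
    labeled samples and 0 for unlabeled ones. Jacobi iterations converge
    because the system is strictly diagonally dominant for gamma > 0.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    f = list(y)
    for _ in range(iters):
        f = [
            (gamma * y[i] + sum(W[i][j] * f[j] for j in range(n))) / (deg[i] + gamma)
            for i in range(n)
        ]
    return f


# Toy network: two triangles (0,1,2) and (3,4,5) joined by a weak edge 2-3.
W = [[0.0] * 6 for _ in range(6)]
for a, b, weight in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                     (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[a][b] = W[b][a] = float(weight)

y = [1, 0, 0, 0, 0, -1]          # one labeled sample per class
scores = laplacian_ssl(W, y)
labels = [1 if s > 0 else -1 for s in scores]
```

On this toy network the unlabeled samples take the sign of the labeled sample in their own triangle, which is the behavior the adjacency assumption predicts.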

Un-normalized and Random Walk Hypergraph Laplacian Un-supervised Learning

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2015

Most network-based clustering methods are based on the assumption that the labels of two adjacent vertices in the network are likely to be the same. However, the pairwise relationship between vertices alone is incomplete: it misses the information that a group of vertices showing very similar patterns tends to share the same label. A natural way to overcome this information loss is to represent the given data as a hypergraph. In this paper, the un-normalized and random walk hypergraph Laplacian based un-supervised learning methods are therefore introduced. Experimental results show that the accuracy of these two hypergraph Laplacian based un-supervised learning methods is greater than that of the symmetric normalized graph Laplacian based un-supervised learning method (i.e. the baseline method of this paper) applied to the simple graph created from the incidence matrix of the hypergraph.

Combinatorial and Random Walk Hypergraph Laplacian Eigenmaps

International Journal of Machine Learning and Computing, 2015

Most network-based machine learning methods are based on the assumption that the labels of two adjacent vertices in the network are likely to be the same. However, the pairwise relationship between vertices alone is incomplete: it misses the information that a group of vertices showing very similar patterns tends to share the same label. A natural way to overcome this information loss is to represent the given data as a hypergraph. This representation is not without cost, however. The number of hyper-edges may be large, which leads to high time complexity when clustering or classification methods are applied to the hypergraph dataset. There is thus a need for dimensionality reduction methods for hypergraph datasets. In this paper, the un-normalized and random walk hypergraph Laplacian Eigenmaps are introduced. Experimental results show that the accuracy of these two hypergraph Laplacian Eigenmaps combined with the graph based semi-supervised learning method is greater than that of the graph based semi-supervised learning method alone (i.e. the baseline method of this paper) applied to the original hypergraph datasets.

Noise-robust classification with hypergraph neural network

Indonesian Journal of Electrical Engineering and Computer Science, 2021

This paper presents a novel version of the hypergraph neural network method and uses it to solve the noisy label learning problem. First, we apply the PCA dimensionality reduction technique to the feature matrices of the image datasets in order to reduce the "noise" and the redundant features in these matrices and to reduce the time needed to construct the hypergraph of the hypergraph neural network method. Then, the classic graph based semi-supervised learning method, the classic hypergraph based semi-supervised learning method, the graph neural network, the hypergraph neural network, and our proposed hypergraph neural network are employed to solve the noisy label learning problem. The accuracies of these five methods are evaluated and compared. Experimental results show that the hypergraph neural network methods achieve the best performance when the noise level increases. Moreover, the hypergraph neural network methods are at least as good as the graph neural network.

Directed Hypergraph Neural Network

Journal of Advanced Research in Dynamical and Control Systems

To deal with irregular data structures, many data scientists have developed graph convolutional neural networks. However, this work has concentrated primarily on deep neural network methods for undirected graphs. In this paper, we present a novel neural network method for directed hypergraphs. In other words, we develop not only the novel directed hypergraph neural network method but also the novel directed hypergraph based semi-supervised learning method. These methods are employed to solve the node classification task. The two datasets used in the experiments are the cora and the citeseer datasets. Among the classic directed graph based semi-supervised learning method, the novel directed hypergraph based semi-supervised learning method, and the novel directed hypergraph neural network method applied to this node classification task, the novel directed hypergraph neural network achieves the highest accuracies.

Hyperparameter and Kernel Learning for Graph Based Semi-Supervised Classification

2005

There have been many graph-based approaches for semi-supervised classification. One problem is that of hyperparameter learning: performance depends greatly on the hyperparameters of the similarity graph, the transformation of the graph Laplacian and the noise model. We present a Bayesian framework for learning hyperparameters for graph-based semi-supervised classification. Given some labeled data, which can contain inaccurate labels, we pose the semi-supervised classification as an inference problem over the unknown labels. Expectation Propagation is used for approximate inference and the mean of the posterior is used for classification. The hyperparameters are learned using EM for evidence maximization. We also show that the posterior mean can be written in terms of the kernel matrix, providing a Bayesian classifier to classify new points. Tests on synthetic and real datasets show cases where there are significant improvements in performance over the existing approaches.

Un-Normalized Graph P-Laplacian Semi-Supervised Learning Method Applied to Cancer Classification Problem

Journal of Automation and Control Engineering, 2015

A successful classification of different tumor types is essential for successful treatment of cancer. However, most prior cancer classification methods are clinical-based and have inadequate diagnostic ability. Cancer classification using gene expression data is very important in cancer diagnosis and drug discovery. The introduction of DNA microarray techniques has made simultaneous monitoring of thousands of gene expressions possible. With this abundance of gene expression data, researchers now have the opportunity to perform cancer classification using gene expression data. In recent years, many machine learning methods have been proposed for this task, such as clustering-based methods, the k-nearest neighbor method, the artificial neural network method, and the support vector machine method, to name a few. In this paper, we present the un-normalized graph p-Laplacian semi-supervised learning methods. These methods are applied to the patient-patient network constructed from the gene expression data to predict the tumor types of all patients in the network. They are based on the assumption that the labels of two adjacent patients in the network are likely to be the same. The experiments show that the un-normalized graph p-Laplacian semi-supervised learning methods are at least as good as the current state of the art network-based method (the un-normalized graph Laplacian based semi-supervised learning method) and often achieve better classification accuracy.

Generalized Optimization Framework for Graph-based Semi-supervised Learning

Proceedings of the 2012 SIAM International Conference on Data Mining, 2012

We develop a generalized optimization framework for graph-based semi-supervised learning. The framework gives as particular cases the Standard Laplacian, Normalized Laplacian and PageRank based methods. We also provide a new probabilistic interpretation based on random walks and characterize the limiting behaviour of the methods. The random walk based interpretation allows us to explain differences between the performances of methods with different smoothing kernels. It appears that the PageRank based method is robust with respect to the choice of the regularization parameter and the labelled data. We illustrate our theoretical results with two realistic datasets that pose different challenges: the Les Miserables character social network and the Wikipedia hyper-link graph. The graph-based semi-supervised learning classifies the Wikipedia articles with very good precision and perfect recall employing only the information about the hyper-text links.
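The way one framework can cover the Standard Laplacian, Normalized Laplacian and PageRank based methods can be sketched as follows. The parametrization is an assumption, since the abstract above does not give it: the classification functions are taken as the fixed point of F = alpha * D^{-sigma} W D^{sigma-1} F + (1 - alpha) * Y with alpha = 2 / (2 + mu), where sigma = 1, 1/2 and 0 are assumed to correspond to the Standard Laplacian, Normalized Laplacian and PageRank based methods respectively. The function names and the toy graph are hypothetical.

```python
# Illustrative sketch only: the sigma parametrization is an assumption.

def generalized_ssl(W, Y, sigma, mu=1.0, iters=500):
    """Iterate F <- alpha * D^-sigma W D^(sigma-1) F + (1 - alpha) * Y.

    W is a symmetric weight matrix with no isolated vertices; Y is an
    n-by-k matrix with Y[i][c] = 1 if sample i is labeled with class c.
    """
    n, k = len(W), len(Y[0])
    d = [sum(row) for row in W]
    alpha = 2.0 / (2.0 + mu)
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [
            [
                alpha * sum(d[i] ** -sigma * W[i][j] * d[j] ** (sigma - 1) * F[j][c]
                            for j in range(n))
                + (1 - alpha) * Y[i][c]
                for c in range(k)
            ]
            for i in range(n)
        ]
    return F


def predict(F):
    """Assign each sample to the class with the largest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in F]


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by a weak edge 2-3.
W = [[0.0] * 6 for _ in range(6)]
for a, b, weight in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                     (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[a][b] = W[b][a] = float(weight)

Y = [[1, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 1]]   # one label per class
results = {sigma: predict(generalized_ssl(W, Y, sigma)) for sigma in (0.0, 0.5, 1.0)}
```

On this small, symmetric toy graph all three settings of sigma recover the two triangles as the two classes. The differences between the methods discussed in the abstract would only become visible on graphs with more heterogeneous degrees.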

Semi-supervised learning with regularized Laplacian

Optimization Methods and Software

We study a semi-supervised learning method based on the similarity graph and the Regularized Laplacian. We give a convenient optimization formulation of the Regularized Laplacian method and establish its various properties. In particular, we show that the kernel of the method can be interpreted in terms of discrete and continuous time random walks and possesses several important properties of proximity measures. Both optimization and linear algebra methods can be used for efficient computation of the classification functions. We demonstrate on numerical examples that the Regularized Laplacian method is competitive with the other state of the art semi-supervised learning methods.