Robust Latent Factor Analysis for Precise Representation of High-Dimensional and Sparse Data
Related papers
A novel latent factor model for recommender system
Journal of Information Systems and Technology Management, 2016
Matrix factorization (MF) has evolved into one of the better practices for handling sparse data in the field of recommender systems. Funk singular value decomposition (Funk-SVD) is a variant of MF that stands as a state-of-the-art method and enabled winning the Netflix Prize competition. The method is widely used, with modifications, in present-day research on recommender systems. Because the number of data points can grow at very high velocity, it is prudent to devise newer methods that handle such data more accurately and more efficiently than Funk-SVD. In view of the growing data points, I propose a latent factor model that caters to both accuracy and efficiency by reducing the number of latent features of either users or items, making it less complex than Funk-SVD, where the numbers of latent features of users and items are equal and often larger. A comprehensive empirical evaluation on two publicly available datasets, Amazon and ML-100k, reveals that the proposed methods achieve comparable accuracy with lower complexity than Funk-SVD.
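For orientation, the Funk-SVD baseline that this abstract compares against can be sketched as stochastic gradient descent over the observed ratings only. The block below is a minimal illustration in plain Python, not the paper's proposed model (the abstract does not say how the reduced latent features are used). The function names, the toy ratings, and the hyperparameters `lr`, `reg`, and `epochs` are assumptions made for the example.

```python
import math
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root-mean-square error over observed (user, item, rating) triples."""
    squared = [(r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))


def init_factors(n_rows, k, rng):
    """Small random latent factors, one row of length k per user or item."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_rows)]


def funk_svd(ratings, P, Q, lr=0.02, reg=0.02, epochs=500):
    """Update P and Q in place with SGD; missing entries are never touched."""
    k = len(P[0])
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Hypothetical toy data: 3 users, 3 items, 6 observed ratings.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
rng = random.Random(0)
P = init_factors(3, 2, rng)
Q = init_factors(3, 2, rng)
rmse_before = rmse(ratings, P, Q)
funk_svd(ratings, P, Q)
rmse_after = rmse(ratings, P, Q)
```

Here both factor matrices share the same rank `k`, which is the equal-dimension setting the abstract attributes to Funk-SVD.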
Randomized Latent Factor Model for High-dimensional and Sparse Matrices from Industrial Applications
IEEE/CAA Journal of Automatica Sinica, 2019
Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may consume many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis to make the resulting model represent an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF achieves significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desired for industrial applications demanding highly efficient models.
Elastic-net regularized latent factor analysis-based models for recommender systems
Neurocomputing, 2018
Latent factor analysis (LFA)-based models are highly efficient in recommender systems. The problem of LFA is defined on high-dimensional and sparse (HiDS) matrices corresponding to relationships among numerous entities in industrial applications. It is ill-posed without a unique and optimal solution, making regularization vital in improving the generality of an LFA-based model. Current models mostly adopt l2-norm-based regularization, which cannot regularize the latent factor distributions. To address this issue, this work applies elastic-net-based regularization to an LFA-based model, thereby achieving an elastic-net regularized latent factor analysis-based (ERLFA) model. We further adopt two efficient learning algorithms, i.e., forward-looking sub-gradients and forward-backward splitting, and stochastic proximal gradient descent, to train the desired latent factors in an ERLFA-based model, resulting in two novel ERLFA-based models relying on different learning schemes. Experimental results on four large industrial datasets show that by regularizing the latent factor distribution, the proposed ERLFA-based models are able to achieve high prediction accuracy for missing data of an HiDS matrix without additional computational burden.
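To make the elastic-net idea concrete, the sketch below shows the standard proximal operator of the penalty `l1*|x| + 0.5*l2*x**2` and one stochastic proximal gradient step on a single observed entry. It is an illustration under assumptions, not the ERLFA authors' algorithm: the function names, the penalty scaling, and the simultaneous update of both vectors are choices made here for the example.

```python
import math


def elastic_net_prox(v, step, l1, l2):
    """Proximal operator of step * (l1*|x| + 0.5*l2*x**2), evaluated at v."""
    shrunk = max(abs(v) - step * l1, 0.0)   # soft-thresholding from the l1 term
    return math.copysign(shrunk, v) / (1.0 + step * l2)  # scaling from the l2 term


def prox_sgd_step(p, q, r, lr, l1, l2):
    """One proximal SGD step for a user vector p and item vector q on rating r.

    A gradient step on the squared error is followed by the elastic-net prox,
    which can set small latent factors exactly to zero.
    """
    err = r - sum(a * b for a, b in zip(p, q))
    new_p = [elastic_net_prox(a + lr * err * b, lr, l1, l2) for a, b in zip(p, q)]
    new_q = [elastic_net_prox(b + lr * err * a, lr, l1, l2) for a, b in zip(p, q)]
    return new_p, new_q
```

With `l1 = 0` the step reduces to ordinary l2-regularized shrinkage; a positive `l1` is what introduces sparsity into the factors.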
Latent Factor Model For Collaborative Filtering
IRJET, 2022
The restrictions of neighborhood-based Collaborative Filtering (CF) methods including scalability and inadequate information present impediments to efficient recommendation systems. These strategies result in less precision, accuracy and consume a huge amount of time in recommending items. Model-based matrix factorization is an effective approach used to overcome the previously mentioned limitations of CF. In this paper, we are going to discuss a matrix factorization technique called singular value decomposition, which would help us model our recommendation system and result in good performance.
Comparing the staples in latent factor models for recommender systems
Proceedings of the 29th Annual ACM Symposium on Applied Computing, 2014
Since the Netflix Prize competition, latent factor models (LFMs) have become the comparison "staples" for many of the recent recommender methods. The performance improvement of LFMs over baseline approaches, however, hovers at only low percentage numbers. Therefore, it is time for a better understanding of their real power beyond the overall RMSE (root-mean-square error), which, as it happens, lies in a very compressed range and offers little chance for deeper insight. This paper provides a detailed experimental study of the performance of classical staple LFMs on a classical dataset, MovieLens 1M, that sheds light on a much more pronounced excellence of LFMs for particular categories of users and items, for RMSE and other measures. In particular, LFMs exhibit surprising and excellent advantages when handling several difficult user and item categories. By comparing the distributions of the test and predicted ratings, we show that the performance of LFMs is influenced by the rating distribution. We then propose a method to estimate the performance of LFMs for a given rating dataset. Also, we provide a very simple, open-source library that implements staple LFMs, achieving performance similar to some very recent (2013) developments in LFMs while being more transparent than some other libraries in wide use.
Large-scale recommender system with compact latent factor model
Expert Systems With Applications, 2016
This work devises a factorization model called compact latent factor model, in which we propose a compact representation to consider query, user and item in the model. The blend of information retrieval and collaborative filtering is a typical setting in many applications. The proposed model can incorporate various features into the model, and this work demonstrates that the proposed model can incorporate context-aware and content-based features to handle context-aware recommendation and cold-start problems, respectively. Besides recommendation accuracy, a critical problem concerning the computational cost emerges in practical situations. To tackle this problem, this work uses a buffer update scheme to allow the proposed model to process data incrementally, and provide a means to use historical data instances. Meanwhile, we use stochastic gradient descent algorithm along with sampling technique to optimize ranking loss, giving a competitive performance while considering scalability and deployment issues. The experimental results indicate that the proposed algorithm outperforms other alternatives on four datasets.
Cosine Based Latent Factor Model for Precision Oriented Recommendation
International Journal of Advanced Computer Science and Applications, 2016
Recommender systems suggest a list of interesting items to users based on their prior purchase or browsing behaviour on e-commerce platforms. Continuing research in recommender systems has primarily focused on developing algorithms for the rating prediction task. However, most e-commerce platforms provide a 'top-k' list of interesting items for every user. In line with this idea, the paper proposes a novel machine learning algorithm to predict a list of 'top-k' items by optimizing the latent factors of users and items with the mapped scores from ratings. The basic idea is to learn latent factors based on the cosine similarity between user and item latent features, which is then used to predict the scores of unseen items for every user. Comprehensive empirical evaluations on publicly available benchmark datasets reveal that the proposed model outperforms the state-of-the-art algorithms in recommending good items to a user.
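The scoring side of this idea can be sketched briefly: given latent vectors that have already been learned, rank a user's unseen items by cosine similarity and keep the best k. The block covers only this prediction step, not the paper's training procedure, and the names `cosine`, `top_k`, and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(user_vec, item_vecs, seen, k):
    """Return the k unseen item ids with the highest cosine score for this user."""
    scores = {item: cosine(user_vec, vec)
              for item, vec in item_vecs.items() if item not in seen}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical latent vectors for four items and one user.
item_vecs = {"a": [2.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "d": [1.0, 0.1]}
recommended = top_k([1.0, 0.0], item_vecs, seen={"d"}, k=2)
```

Because cosine similarity ignores vector length, item "a" scores the same as any other item pointing in the same direction as the user vector.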
A Coordinate Descent Method for Robust Matrix Factorization and Applications
SIAM Undergraduate Research Online, 2016
Matrix factorization methods are widely used for extracting latent factors for low rank matrix completion and rating prediction problems arising in recommender systems of on-line retailers. Most of the existing models are based on L2 fidelity (quadratic functions of factorization error). In this work, a coordinate descent (CD) method is developed for matrix factorization under L1 fidelity so that the related minimization is done one variable at a time and the factorization error is sparsely distributed. In low rank random matrix completion and rating prediction of MovieLens-100k datasets, the CDL1 method shows remarkable stability and accuracy under gross corruption of training (observation) data while the L2 fidelity based methods rapidly deteriorate. A closed form analytical solution is found for the one-dimensional L1-fidelity subproblem, and is used as a building block of CDL1 algorithm whose convergence is analyzed. The connection with the well-known convex method, the robust principal component analysis (RPCA), is made. A comparison with RPCA on recovering low rank Gaussian matrices under sparse and independent Gaussian noise shows that CDL1 maintains accuracy at much lower sampling ratios (from much fewer observed entries) than that for RPCA.
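The one-variable subproblem mentioned above has a well-known closed form in its plain, unregularized version: minimizing sum_j |a_j - x*b_j| over x is solved by a weighted median of the ratios a_j/b_j with weights |b_j|. The sketch below shows that standard result. It is an assumption that this matches the subproblem in the paper, which may include further terms, and the function names are hypothetical.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value
    raise ValueError("need at least one positive weight")


def l1_coordinate_update(residuals, coeffs):
    """Minimize sum_j |residuals[j] - x * coeffs[j]| over the scalar x.

    Terms with a zero coefficient do not depend on x and are skipped.
    """
    ratios = [a / b for a, b in zip(residuals, coeffs) if b != 0]
    weights = [abs(b) for b in coeffs if b != 0]
    return weighted_median(ratios, weights)
```

For residuals `[1.0, 2.0, 100.0]` with unit coefficients the update returns the median 2.0, whereas the least-squares (L2) answer would be the mean, about 34.3. This is the robustness to gross corruption that the abstract describes.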
2010
Recommender systems apply machine learning and data mining techniques for filtering unseen information and can predict whether a user would like a given item. The main types of recommender systems, namely collaborative filtering and content-based filtering, suffer from scalability, data sparsity, and cold-start problems, resulting in poor quality recommendations and reduced coverage. There has been some work in the literature to increase scalability by reducing the dimensions of the recommender system dataset using singular value decomposition (SVD); however, due to sparsity it results in inaccurate recommendations. In this paper, we show how a careful selection of an imputation source in a singular value decomposition based recommender system can provide potential benefits ranging from cost saving to performance enhancement. The proposed missing value imputation methods have the ability to exploit any underlying data correlation structures and hence have been shown to exhibit much superior accuracy and performance compared to the traditional missing value imputation strategy (the item average of the user-item rating matrix) that has been the preferred approach in the literature to resolve this problem. Through extensive experimental results on three different datasets, we show that the proposed approaches outperform the traditional one and, moreover, provide better recommendations under the new user cold-start problem, the new item cold-start problem, the long tail problem, and sparse conditions. Summary: Recommender systems are described, i.e., systems that filter information using machine learning methods.
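The traditional baseline named in this abstract, item-average imputation, can be sketched as follows. The block shows only that baseline, applied before any SVD step; the imputation sources proposed in the paper are not described in the abstract and are not reproduced here. The function name and the fallback to the global mean for an item with no ratings are assumptions.

```python
def impute_item_average(matrix):
    """Fill missing ratings (None) with the mean rating of the same item column.

    `matrix` is a list of user rows. An item with no ratings at all falls back
    to the global mean of all observed ratings (an assumption of this sketch).
    """
    observed = [x for row in matrix for x in row if x is not None]
    global_mean = sum(observed) / len(observed)
    n_items = len(matrix[0])
    col_means = []
    for j in range(n_items):
        column = [row[j] for row in matrix if row[j] is not None]
        col_means.append(sum(column) / len(column) if column else global_mean)
    return [[col_means[j] if x is None else x for j, x in enumerate(row)]
            for row in matrix]


# Hypothetical 3x3 user-item matrix with three missing entries.
R = [[5.0, None, 1.0],
     [3.0, 4.0, None],
     [None, 2.0, 1.0]]
R_filled = impute_item_average(R)
```

The dense matrix `R_filled` is what a truncated SVD would then be applied to in this kind of pipeline.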
Robust Recommendation via Social Network Enhanced Matrix Completion
Statistica Sinica, 2023
Robust product recommendation is crucial for internet platforms to boost their businesses. One challenge though is that the user-product rating matrix often has many missing entries. Social network information generates new insights about user behaviors. To fully utilize the social network information, we develop a novel approach, namely MCNet, which combines the random dot product graph model and the low-rank matrix completion to recover the missing entries in the user-product rating matrix from the internet platform. Our algorithm improves the accuracy and the efficiency of recovering the incomplete matrices. We study the asymptotic properties of the estimator. Furthermore, we perform extensive simulations and show that MCNet outperforms the existing approaches, especially when data have small signals. Moreover, MCNet yields robust estimation under misspecified models. We apply MCNet and the competitors to predict the missing entries in the user-product rating matrices on the Yelp and Douban movie platforms. MCNet generally gives the smallest testing errors among all the comparative methods.