Scalable High-Order Gaussian Process Regression

Fast near-GRID Gaussian process regression

Gaussian process regression (GPR) is a powerful non-linear technique for Bayesian inference and prediction. One drawback is its O(N^3) computational complexity for both prediction and hyperparameter estimation with N input points, which has led to much work on sparse GPR methods. When the covariance function is expressible as a tensor product kernel (TPK) and the inputs form a multidimensional grid, it was shown that the cost of exact GPR can be reduced to a sub-quadratic function of N. We extend these exact fast algorithms to sparse GPR and remark on a connection to Gaussian process latent variable models (GPLVMs). In practice, the inputs may also violate the multidimensional grid constraints, so we pose and efficiently solve missing and extra data problems for both exact and sparse grid GPR. We demonstrate our method on synthetic, text scan, and magnetic resonance imaging (MRI) data reconstructions.
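
As a rough illustration of the grid and tensor-product-kernel structure this abstract relies on (a generic Kronecker-algebra sketch, not the paper's algorithm; the helper rbf and all variable names are illustrative), exact GP regression on a 2-D Cartesian grid with a separable kernel can be solved through small per-axis eigendecompositions instead of the full N x N system:

import numpy as np

def rbf(a, b, ell):
    # Squared-exponential kernel on 1-D inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

x1 = np.linspace(0, 1, 30)                       # grid axis 1 (n1 points)
x2 = np.linspace(0, 1, 40)                       # grid axis 2 (n2 points)
K1, K2 = rbf(x1, x1, 0.2), rbf(x2, x2, 0.3)      # per-axis Gram matrices
sigma2 = 0.01                                    # observation noise variance

# Noisy observations on the full grid (N = n1 * n2 points).
X1g, X2g = np.meshgrid(x1, x2, indexing="ij")
rng = np.random.default_rng(0)
y = np.sin(3 * X1g) * np.cos(2 * X2g) + 0.1 * rng.standard_normal(X1g.shape)

# Eigendecompose the small per-axis matrices: O(n1^3 + n2^3), not O(N^3).
w1, Q1 = np.linalg.eigh(K1)
w2, Q2 = np.linalg.eigh(K2)
lam = np.outer(w1, w2)               # eigenvalues of K1 kron K2, laid out on the grid

# alpha = (K + sigma^2 I)^{-1} y, computed entirely with small matrix products.
tmp = Q1.T @ y @ Q2                  # rotate into the Kronecker eigenbasis
tmp = tmp / (lam + sigma2)           # apply (eigenvalue + noise)^{-1}
alpha = Q1 @ tmp @ Q2.T              # rotate back; alpha has the grid shape

mean = K1 @ alpha @ K2               # posterior mean on the training grid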

DinTucker: Scaling Up Gaussian Process Models on Large Multidimensional Arrays

Proceedings of the AAAI Conference on Artificial Intelligence

Tensor decomposition methods are effective tools for modelling multidimensional array data (i.e., tensors). Among them, nonparametric Bayesian models, such as Infinite Tucker Decomposition (InfTucker), are more powerful than multilinear factorization approaches, including Tucker and PARAFAC, and usually achieve better predictive performance. However, they have difficulty handling massive data due to prohibitively high training costs. To address this limitation, we propose Distributed Infinite Tucker (DinTucker), a new hierarchical Bayesian model that enables local learning of InfTucker on subarrays and global information integration from local results. We further develop a distributed stochastic gradient descent algorithm, coupled with variational inference, for model estimation. In addition, the connection between DinTucker and InfTucker is revealed in terms of model evidence. Experiments demonstrate that DinTucker maintains the predictive accuracy of InfTucker and is scalable on ma...

Random Sampling High Dimensional Model Representation Gaussian Process Regression (RS-HDMR-GPR) for representing multidimensional functions with machine-learned lower-dimensional terms allowing insight with a general method

Computer Physics Communications, 2022

We present an implementation of the RS-HDMR-GPR (Random Sampling High Dimensional Model Representation Gaussian Process Regression) method. The method builds representations of multivariate functions with lower-dimensional terms, either as an expansion over orders of coupling or using terms of only a given dimensionality. This facilitates, in particular, recovering functional dependence from sparse data. The method also allows for imputation of missing values of the variables and for a significant pruning of the useful number of HDMR terms. It can also be used for estimating the relative importance of different combinations of input variables, thereby adding an element of insight to a general machine learning method, in a way that can be viewed as extending the automatic relevance determination approach. The capabilities of the method and of the associated Python software tool are demonstrated on test cases involving synthetic analytic
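
As a toy illustration of the HDMR idea of building a multivariate function from lower-dimensional terms (this is not the RS-HDMR-GPR package itself; the backfitting loop and every name below are assumptions made for the sketch), a 3-D function can be approximated by a sum of 2-D GP component terms fitted to residuals:

import itertools
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(X[:, 0] * X[:, 1]) + X[:, 2] ** 2 + 0.05 * rng.standard_normal(200)

# One 2-D GP term per pair of input variables: f(x) ~ sum_{i<j} f_ij(x_i, x_j).
pairs = list(itertools.combinations(range(3), 2))
terms = {p: GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-2) for p in pairs}
contrib = {p: np.zeros_like(y) for p in pairs}

for _ in range(5):                                   # simple backfitting sweeps
    for p in pairs:
        residual = y - sum(contrib[q] for q in pairs if q != p)
        terms[p].fit(X[:, p], residual)              # fit this 2-D term to the residual
        contrib[p] = terms[p].predict(X[:, p])

y_hat = sum(contrib.values())
print("train RMSE:", np.sqrt(np.mean((y - y_hat) ** 2)))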

Online Sparse Multi-Output Gaussian Process Regression and Learning

IEEE Transactions on Signal and Information Processing over Networks

This paper proposes an approach for online training of a sparse multi-output Gaussian process (GP) model using sequentially obtained data. The considered model combines linearly multiple latent sparse GPs to produce correlated output variables. Each latent GP has its own set of inducing points to achieve sparsity. We show that given the model hyperparameters, the posterior over the inducing points is Gaussian under Gaussian noise since they are linearly related to the model outputs. However, the inducing points from different latent GPs would become correlated, leading to a full covariance matrix cumbersome to handle. Variational inference is thus applied and an approximate regression technique is obtained, with which the posteriors over different inducing point sets can always factorize. As the model outputs are non-linearly dependent on the hyperparameters, a novel marginalized particle filer (MPF)-based algorithm is proposed for the online inference of the inducing point values and hyperparameters. The approximate regression technique is incorporated in the MPF and its distributed realization is presented. Algorithm validation using synthetic and real data is conducted, and promising results are obtained.
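
A minimal single-output sketch of the linear-Gaussian idea the abstract exploits (the MPF, multi-output coupling, and hyperparameter inference are omitted; the fixed inducing inputs Z and all names are assumptions for the sketch): with hyperparameters fixed, an observation is approximately linear in the inducing values, so their Gaussian posterior can be updated recursively as data arrive.

import numpy as np

def rbf(a, b, ell=0.3):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

Z = np.linspace(0, 1, 10)                   # inducing inputs (assumed fixed)
Kzz = rbf(Z, Z) + 1e-8 * np.eye(10)
Kzz_inv = np.linalg.inv(Kzz)
noise = 0.05

# Prior over inducing values u: N(0, Kzz).
m, P = np.zeros(10), Kzz.copy()

rng = np.random.default_rng(1)
for _ in range(200):                        # stream of sequentially obtained data
    x_t = rng.uniform(0, 1, size=1)
    y_t = np.sin(2 * np.pi * x_t) + noise * rng.standard_normal(1)

    h = (rbf(x_t, Z) @ Kzz_inv).ravel()     # y_t ~= h . u + noise (linear in u)
    s = h @ P @ h + noise ** 2              # predictive variance of y_t
    gain = P @ h / s                        # Kalman-style gain
    m = m + gain * (y_t.item() - h @ m)     # posterior mean update
    P = P - np.outer(gain, h @ P)           # posterior covariance update

# Predict at new inputs through the inducing values.
x_star = np.linspace(0, 1, 5)
f_star = (rbf(x_star, Z) @ Kzz_inv) @ m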

Correlated Product of Experts for Sparse Gaussian Process Regression

Cornell University - arXiv, 2021

Gaussian processes (GPs) are an important tool in machine learning and statistics, with applications ranging from the social and natural sciences to engineering. They constitute a powerful kernelized non-parametric method with well-calibrated uncertainty estimates; however, off-the-shelf GP inference procedures are limited to datasets with several thousand data points because of their cubic computational complexity. For this reason, many sparse GP techniques have been developed over the past years. In this paper, we focus on GP regression tasks and propose a new approach based on aggregating predictions from several local and correlated experts. The degree of correlation between the experts can vary from independent to fully correlated. The individual predictions of the experts are aggregated taking their correlation into account, resulting in consistent uncertainty estimates. Our method recovers the independent Product of Experts, sparse GP, and full GP in the limiting cases. The presented framework can deal with a general kernel function and multiple variables, and has a time and space complexity which is linear in the number of experts and data samples, which makes our approach highly scalable. We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximation methods on synthetic as well as several real-world datasets with deterministic and stochastic optimization.
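
For reference, a short sketch of the independent Product-of-Experts baseline that the correlated method recovers as a limiting case (the correlated aggregation itself is not reproduced here; the function name and toy numbers are illustrative): each local expert returns a Gaussian prediction, and the experts are combined by summing precisions.

import numpy as np

def poe_combine(means, variances):
    # means, variances: arrays of shape (n_experts, n_test).
    precisions = 1.0 / variances
    var_poe = 1.0 / precisions.sum(axis=0)              # combined predictive variance
    mean_poe = var_poe * (precisions * means).sum(axis=0)
    return mean_poe, var_poe

# Toy example: three experts with different confidence at one test point.
means = np.array([[1.0], [1.2], [0.8]])
variances = np.array([[0.1], [0.4], [0.2]])
print(poe_combine(means, variances))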

Multiple output Gaussian process regression

2005

Gaussian processes are usually parameterised in terms of their covariance functions. However, this makes it difficult to deal with multiple outputs, because ensuring that the covariance matrix is positive definite is problematic. An alternative formulation is to treat Gaussian processes as white noise sources convolved with smoothing kernels, and to parameterise the kernel instead. Using this, we extend Gaussian processes to handle multiple, coupled outputs.
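
A sketch of the white-noise-convolution construction in the simplest 1-D case (a generic illustration under the assumption of Gaussian smoothing kernels; names and parameter values are illustrative): each output is a shared white noise process convolved with its own kernel h_i(u) = v_i * exp(-u^2 / (2 * ell_i^2)), so the cross-covariance between outputs has a closed form and the joint covariance is positive semi-definite by construction.

import numpy as np

def cross_cov(x, xp, v_i, ell_i, v_j, ell_j):
    # Closed-form convolution of two Gaussian smoothing kernels.
    s = ell_i ** 2 + ell_j ** 2
    scale = v_i * v_j * np.sqrt(2 * np.pi * ell_i ** 2 * ell_j ** 2 / s)
    return scale * np.exp(-(x[:, None] - xp[None, :]) ** 2 / (2 * s))

x = np.linspace(0, 1, 50)
K11 = cross_cov(x, x, 1.0, 0.2, 1.0, 0.2)   # auto-covariance of output 1
K22 = cross_cov(x, x, 0.5, 0.4, 0.5, 0.4)   # auto-covariance of output 2
K12 = cross_cov(x, x, 1.0, 0.2, 0.5, 0.4)   # cross-covariance between the outputs

K = np.block([[K11, K12], [K12.T, K22]])    # joint covariance over both outputs
print(np.linalg.eigvalsh(K).min())          # non-negative up to numerical jitter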

Hierarchical Mixture-of-Experts Model for Large-Scale Gaussian Process Regression

We propose a practical and scalable Gaussian process model for large-scale nonlinear probabilistic regression. Our mixture-of-experts model is conceptually simple and hierarchically recombines computations for an overall approximation of a full Gaussian process. Closed-form and distributed computations allow for efficient and massive parallelisation while keeping the memory consumption small. Given sufficient computing resources, our model can handle arbitrarily large data sets, without explicit sparse approximations. We provide strong experimental evidence that our model can be applied to large data sets of sizes far beyond millions. Hence, our model has the potential to lay the foundation for general large-scale Gaussian process research.

Efficient multioutput Gaussian processes through variational inducing kernels

2011

Interest in multioutput kernel methods is increasing, whether under the guise of multitask learning, multisensor networks or structured output data. From the Gaussian process perspective a multioutput Mercer kernel is a covariance function over correlated output functions. One way of constructing such kernels is based on convolution processes (CP). A key problem for this approach is efficient inference. Álvarez and Lawrence recently presented a sparse approximation for CPs that enabled efficient inference. In this paper, we extend this work in two directions: we introduce the concept of variational inducing functions to handle potential non-smooth functions involved in the kernel CP construction, and we consider an alternative approach to approximate inference based on variational methods, extending the work by Titsias to the multiple output case. We demonstrate our approaches on prediction of school marks, compiler performance and financial time series.

Hierarchical Gaussian process regression

2010

We address an approximation method for Gaussian process (GP) regression, where we approximate the covariance by a block matrix such that diagonal blocks are calculated exactly while off-diagonal blocks are approximated. Partitioning the input data points, we present a two-layer hierarchical model for GP regression: prototypes of clusters in the upper layer are used for coarse modeling by a GP, and the data points in each cluster in the lower layer are used for fine modeling by an individual GP whose prior mean is given by the corresponding prototype and whose covariance is parameterized by the data points in the partition. In this hierarchical model, integrating out the latent variables in the upper layer leads to a block covariance matrix, where diagonal blocks contain similarities between data points in the same partition and off-diagonal blocks consist of approximate similarities calculated using prototypes. This particular structure of the covariance matrix divides the full GP into manageable sub-problems whose complexity scales with the number of data points in a partition. In addition, our hierarchical GP regression (HGPR) is also useful for cases where partitions of the data reveal different characteristics. Experiments on several benchmark datasets confirm the useful behavior of our method.
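
A sketch of such a block-structured covariance (a Nystrom-style stand-in routed through cluster prototypes, not necessarily the paper's exact construction; cluster centroids as prototypes and all names are assumptions): within-cluster blocks use the exact kernel, while between-cluster blocks are approximated through the prototypes.

import numpy as np

def rbf(a, b, ell=0.5):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

rng = np.random.default_rng(0)
clusters = [rng.normal(c, 0.2, size=(40, 2)) for c in (-1.0, 0.0, 1.0)]
protos = np.stack([c.mean(axis=0) for c in clusters])       # one prototype per cluster
Kpp_inv = np.linalg.inv(rbf(protos, protos) + 1e-8 * np.eye(3))

n = len(clusters)
blocks = [[None] * n for _ in range(n)]
for a in range(n):
    for b in range(n):
        if a == b:
            blocks[a][b] = rbf(clusters[a], clusters[b])    # exact diagonal block
        else:
            # off-diagonal block: approximate similarities via the prototypes
            blocks[a][b] = rbf(clusters[a], protos) @ Kpp_inv @ rbf(protos, clusters[b])

K_approx = np.block(blocks)
K_exact = rbf(np.vstack(clusters), np.vstack(clusters))
print("max abs error in off-diagonal blocks:", np.abs(K_approx - K_exact).max())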

Approximate Inference in Related Multi-output Gaussian Process Regression

Lecture Notes in Computer Science, 2017

In Gaussian processes, a multi-output kernel is a covariance function over correlated outputs. Using a prior known relation between outputs, joint auto- and cross-covariance functions can be constructed. Realizations from these joint covariance functions give outputs that are consistent with the prior relation. One issue with Gaussian process regression is efficient inference when scaling up to large datasets. In this paper we apply approximate inference techniques to multi-output kernels that enforce relationships between outputs. Results of the proposed methodology for theoretical data and real-world applications are presented. The main contribution of this paper is the application and validation of our methodology on a dataset of real aircraft flight tests, while imposing knowledge of aircraft physics into the model.
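
A generic sketch of encoding a prior known relation between outputs in a joint covariance (this is an illustrative example, not the aircraft model of the paper; the derivative relation and all names are assumptions): if the second output is the derivative of the first, g(x) = d f(x) / dx, its auto- and cross-covariances follow from differentiating a shared RBF kernel, and samples from the joint covariance respect the relation by construction.

import numpy as np

ell = 0.3

def k_ff(x, xp):                      # cov(f(x), f(x')): plain RBF kernel
    r = x[:, None] - xp[None, :]
    return np.exp(-0.5 * r ** 2 / ell ** 2)

def k_fg(x, xp):                      # cov(f(x), g(x')) = d k / d x'
    r = x[:, None] - xp[None, :]
    return k_ff(x, xp) * r / ell ** 2

def k_gg(x, xp):                      # cov(g(x), g(x')) = d^2 k / dx dx'
    r = x[:, None] - xp[None, :]
    return k_ff(x, xp) * (1.0 / ell ** 2 - r ** 2 / ell ** 4)

x = np.linspace(0, 1, 30)
K = np.block([[k_ff(x, x), k_fg(x, x)],
              [k_fg(x, x).T, k_gg(x, x)]])
print(np.linalg.eigvalsh(K).min())    # valid joint covariance (PSD up to jitter)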