Gaussian Processes in Machine Learning
Last Updated : 15 Jun, 2026
Gaussian Processes (GPs) are probabilistic machine learning models used for regression and classification tasks. Instead of predicting a single value, they provide both predictions and a measure of uncertainty, making them useful for problems where confidence in predictions is important.
- Non-parametric learning approach.
- Provides predictions with uncertainty estimates.
- Effective for modeling complex and nonlinear relationships.
- Works well with small and medium-sized datasets.
Key Concepts of Gaussian Processes
Gaussian Processes (GPs) are defined using a mean function m(x) and a covariance function (kernel) k(x,x′):
f(x) \sim \mathcal{GP}(m(x), k(x,x′))
Where:
- m(x) : Mean function
- k(x,x′) : Covariance function (Kernel)
1. Kernels
Kernels, also called covariance functions or similarity functions, measure the similarity between input points and help Gaussian Processes learn patterns from data.
- Capture relationships between data points.
- Model both linear and non-linear patterns.
- Help in making predictions for new data.
- Incorporate prior assumptions about the data.
**Common Kernels:**
- **Linear Kernel:** Captures linear relationships.
- **RBF (Gaussian) Kernel:** Measures similarity based on the squared distance between points, scaled by a length scale l, and is widely used for smooth functions.
k_{RBF}(x,x') = \exp\left(-\frac{\|x-x'\|^2}{2 l^2}\right)
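As a minimal sketch of how the RBF kernel turns distances into similarities, the NumPy snippet below builds a kernel matrix using the standard squared-distance form exp(-||x - x'||^2 / (2 l^2)). The helper name `rbf_kernel` and the toy inputs are illustrative assumptions, not part of any library.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """Hypothetical helper: RBF kernel matrix between the rows of X1 and X2."""
    X1 = np.atleast_2d(X1)
    X2 = np.atleast_2d(X2)
    # Squared Euclidean distance between every pair of rows
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * length_scale**2))

# Toy 1-D inputs (illustrative values)
X = np.array([[0.0], [1.0], [3.0]])
K = rbf_kernel(X, X, length_scale=1.0)
print(np.round(K, 4))
```

Points that are close together get a similarity near 1, while distant points get a similarity near 0. A larger length scale makes the similarity decay more slowly.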
2. Prior Distribution
The prior distribution represents the initial assumptions about a function before any data is observed. It serves as the starting point of a Gaussian Process and is defined using the mean function and kernel (covariance function).
- Represents beliefs before observing data.
- Any finite set of function values follows a joint Gaussian (normal) distribution.
- Defined by the mean and covariance functions.
- Acts as the foundation for learning from data.
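- Is updated to the posterior distribution after observing data.
**Formula:**
For a finite set of inputs X, the function values follow a multivariate normal distribution:
f(X) \sim \mathcal{N}(m(X), K(X, X))
**Where:**
- m(X) : Mean function evaluated at each input in X
- K(X, X) : Covariance matrix with entries k(x_i, x_j) given by the kernel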
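To make the prior concrete, the sketch below draws a few random functions from a GP prior on a grid of inputs. It assumes a zero mean function and an RBF kernel with length scale 1, and the helper `rbf_kernel` is a hypothetical name defined here for illustration.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0):
    # Pairwise squared distances for 1-D inputs, then the RBF formula
    sq_dists = (xa[:, None] - xb[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * length_scale**2))

rng = np.random.default_rng(0)
x_grid = np.linspace(-5.0, 5.0, 50)

mean = np.zeros_like(x_grid)                        # assumed mean function m(x) = 0
cov = rbf_kernel(x_grid, x_grid, length_scale=1.0)  # covariance from the kernel
cov += 1e-8 * np.eye(len(x_grid))                   # small jitter for numerical stability

# Each row is one function drawn from the prior, evaluated on x_grid
prior_samples = rng.multivariate_normal(mean, cov, size=3)
print(prior_samples.shape)  # (3, 50)
```

Each sample is a smooth curve because nearby inputs are strongly correlated under the RBF kernel. No data has been used at this point.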
3. Posterior Distribution
The posterior distribution represents the updated belief about a function after observing data. It is obtained by combining the prior distribution with the observed data using Bayes' theorem.
- Updates beliefs after observing data.
- Combines prior knowledge with observed evidence.
- Provides predictions along with uncertainty estimates.
- Uncertainty shrinks near observed inputs as more data becomes available.
- Remains Gaussian in Gaussian Processes.
**Formula:**
p(f_* \mid X, y, X_*) = \mathcal{N}(\mu_*, \Sigma_*)
**Where:**
- f_* : Function values at the test inputs
- X : Training input data
- y : Training output (target) data
- X_* : Test or new input data
- \mu_* : Posterior mean
- \Sigma_* : Posterior covariance
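For regression with a zero mean function and Gaussian observation noise of variance \sigma_n^2 (both assumptions added here for illustration), the posterior mean and covariance have the standard closed form
\mu_* = K_*^T (K + \sigma_n^2 I)^{-1} y
\Sigma_* = K_{**} - K_*^T (K + \sigma_n^2 I)^{-1} K_*
where K = k(X, X), K_* = k(X, X_*) and K_{**} = k(X_*, X_*). A minimal NumPy sketch of these two equations, using a hypothetical `rbf_kernel` helper and toy data, might look like this:

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0):
    sq_dists = (xa[:, None] - xb[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * length_scale**2))

# Toy 1-D training data (illustrative values)
X_train = np.array([-2.0, -1.0, 0.0, 1.5])
y_train = np.sin(X_train)
X_star = np.array([-1.0, 0.5, 4.0])   # one training point, one nearby, one far away
noise_var = 1e-6                      # assumed small observation noise

K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
K_s = rbf_kernel(X_train, X_star)
K_ss = rbf_kernel(X_star, X_star)

# Cholesky-based solves instead of forming an explicit inverse
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
v = np.linalg.solve(L, K_s)

mu_star = K_s.T @ alpha            # posterior mean
Sigma_star = K_ss - v.T @ v        # posterior covariance
std_star = np.sqrt(np.clip(np.diag(Sigma_star), 0.0, None))

print("mean:", np.round(mu_star, 3))
print("std: ", np.round(std_star, 3))
```

The predictive standard deviation is close to zero at the test input that coincides with a training point and grows toward the prior value for the input far from the data.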
4. Combining Kernels
Combining kernels allows Gaussian Processes to capture multiple patterns and relationships in the data. By adding or multiplying different kernels, the model becomes more flexible and can represent complex data structures more effectively.
- Improves the flexibility and expressiveness of the model.
- Captures different patterns using multiple kernels.
- Helps model complex relationships in data.
- Kernels can be combined through addition or multiplication.
**Kernel Addition:**
k_{combined}(x,x') = k_1(x,x') + k_2(x,x')
**Kernel Multiplication:**
k_{combined}(x,x') = k_1(x,x') \cdot k_2(x,x')
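In scikit-learn, kernel objects can be combined with the `+` and `*` operators. The sketch below is one possible illustration: the particular kernels (`RBF`, `DotProduct`, `ExpSineSquared`) and their parameter values are arbitrary choices made for this example.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, DotProduct, ExpSineSquared

X = np.array([[0.0], [0.5], [2.0]])   # toy inputs (illustrative values)

k_smooth = RBF(length_scale=1.0)
k_linear = DotProduct(sigma_0=1.0)
k_periodic = ExpSineSquared(length_scale=1.0, periodicity=3.0)

k_sum = k_smooth + k_linear       # smooth variation on top of a linear trend
k_prod = k_smooth * k_periodic    # periodic pattern whose correlation fades with distance

# Calling a kernel on X returns its covariance matrix
K_sum = k_sum(X)
K_prod = k_prod(X)

# The combined matrices match element-wise sums and products of the parts
print(np.allclose(K_sum, k_smooth(X) + k_linear(X)))
print(np.allclose(K_prod, k_smooth(X) * k_periodic(X)))
```

Addition tends to model independent effects that add up, while multiplication models one pattern being modulated by another.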
5. Gaussian Process in Classification and Regression
Gaussian Processes can be applied to both regression and classification problems. In regression, they predict continuous values, while in classification, they predict discrete class labels.
**Regression:**
- Predicts continuous outcomes.
- Provides a predictive distribution for new inputs.
**Classification:**
- Predicts discrete class labels.
- Uses a non-linear link function (e.g., logistic/sigmoid) to convert latent function values into class probabilities.
- Often requires approximation methods due to non-Gaussian likelihoods.
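As a small illustration of the classification case, the sketch below fits scikit-learn's `GaussianProcessClassifier`, which uses a logistic link with a Laplace approximation to the posterior. The synthetic one-dimensional dataset and the kernel settings are assumptions made for this example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Synthetic binary problem: class 1 when the input is positive
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(40, 1))
y = (X[:, 0] > 0).astype(int)

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0), random_state=0)
gpc.fit(X, y)

X_new = np.array([[-2.0], [0.1], [2.0]])
proba = gpc.predict_proba(X_new)   # one row per input, one column per class
labels = gpc.predict(X_new)

print(np.round(proba, 3))
print(labels)
```

Inputs far from the decision boundary should receive probabilities near 0 or 1, while the input near zero should be less certain.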
Implementation of Gaussian Processes
Step 1: Import Required Libraries
- **fetch_california_housing:** Loads the California Housing dataset.
- **GaussianProcessRegressor:** Implements Gaussian Process Regression.
- **RBF and ConstantKernel:** Used to define the kernel function.
- **train_test_split:** Splits the dataset into training and testing sets.
- **mean_squared_error:** Evaluates model performance.

```python
from sklearn.datasets import fetch_california_housing
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
```
Step 2: Load the Dataset
- X contains the input features such as median income, house age and population.
- y contains the target variable (median house value, in units of $100,000).

```python
data = fetch_california_housing()
X, y = data.data, data.target
```
Step 3: Select a Subset of Data
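- Only the first 2000 samples are selected.
- Exact Gaussian Process training builds an n × n kernel matrix and scales roughly cubically with the number of training samples n, so a subset keeps training time and memory use manageable.

```python
subset_size = 2000
X_subset = X[:subset_size]
y_subset = y[:subset_size]
```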
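A rough back-of-the-envelope calculation shows why a subset helps. The sketch below only estimates the storage for one dense kernel matrix of 64-bit floats. It compares the full California Housing dataset (20,640 samples) with the 1,400 training samples that remain after a 70/30 split of the 2,000-sample subset. The helper name `kernel_matrix_bytes` is hypothetical.

```python
def kernel_matrix_bytes(n_samples, bytes_per_float=8):
    """Hypothetical helper: memory for one dense n x n float64 kernel matrix."""
    return n_samples * n_samples * bytes_per_float

full_bytes = kernel_matrix_bytes(20640)    # all samples in the dataset
subset_bytes = kernel_matrix_bytes(1400)   # 70% of the 2000-sample subset

print(f"Full dataset: {full_bytes / 1e9:.2f} GB")    # 3.41 GB
print(f"Subset:       {subset_bytes / 1e6:.2f} MB")  # 15.68 MB
```

Actual memory use during fitting is higher, since the optimizer also needs factorizations and gradients of this matrix.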
Step 4: Split Data into Training and Testing Sets
```python
# Hold out 30% of the subset for testing
X_train, X_test, y_train, y_test = train_test_split(
    X_subset, y_subset, test_size=0.3, random_state=42
)
```
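An RBF kernel with a single length scale treats all features as if they were on the same scale, so standardizing the inputs is a common optional step before fitting. The sketch below shows the pattern on synthetic stand-in data. It is not part of the original walkthrough, and the feature ranges are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with two features on very different scales
rng = np.random.default_rng(42)
X_demo = np.column_stack([
    rng.uniform(0.0, 15.0, size=200),      # e.g. an income-like feature
    rng.uniform(0.0, 30000.0, size=200),   # e.g. a population-like feature
])
y_demo = rng.normal(size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)   # fit on the training data only
X_te_scaled = scaler.transform(X_te)       # reuse the training statistics

print(X_tr_scaled.mean(axis=0).round(3), X_tr_scaled.std(axis=0).round(3))
```

Fitting the scaler on the training split only avoids leaking information from the test set into the model.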
Step 5: Define the Kernel Function
- **Constant Kernel (C):** Scales the RBF kernel and controls the overall variance of the modeled function; its value is tuned within the bounds (1e-3, 1e3).
- **RBF Kernel:** Captures smooth and non-linear relationships in the data.

```python
kernel = C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0)
```
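Real-world targets are usually noisy, and one optional variant is to add a `WhiteKernel` term so that the noise level is learned together with the other kernel parameters. The sketch below only shows how such a kernel is built and evaluated. The noise level of 0.1 is an arbitrary starting value, and this variant is not the kernel used in the steps that follow.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

# Signal part (constant * RBF) plus an independent noise part
noisy_kernel = C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)

X_demo = np.array([[0.0], [1.0], [2.0]])   # toy inputs (illustrative values)
K_demo = noisy_kernel(X_demo)
print(np.round(K_demo, 4))
```

The noise term only affects the diagonal of the matrix (1.0 from the signal part plus 0.1 of noise), while the off-diagonal entries still come from the RBF similarity.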
Step 6: Create the Gaussian Process Regressor
- **kernel=kernel:** Uses the defined kernel function.
- **n_restarts_optimizer=10:** Restarts the hyperparameter optimizer 10 additional times from randomly sampled starting values, which lowers the risk of ending in a poor local optimum.
- **random_state=42:** Ensures reproducible results.

```python
gp = GaussianProcessRegressor(
    kernel=kernel,
    n_restarts_optimizer=10,
    random_state=42,
)
```
Step 7: Train the Model
- Kernel parameters are optimized by maximizing the log marginal likelihood of the training data.
- Relationships between input features and house values are learned.

```python
gp.fit(X_train, y_train)
```
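After fitting, the optimized kernel and the achieved log marginal likelihood can be inspected through the `kernel_` and `log_marginal_likelihood_value_` attributes. The sketch below uses a small synthetic dataset so that it runs quickly. The data, the `alpha` value and the number of restarts are assumptions for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Small synthetic regression problem: noisy sine curve
rng = np.random.default_rng(0)
X_demo = np.sort(rng.uniform(0.0, 5.0, size=(25, 1)), axis=0)
y_demo = np.sin(X_demo[:, 0]) + 0.1 * rng.normal(size=25)

gp_demo = GaussianProcessRegressor(
    kernel=C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0),
    alpha=0.1**2,               # assumed noise variance added to the diagonal
    n_restarts_optimizer=2,
    random_state=0,
)
gp_demo.fit(X_demo, y_demo)

print("Initial kernel:", gp_demo.kernel)     # kernel as passed in
print("Fitted kernel: ", gp_demo.kernel_)    # kernel with optimized parameters
print("Log marginal likelihood:", gp_demo.log_marginal_likelihood_value_)
```

Comparing the initial and fitted kernels shows how much the optimizer moved the variance and length scale away from their starting values.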
Step 8: Make Predictions
- The trained model predicts house values for the test dataset.
- The predicted values are stored in y_pred.

```python
y_pred = gp.predict(X_test)
```
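Since the main appeal of Gaussian Processes is their uncertainty estimate, it can be useful to request the predictive standard deviation as well by passing `return_std=True` to `predict`. The sketch below demonstrates this on a small synthetic dataset. The data, the `alpha` value and the 95% interval based on 1.96 standard deviations are choices made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Small synthetic regression problem: noisy sine curve on [0, 5]
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 5.0, size=(20, 1))
y_demo = np.sin(X_demo[:, 0]) + 0.05 * rng.normal(size=20)

gp_demo = GaussianProcessRegressor(
    kernel=C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0),
    alpha=0.05**2,              # assumed noise variance
    random_state=0,
)
gp_demo.fit(X_demo, y_demo)

# One input inside the training range, one far outside it
X_new = np.array([[2.5], [10.0]])
y_mean, y_std = gp_demo.predict(X_new, return_std=True)

# Approximate 95% interval around each prediction
lower = y_mean - 1.96 * y_std
upper = y_mean + 1.96 * y_std
for x, m, lo, hi in zip(X_new[:, 0], y_mean, lower, upper):
    print(f"x={x:5.1f}  mean={m:6.3f}  95% interval=[{lo:6.3f}, {hi:6.3f}]")
```

The interval should be much wider for the input far outside the training range, which reflects the model's lower confidence there.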
Step 9: Evaluate Model Performance
```python
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
```
**Output:**
Mean Squared Error: 1.5693
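MSE is expressed in squared target units, so it is often easier to read alongside the root mean squared error (RMSE) and the R² score. The sketch below computes all three on made-up toy values. It also takes the square root of the reported MSE of 1.5693, which gives an RMSE of roughly 1.25 in target units.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Toy values (illustrative, not model output)
y_true_demo = np.array([3.0, -0.5, 2.0, 7.0])
y_pred_demo = np.array([2.5, 0.0, 2.0, 8.0])

mse_demo = mean_squared_error(y_true_demo, y_pred_demo)
rmse_demo = np.sqrt(mse_demo)                  # same units as the target
r2_demo = r2_score(y_true_demo, y_pred_demo)   # 1.0 would be a perfect fit

print(f"MSE:  {mse_demo:.4f}")   # 0.3750
print(f"RMSE: {rmse_demo:.4f}")  # 0.6124
print(f"R^2:  {r2_demo:.4f}")    # 0.9486

# RMSE implied by the MSE reported above
reported_rmse = np.sqrt(1.5693)
print(f"Reported RMSE: {reported_rmse:.4f}")   # 1.2527
```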