Comparing Randomized Search and Grid Search for Hyperparameter Estimation in Scikit-learn
Last Updated : 9 May, 2026
Hyperparameters are configuration settings defined before training that control how a model learns and performs. Choosing the right values is essential for achieving good performance and generalization.
- **Role in Model Performance:** They shape how a model learns patterns (e.g., learning rate, tree depth), which directly affects accuracy and the risk of overfitting.
- **Need for Optimization:** Because these values are not learned from the data during training, techniques such as grid search and randomized search are used to find a good combination.
- **Trade-off in Methods:** Grid search evaluates every combination in a predefined grid, while randomized search evaluates a fixed number of sampled combinations, which is usually faster for large search spaces.
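The difference between a value set before training and a value learned during training can be seen in a small sketch. It assumes a `DecisionTreeClassifier` and a synthetic dataset, both chosen here purely for illustration; `max_depth` is the hyperparameter, and the depth the fitted tree actually reaches is learned from the data.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data (an assumption for this sketch)
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# max_depth is a hyperparameter: it is fixed before fit() is called
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
hyperparams = tree.get_params()
print("Set before training:", hyperparams["max_depth"])

# The tree structure itself is learned from the data during fit()
tree.fit(X, y)
learned_depth = tree.get_depth()
print("Depth reached after training:", learned_depth)
```

The fitted depth can never exceed `max_depth`, which is why the choice made before training constrains what the model is able to learn.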
Grid Search
Grid search is a hyperparameter optimization technique that evaluates all possible combinations of given parameter values.
- **Exhaustive Search:** Tests every possible combination of hyperparameters from a predefined grid.
- **Deterministic Approach:** Evaluates the same set of candidates on every run, so results are repeatable as long as the estimator and the cross-validation splits are seeded.
- **Cross-Validation Based:** Scores each combination with cross-validation for a more reliable performance estimate.
- **Computationally Expensive:** The number of combinations is the product of the number of values per hyperparameter, so the cost grows quickly as parameters or values are added.
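To make the cost concrete, the sketch below counts the candidates in a grid with scikit-learn's `ParameterGrid` helper. The grid is the one used in the implementation section of this article; the extra `min_samples_leaf` entry is a hypothetical addition, included only to show how one more hyperparameter multiplies the total.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5, 10],
    "bootstrap": [True, False],
}

# ParameterGrid enumerates every combination that GridSearchCV would try
candidates = list(ParameterGrid(param_grid))
n_candidates = len(candidates)   # 3 * 3 * 3 * 2 combinations
n_fits = n_candidates * 5        # each candidate is fit once per fold with cv=5
print("Candidates:", n_candidates, "Model fits:", n_fits)

# Hypothetical extra hyperparameter with three values: the grid triples in size
bigger_grid = dict(param_grid, min_samples_leaf=[1, 2, 4])
n_bigger = len(ParameterGrid(bigger_grid))
print("Candidates with one more hyperparameter:", n_bigger)
```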
Randomized Search
Randomized search is a hyperparameter optimization technique that samples values from given distributions instead of testing all combinations.
- **Sampling-Based Approach:** Randomly selects values for each hyperparameter from specified distributions.
- **Faster and Efficient:** Reduces computation time, especially for large search spaces.
- **Iterative Evaluation:** Runs for a fixed number of iterations and selects the best-performing parameters.
- **Scalable:** Works well when parameters have a wide or continuous range.
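The sampling step can be inspected on its own with scikit-learn's `ParameterSampler`, which draws candidates the same way `RandomizedSearchCV` does. This is a minimal sketch: the distributions match those used in the implementation section, while `n_iter=5` is an arbitrary choice for the demonstration.

```python
from scipy.stats import randint
from sklearn.model_selection import ParameterSampler

param_dist = {
    "n_estimators": randint(50, 200),     # integer distribution
    "max_depth": [None, 5, 10],           # lists are sampled uniformly
    "min_samples_split": randint(2, 10),
    "bootstrap": [True, False],
}

# Draw five candidate settings; random_state makes the draw repeatable
sampled = list(ParameterSampler(param_dist, n_iter=5, random_state=42))
for params in sampled:
    print(params)
```

Each printed dictionary is one candidate that would be cross-validated; the number of candidates is set by `n_iter` and does not depend on the size of the search space.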
Implementation
Importing Libraries and Generating a Sample Dataset
Import the necessary libraries for dataset generation, model creation and hyperparameter tuning. A synthetic classification dataset is created using make_classification() for demonstration purposes.
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scipy.stats import randint

# Synthetic binary classification dataset: 200 samples, 10 features
X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)

# Base estimator whose hyperparameters will be tuned
model = RandomForestClassifier(random_state=42)
```
Define Hyperparameter Search Space
Define the range of hyperparameters to be explored. Grid Search uses fixed parameter values, while Randomized Search samples random values from defined distributions.
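```python
# Grid Search: every listed value is combined with every other
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5, 10],
    'bootstrap': [True, False],
}

# Randomized Search: lists are sampled uniformly, distributions are drawn from.
# randint(low, high) draws integers from low to high - 1.
param_dist = {
    'n_estimators': randint(50, 200),
    'max_depth': [None, 5, 10],
    'min_samples_split': randint(2, 10),
    'bootstrap': [True, False],
}
```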
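One detail of the search space is worth checking: `scipy.stats.randint(low, high)` excludes `high`, so `randint(50, 200)` never proposes 200 trees even though the grid contains that value. The sketch below verifies this through the distribution's probability mass function; the `inclusive` variant is a hypothetical adjustment for readers who want both searches to cover the same range, and it is not what the article's run used.

```python
from scipy.stats import randint

n_estimators_dist = randint(50, 200)

# pmf gives the probability of drawing a specific integer
print("P(199):", n_estimators_dist.pmf(199))  # positive: 199 can be drawn
print("P(200):", n_estimators_dist.pmf(200))  # zero: 200 is excluded

# Hypothetical variant that includes the upper endpoint 200
inclusive = randint(50, 201)
print("P(200) with randint(50, 201):", inclusive.pmf(200))
```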
Perform Grid & Random Search Optimization
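Both methods score each candidate with 5-fold cross-validation and keep the combination with the best mean score. Grid Search evaluates all 3 × 3 × 3 × 2 = 54 combinations (270 model fits), while Randomized Search evaluates only the `n_iter=10` sampled candidates (50 model fits), not counting the final refit on the full data.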
```python
# Exhaustive search over every combination in param_grid
grid_search = GridSearchCV(
    estimator=model,
    param_grid=param_grid,
    cv=5,
    n_jobs=-1,
)
grid_search.fit(X, y)

# Randomized search over 10 candidates sampled from param_dist
random_search = RandomizedSearchCV(
    estimator=model,
    param_distributions=param_dist,
    n_iter=10,
    cv=5,
    random_state=42,
    n_jobs=-1,
)
random_search.fit(X, y)

print("Grid Search Best Params:")
print(grid_search.best_params_)
print("Score:", grid_search.best_score_)

print("\nRandomized Search Best Params:")
print(random_search.best_params_)
print("Score:", random_search.best_score_)
```
**Output:**
Grid Search Best Params:
{'bootstrap': True, 'max_depth': None, 'min_samples_split': 10, 'n_estimators': 50}
Score: 0.8800000000000001

Randomized Search Best Params:
{'bootstrap': True, 'max_depth': None, 'min_samples_split': 5, 'n_estimators': 153}
Score: 0.8699999999999999
In this run, Grid Search reached a slightly higher cross-validated score (0.88 versus 0.87) after evaluating all 54 combinations in the grid, while Randomized Search came within 0.01 of it after evaluating only 10 sampled candidates. A gap this small on a 200-sample dataset may not be meaningful, but the run illustrates the trade-off: Randomized Search is more economical for large search spaces, whereas Grid Search suits small grids where every listed combination should be checked.
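Since `best_score_` is the same cross-validated score that was used to pick the winner, a held-out test set gives a fairer comparison of the two searches. The following is a minimal sketch under stated assumptions: the search spaces are deliberately shrunk (`small_grid` and `small_dist` are hypothetical, not the article's) so that it runs quickly, and the 75/25 split is an arbitrary choice.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = RandomForestClassifier(random_state=42)

# Deliberately tiny, hypothetical search spaces to keep the sketch fast
small_grid = {"n_estimators": [10, 20], "max_depth": [None, 5]}
small_dist = {"n_estimators": randint(10, 21), "max_depth": [None, 5]}

grid = GridSearchCV(model, small_grid, cv=3).fit(X_train, y_train)
rand = RandomizedSearchCV(
    model, small_dist, n_iter=3, cv=3, random_state=42
).fit(X_train, y_train)

summary = {}
for name, search in [("grid", grid), ("random", rand)]:
    summary[name] = {
        "candidates": len(search.cv_results_["params"]),  # settings evaluated
        "cv_score": search.best_score_,                   # used for selection
        "test_score": search.score(X_test, y_test),       # held-out accuracy
    }
    print(name, summary[name])
```

Comparing `test_score` between the two searches, alongside the number of candidates each one evaluated, shows how much accuracy (if any) the extra fits of the exhaustive search actually bought.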
Advantages of RandomizedSearchCV over GridSearchCV
RandomizedSearchCV is often preferred when the search space is large or continuous, as it samples only a subset of possible combinations instead of trying all of them.
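- **More Efficient:** Usually faster than grid search because it does not evaluate every combination, making it suitable for large or complex search spaces.
- **Handles Large/Continuous Spaces Well:** Works better when hyperparameters have many possible or continuous values, since a distribution can be sampled directly instead of being discretized into a grid.
- **May Limit Over-Tuning:** Evaluating fewer candidates gives fewer chances to select a configuration that merely scored well on the validation folds by chance, although neither method replaces evaluation on a held-out test set.
- **Scalable:** The number of candidates is controlled with `n_iter`, so the search can be sized to the available compute budget.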
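The last two points can be sketched together: a continuous distribution is sampled directly, and `n_iter` alone determines how many fits are run. The example assumes a `max_features` fraction drawn from `scipy.stats.uniform` and a small forest of 20 trees; both are illustrative choices, not settings from the run above.

```python
from scipy.stats import uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)

# uniform(loc, scale) covers [loc, loc + scale]: fractions from 0.1 to 0.9
param_dist = {
    "max_features": uniform(0.1, 0.8),
    "max_depth": [None, 5, 10],
}

fits = {}
for n_iter in (2, 4):
    search = RandomizedSearchCV(
        RandomForestClassifier(n_estimators=20, random_state=42),
        param_dist,
        n_iter=n_iter,
        cv=3,
        random_state=42,
    )
    search.fit(X, y)
    # Cross-validation fits = sampled candidates * number of folds
    fits[n_iter] = len(search.cv_results_["params"]) * search.n_splits_
    print("n_iter:", n_iter, "CV fits:", fits[n_iter])
```

Doubling `n_iter` doubles the number of cross-validation fits, regardless of how many values the continuous range could in principle take.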