validate_data
sklearn.utils.validation.validate_data(_estimator, /, X='no_validation', y='no_validation', reset=True, validate_separately=False, skip_check_array=False, **check_params)
Validate input data and set or check feature names and counts of the input.
This helper function should be used in an estimator that requires input validation. It mutates the estimator and sets the n_features_in_ and feature_names_in_ attributes if reset=True.
Added in version 1.6.
Parameters:
_estimator : estimator instance
The estimator to validate the input for.
X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features), default='no_validation'
The input samples. If 'no_validation', no validation is performed on X. This is useful for meta-estimators, which can delegate input validation to their underlying estimator(s). In that case y must be passed and the only accepted check_params are multi_output and y_numeric.
y : array-like of shape (n_samples,), default='no_validation'
The targets.
- If None, check_array is called on X. If the estimator's requires_y tag is True, then an error will be raised.
- If 'no_validation', check_array is called on X and the estimator's requires_y tag is ignored. This is a default placeholder and is never meant to be explicitly set. In that case X must be passed.
- Otherwise, only y with _check_y or both X and y are checked with either check_array or check_X_y depending on validate_separately.
reset : bool, default=True
Whether to reset the n_features_in_ attribute. If False, the input will be checked for consistency with data provided when reset was last True.
Note
It is recommended to pass reset=True in fit and in the first call to partial_fit. All other methods that validate X should set reset=False.
validate_separately : False or tuple of dicts, default=False
Only used if y is not None. If False, call check_X_y. Else, it must be a tuple of kwargs to be used for calling check_array on X and y respectively.
estimator=self is automatically added to these dicts to generate a more informative error message in case of invalid input data.
skip_check_array : bool, default=False
If True, X and y are unchanged and only feature_names_in_ and n_features_in_ are checked. Otherwise, check_array is called on X and y.
**check_params : kwargs
Parameters passed to check_array or check_X_y. Ignored if validate_separately is not False.
estimator=self is automatically added to these params to generate a more informative error message in case of invalid input data.
Returns:
out : {ndarray, sparse matrix} or tuple of these
The validated input. A tuple is returned if both X and y are validated.