pandas.DataFrame.select_dtypes — pandas 3.0.0.dev0+2095.g2e141aaf99 documentation (original) (raw)
DataFrame.select_dtypes(include=None, exclude=None)[source]#
Return a subset of the DataFrame’s columns based on the column dtypes.
This method allows for filtering columns based on their data types. It is useful when working with heterogeneous DataFrames where operations need to be performed on a specific subset of data types.
Parameters:
include, excludescalar or list-like
A selection of dtypes or strings to be included/excluded. At least one of these parameters must be supplied.
Returns:
DataFrame
The subset of the frame including the dtypes in include
and excluding the dtypes in exclude
.
Raises:
ValueError
- If both of
include
andexclude
are empty - If
include
andexclude
have overlapping elements
TypeError
- If any kind of string dtype is passed in.
Notes
- To select all numeric types, use
np.number
or'number'
- To select strings you must use the
object
dtype, but note that this will return all object dtype columns. Withpd.options.future.infer_string
enabled, using"str"
will work to select all string columns. - See the numpy dtype hierarchy
- To select datetimes, use
np.datetime64
,'datetime'
or'datetime64'
- To select timedeltas, use
np.timedelta64
,'timedelta'
or'timedelta64'
- To select Pandas categorical dtypes, use
'category'
- To select Pandas datetimetz dtypes, use
'datetimetz'
or'datetime64[ns, tz]'
Examples
df = pd.DataFrame( ... {"a": [1, 2] * 3, "b": [True, False] * 3, "c": [1.0, 2.0] * 3} ... ) df a b c 0 1 True 1.0 1 2 False 2.0 2 1 True 1.0 3 2 False 2.0 4 1 True 1.0 5 2 False 2.0
df.select_dtypes(include="bool") b 0 True 1 False 2 True 3 False 4 True 5 False
df.select_dtypes(include=["float64"]) c 0 1.0 1 2.0 2 1.0 3 2.0 4 1.0 5 2.0
df.select_dtypes(exclude=["int64"]) b c 0 True 1.0 1 False 2.0 2 True 1.0 3 False 2.0 4 True 1.0 5 False 2.0