read_spss doesn't return the metadata · Issue #54264 · pandas-dev/pandas
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
`def read_spss(
    path: str | Path,
    usecols: Sequence[str] | None = None,
    convert_categoricals: bool = True,
) -> DataFrame:
    """
    Load an SPSS file from the file path, returning a DataFrame.

    .. versionadded:: 0.25.0

    Parameters
    ----------
    path : str or Path
        File path.
    usecols : list-like, optional
        Return a subset of the columns. If None, return all columns.
    convert_categoricals : bool, default is True
        Convert categorical columns into pd.Categorical.

    Returns
    -------
    DataFrame
    """
    pyreadstat = import_optional_dependency("pyreadstat")

    if usecols is not None:
        if not is_list_like(usecols):
            raise TypeError("usecols must be list-like.")
        else:
            usecols = list(usecols)  # pyreadstat requires a list

    df, _ = pyreadstat.read_sav(
        stringify_path(path), usecols=usecols, apply_value_formats=convert_categoricals
    )
    return df`
pandas has this function, which uses pyreadstat to read `.sav` files and returns a DataFrame. However, it discards the metadata that `pyreadstat.read_sav` returns alongside the data (the `_` in `df, _ = pyreadstat.read_sav(...)`). I guess some users rely on that metadata; otherwise, why use SPSS in the first place?
Feature Description
Return the metadata along with the dataframe
Alternative Solutions
`def read_spss(
    path: str | Path,
    usecols: Sequence[str] | None = None,
    convert_categoricals: bool = True,
) -> tuple[DataFrame, Any]:
    """
    Load an SPSS file from the file path, returning a DataFrame and its metadata.

    .. versionadded:: 0.25.0

    Parameters
    ----------
    path : str or Path
        File path.
    usecols : list-like, optional
        Return a subset of the columns. If None, return all columns.
    convert_categoricals : bool, default is True
        Convert categorical columns into pd.Categorical.

    Returns
    -------
    tuple of (DataFrame, metadata)
        The data and the metadata object returned by ``pyreadstat.read_sav``.
    """
    pyreadstat = import_optional_dependency("pyreadstat")

    if usecols is not None:
        if not is_list_like(usecols):
            raise TypeError("usecols must be list-like.")
        else:
            usecols = list(usecols)  # pyreadstat requires a list

    # Keep the metadata instead of discarding it with `_`.
    df, metadata = pyreadstat.read_sav(
        stringify_path(path), usecols=usecols, apply_value_formats=convert_categoricals
    )
    return df, metadata`
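Until something like this lands in pandas, a possible workaround is to call pyreadstat directly. The sketch below mirrors the arguments `read_spss` passes to `pyreadstat.read_sav`. The helper name `read_spss_with_metadata` is hypothetical and not part of pandas or pyreadstat, and it uses `str(path)` instead of pandas' internal `stringify_path`.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Sequence


def read_spss_with_metadata(
    path: str | Path,
    usecols: Sequence[str] | None = None,
    convert_categoricals: bool = True,
) -> tuple[Any, Any]:
    """Hypothetical helper: return (DataFrame, pyreadstat metadata)."""
    # Imported lazily so that defining the helper does not require pyreadstat.
    import pyreadstat

    if usecols is not None:
        usecols = list(usecols)  # pyreadstat requires a list

    # read_sav returns the data and a metadata container; keep both.
    df, metadata = pyreadstat.read_sav(
        str(path), usecols=usecols, apply_value_formats=convert_categoricals
    )
    return df, metadata
```

Assuming a file such as `survey.sav` exists (the name is made up), `df, meta = read_spss_with_metadata("survey.sav")` should give access to things like `meta.column_names_to_labels` and `meta.variable_value_labels`, which is the information the current `read_spss` drops.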
Additional Context
No response