BUG(?): ArrowExtensionArray.median is an "approximate median" · Issue #52679 · pandas-dev/pandas
Description
ArrowExtensionArray.median is implemented using pyarrow.compute.approximate_median, which might produce surprising results for users looking for a "true" median:
In [1]: import pandas as pd
In [2]: pd.Series([1, 2], dtype="float64").median()
Out[2]: 1.5
In [3]: pd.Series([1, 2], dtype="float64[pyarrow]").median()
Out[3]: 1.0
It does not look like pyarrow provides a compute function for a "true" median. Is this intentional? Should it be documented somehow?
cc @mroeschke