Allow IntervalIndex constructor closed parameter to override inferred closed · Issue #19370 · pandas-dev/pandas (original) (raw)

Code Sample, a copy-pastable example if possible

The closed parameter of IntervalIndex is currently a bit inconsistent, and it's behavior is input dependent. By my count, there are 3 different behaviors:

1. closed parameter overrides the inferred closed when the input data is empty or purely NA

In [2]: pd.IntervalIndex([], closed='both') Out[2]: IntervalIndex([] closed='both', dtype='interval[int64]')

In [3]: pd.IntervalIndex([np.nan, np.nan], closed='neither') Out[3]: IntervalIndex([nan, nan] closed='neither', dtype='interval[float64]')

2. closed is ignored when the input data is an IntervalIndex:

In [4]: ii = pd.interval_range(0, 3, closed='both')

In [5]: ii Out[5]: IntervalIndex([[0, 1], [1, 2], [2, 3]] closed='both', dtype='interval[int64]')

In [6]: pd.IntervalIndex(ii, closed='neither') Out[6]: IntervalIndex([[0, 1], [1, 2], [2, 3]] closed='both', dtype='interval[int64]')

3. closed raises a ValueError when a list-like of Interval objects with a conflicting value for closed is passed as input data:

In [7]: ivs = [pd.Interval(0, 1, closed='both'), pd.Interval(1, 2, closed='both')]

In [8]: pd.IntervalIndex(ivs, closed='neither')

ValueError: conflicting values for closed: constructor got 'neither', inferred from data 'both'

xref #18421 where I implemented this behavior; I'm no longer sure it's the best approach to take.

Problem description

This behavior is inconsistent, and could lead to confusion. As it's currently implemented, the closed parameter is useless in the majority of cases, either doing nothing or raising an error.

Expected Output

I'd expect there to be consistency as to how the closed parameter is handled. My preference would be to have it override the inferred closed if explicitly passed by the user. Note that this is consistent with how the dtype parameter is used in constructors throughout the codebase.

xref #19371 (may render this issue moot)