API: read_stata with index_col=None should return RangeIndex by topper-123 · Pull Request #49745 · pandas-dev/pandas
If read_stata was used with the parameter index_col=None, an index based on np.arange was supplied to the constructed DataFrame, i.e. (pre pandas 2.0) an Int64Index.
np.arange has dtype np.int_, i.e. like np.intp, except that it is always 32-bit on Windows. That makes it annoying to use in tests when indexes can take all numpy numeric dtypes (as they can after #49560), so I'm looking into how arange is used in #49560. One place I found it used is read_stata, and in that case it's better to use a range, so we get a RangeIndex instead of an Index[int_] when using read_stata(index_col=None).
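To illustrate the difference, here is a minimal sketch that compares an index built from np.arange with one built from a Python range. It is not the read_stata implementation, and n_rows is a hypothetical value chosen only for the example.

```python
import numpy as np
import pandas as pd

n_rows = 3  # hypothetical row count, for illustration only

# Index built from np.arange: a materialized integer index. On pandas >= 2.0
# its dtype follows the array, so it can differ between platforms.
arange_index = pd.Index(np.arange(n_rows))

# Index built from a Python range: a RangeIndex, which stores only
# start, stop and step and does not depend on the platform integer size.
range_index = pd.Index(range(n_rows))

print(type(arange_index).__name__, arange_index.dtype)
print(type(range_index).__name__, range_index.dtype)

# The two indexes hold the same labels; only the index type differs.
same_labels = list(arange_index) == list(range_index)
print(same_labels)
```

Both indexes label the rows identically, so the change should only be visible to code that checks the index type or dtype.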
This is a slight change in API, so I have separated it out into its own PR here, so that #49560, which is a large PR, can be as focused as possible.