COMPAT: clarify Index integer conversions when dtype is specified in construction · Issue #15187 · pandas-dev/pandas
[8] and [9] show our current model: we make an effort to convert to the specified dtype, but will coerce to an available type if the data is not valid for that dtype.
Ideally we would also be consistent w.r.t. #15145, i.e. Series construction with a specified dtype (note that we upcast to the available Index types but don't do this for Series, e.g. [18]).
So we should be consistent on this.
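The model described above (try the requested dtype, fall back to an available one when the data cannot be represented) can be sketched with plain NumPy. This is only an illustration under assumptions, not the actual pandas implementation: `coerce_or_fallback` is a hypothetical helper name, and it only covers the NaN case from [8] and [9].

```python
import math

import numpy as np


def coerce_or_fallback(values, dtype):
    """Hypothetical helper (not a pandas API): honor an integer ``dtype``
    when possible, otherwise fall back to float64 as in [8] and [9]."""
    dtype = np.dtype(dtype)
    if dtype.kind not in "iu":
        # Only integer targets are of interest in this sketch.
        return np.array(values, dtype=dtype)

    has_missing = any(isinstance(v, float) and math.isnan(v) for v in values)
    if has_missing:
        # NaN has no integer representation, so use a float array instead.
        return np.array(values, dtype="float64")

    return np.array(values, dtype=dtype)


print(coerce_or_fallback([np.nan], "int64").dtype)   # float64
print(coerce_or_fallback([np.nan], "uint64").dtype)  # float64
print(coerce_or_fallback([1, 2, 3], "int64").dtype)  # int64
```

Whether Series construction should apply the same fallback is the open question raised here; the sketch takes no position on it.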
In [8]: Index([np.nan],dtype='int64')
Out[8]: Float64Index([nan], dtype='float64')
In [9]: Index([np.nan],dtype='uint64')
Out[9]: Float64Index([nan], dtype='float64')
In [10]: Index([np.iinfo(np.int64).max-1],dtype='int64')
Out[10]: Int64Index([9223372036854775806], dtype='int64')
In [11]: Index([np.iinfo(np.int64).max-1],dtype='uint64')
Out[11]: UInt64Index([9223372036854775806], dtype='uint64')
# I guess this should convert to Float64Index?
In [12]: Index([np.iinfo(np.uint64).max-1],dtype='int64')
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-12-f2ec7d3c38a4> in <module>()
----> 1 Index([np.iinfo(np.uint64).max-1],dtype='int64')
/Users/jreback/pandas/pandas/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
318 # other iterable of some kind
319 subarr = _asarray_tuplesafe(data, dtype=object)
--> 320 return Index(subarr, dtype=dtype, copy=copy, name=name, **kwargs)
321
322 """
/Users/jreback/pandas/pandas/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
199 inferred = lib.infer_dtype(data)
200 if inferred == 'integer':
--> 201 data = np.array(data, copy=copy, dtype=dtype)
202 elif inferred in ['floating', 'mixed-integer-float']:
203
OverflowError: Python int too large to convert to C long
In [13]: Index([np.iinfo(np.uint64).max-1],dtype='uint64')
Out[13]: UInt64Index([18446744073709551614], dtype='uint64')
In [14]: Index([-1], dtype='int64')
Out[14]: Int64Index([-1], dtype='int64')
# this looks wrong
In [15]: Index([-1], dtype='uint64')
Out[15]: UInt64Index([18446744073709551615], dtype='uint64')
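Both the OverflowError in [12] and the wraparound in [15] involve values outside the range of the requested integer dtype. A minimal sketch of an explicit bounds check using `np.iinfo` follows; `fits_integer_dtype` is a hypothetical name for illustration, not a pandas or NumPy function, and what Index should do when the check fails is left open.

```python
import numpy as np


def fits_integer_dtype(values, dtype):
    """Hypothetical check: True if every integer in ``values`` lies
    within the representable range of the integer ``dtype``."""
    info = np.iinfo(np.dtype(dtype))
    return all(info.min <= int(v) <= info.max for v in values)


int64_max = int(np.iinfo(np.int64).max)
uint64_max = int(np.iinfo(np.uint64).max)

print(fits_integer_dtype([-1], "int64"))               # True
print(fits_integer_dtype([-1], "uint64"))              # False, the case in [15]
print(fits_integer_dtype([uint64_max - 1], "int64"))   # False, the case in [12]
print(fits_integer_dtype([uint64_max - 1], "uint64"))  # True
print(fits_integer_dtype([int64_max - 1], "uint64"))   # True
```

A check like this could run before the cast, so that out-of-range input is handled deliberately (raise, or fall back to another dtype) instead of wrapping around.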
# we do this type of same-dtype upcasting already (this is correct / good thing)
In [18]: Index([np.iinfo(np.int32).max+1], dtype='int64')
Out[18]: Int64Index([2147483648], dtype='int64')