COMPAT: clarify Index integer conversions when dtype is specified in construction · Issue #15187 · pandas-dev/pandas

So [8] and [9] reflect our current model: we make an effort to convert to the specified dtype, but coerce to an available type if the data is not valid for that dtype.

Ideally we would also be consistent w.r.t. #15145, i.e. Series construction with a specified dtype (note that we upcast to the available Index types but don't do this for Series, e.g. [18]).

So we should be consistent on this.
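
To make the "try the requested dtype, otherwise coerce" model concrete, here is a minimal pure-Python sketch of one policy that would be consistent. The helper name `resolve_index_dtype` is hypothetical and not part of pandas. The float64 fallback for out-of-range integers is an assumption, since the issue only suggests it tentatively for [12].

```python
import math

# Representable ranges for the two integer dtypes discussed in this issue.
BOUNDS = {
    "int64": (-2**63, 2**63 - 1),
    "uint64": (0, 2**64 - 1),
}


def resolve_index_dtype(values, requested):
    """Hypothetical helper (not a pandas API): return the dtype name a
    consistent policy might choose for `values` given `requested`.

    Assumption: anything not representable in `requested` falls back
    to 'float64'.
    """
    lo, hi = BOUNDS[requested]
    for v in values:
        if isinstance(v, float):
            # NaN, inf and fractional floats have no integer representation.
            if not v.is_integer():
                return "float64"
            v = int(v)
        if not (lo <= v <= hi):
            # Out of range for the requested integer dtype.
            return "float64"
    return requested


print(resolve_index_dtype([math.nan], "int64"))
print(resolve_index_dtype([2**64 - 2], "int64"))
print(resolve_index_dtype([-1], "uint64"))
```

Under this assumed policy, NaN, a value that is too large, and a negative value requested as unsigned all take the same fallback path instead of one coercing, one raising, and one wrapping around.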

In [8]: Index([np.nan],dtype='int64')
Out[8]: Float64Index([nan], dtype='float64')

In [9]: Index([np.nan],dtype='uint64')
Out[9]: Float64Index([nan], dtype='float64')

In [10]: Index([np.iinfo(np.int64).max-1],dtype='int64')
Out[10]: Int64Index([9223372036854775806], dtype='int64')

In [11]: Index([np.iinfo(np.int64).max-1],dtype='uint64')
Out[11]: UInt64Index([9223372036854775806], dtype='uint64')

# I guess this should convert to Float64Index?
In [12]: Index([np.iinfo(np.uint64).max-1],dtype='int64')
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-12-f2ec7d3c38a4> in <module>()
----> 1 Index([np.iinfo(np.uint64).max-1],dtype='int64')

/Users/jreback/pandas/pandas/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    318             # other iterable of some kind
    319             subarr = _asarray_tuplesafe(data, dtype=object)
--> 320             return Index(subarr, dtype=dtype, copy=copy, name=name, **kwargs)
    321 
    322     """

/Users/jreback/pandas/pandas/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    199                         inferred = lib.infer_dtype(data)
    200                         if inferred == 'integer':
--> 201                             data = np.array(data, copy=copy, dtype=dtype)
    202                         elif inferred in ['floating', 'mixed-integer-float']:
    203 

OverflowError: Python int too large to convert to C long

In [13]: Index([np.iinfo(np.uint64).max-1],dtype='uint64')
Out[13]: UInt64Index([18446744073709551614], dtype='uint64')

In [14]: Index([-1], dtype='int64')
Out[14]: Int64Index([-1], dtype='int64')

# this looks wrong
In [15]: Index([-1], dtype='uint64')
Out[15]: UInt64Index([18446744073709551615], dtype='uint64')

# we do this type of same-dtype upcasting already (this is correct / good thing)
In [18]: Index([np.iinfo(np.int32).max+1], dtype='int64')
Out[18]: Int64Index([2147483648], dtype='int64')
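
The arithmetic behind [12] and [15] can be checked in plain Python. This is only a sketch of the 64-bit ranges involved. `wrap_to_uint64` is a hypothetical name, and that the value in Out[15] comes from an unchecked modulo-2**64 conversion is an assumption based on the number shown.

```python
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def wrap_to_uint64(value):
    """Reduce a Python int modulo 2**64, as an unchecked C-style
    conversion to an unsigned 64-bit integer would."""
    return value % 2**64


# -1 wraps to the largest uint64 value, the number shown in Out[15].
print(wrap_to_uint64(-1))  # 18446744073709551615

# The value used in [12] exceeds the int64 range, so it cannot be stored
# as int64 without raising or changing the dtype.
print(UINT64_MAX - 1 > INT64_MAX)  # True
```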