BUG/API: Data assignment dtype inference inconsistencies · Issue #14001 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

Assume we have an int DataFrame and assign float values to part of it.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))
df.iloc[0:2, 0:2] = np.array([[1, 2.1], [2, 2.2]], dtype=np.float64)
df
#   0    1
#0  1  2.1
#1  2  2.2
#2  0  0.0
df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))
df.iloc[0:2, 0] = np.array([1, 2], dtype=np.float64)
df
#     0  1
#0  1.0  0
#1  2.0  0
#2  0.0  0

Expected Output

I think the first result should be all float, since we are assigning values of float dtype.

df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64))
df.iloc[0:2, 0:2] = np.array([[1, 2.1], [2, 2.2]], dtype=np.float64)
df
#     0    1
#0  1.0  2.1
#1  2.0  2.2
#2  0.0  0.0
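
Until the inference is consistent, one way to get the all-float result shown above is to cast the frame explicitly before assigning. This is only a workaround sketch, not a proposed fix; it relies on nothing beyond `DataFrame.astype` and `iloc` assignment.

```python
import numpy as np
import pandas as pd

# Cast the int frame to float up front so the result does not
# depend on how the assignment infers the dtype.
df = pd.DataFrame(np.zeros((3, 2), dtype=np.int64)).astype(np.float64)
df.iloc[0:2, 0:2] = np.array([[1, 2.1], [2, 2.2]], dtype=np.float64)

# Both columns are float64 after the assignment.
print(df.dtypes)
```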

The inference occurs around here, and it should be something like:

if hasattr(value, 'dtype'):
    if is_dtype_equal(values.dtype, value.dtype):
        dtype = value.dtype
    else:
        dtype = _find_common_type(values.dtype, value.dtype)
elif is_scalar(value):
    dtype, _ = _infer_dtype_from_scalar(value)
else:
    # not sure this branch can be reached...
    dtype = 'infer'
...
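
To make the proposed rule easier to try out, here is a standalone sketch of the same branching using only public NumPy calls. The function name `proposed_setitem_dtype` is hypothetical, and `np.result_type` and `np.array(value).dtype` are stand-ins I am assuming for the internal `_find_common_type` and `_infer_dtype_from_scalar` helpers, so the results may differ from pandas in edge cases.

```python
import numpy as np


def proposed_setitem_dtype(existing_dtype, value):
    """Hypothetical helper: dtype the block should have after assignment."""
    existing_dtype = np.dtype(existing_dtype)
    if hasattr(value, "dtype"):
        if existing_dtype == value.dtype:
            # Same dtype on both sides: nothing to coerce.
            return np.dtype(value.dtype)
        # Different dtypes: take the common type.
        # (stand-in for the internal _find_common_type)
        return np.result_type(existing_dtype, value.dtype)
    if np.isscalar(value):
        # Stand-in for the internal _infer_dtype_from_scalar.
        return np.array(value).dtype
    # Anything else falls back to inference, as in the snippet above.
    return "infer"


# int64 block receiving a float64 array should become float64.
print(proposed_setitem_dtype(np.int64, np.array([1, 2], dtype=np.float64)))
```

With this rule, both assignments in the code sample would resolve to float64, because the assigned array is float64 in each case.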

output of pd.show_versions()

0.18.1