pd.eval division operation upcasts float32 to float64

pd.eval returns float64 when dividing two float32 DataFrames, which is inconsistent with ordinary Python division of the same DataFrames (see code sample).

Pandas upcasts both terms to 64-bit floats when it detects a division, see:

https://github.com/pydata/pandas/blob/528108bba4104b939bcfe6923677ddacc916ff00/pandas/computation/ops.py#L453
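For comparison, here is a minimal sketch of what NumPy's own dtype promotion gives for a few operand pairs. It only illustrates the behaviour that plain `df/df` appears to follow; it does not call pandas or numexpr, and it makes no claim that numexpr applies identical rules. The names `pairs`, `promoted`, and `quotient` are purely illustrative.

```python
import numpy as np

# Operand dtype pairs to check against NumPy's promotion rules.
pairs = [
    (np.float32, np.float32),
    (np.float32, np.float64),
    (np.float32, np.int64),
]

# np.result_type reports the dtype NumPy would pick for each pair.
promoted = {pair: np.result_type(*pair) for pair in pairs}
for (left, right), result in promoted.items():
    print(left.__name__, right.__name__, "->", result)

# True division of two float32 arrays stays in float32.
a = np.arange(1, 4, dtype=np.float32)
quotient = a / a
print(quotient.dtype)
```

Only the float32/float32 pair stays in 32 bits; mixing in float64 or int64 already promotes to float64 without any explicit cast.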

I think numexpr can also handle mixed types and upcast automatically, though I am not 100% sure. I can submit a PR, but how do you recommend fixing this? Something like the following?

if truediv or PY3:
    for term in com.flatten(self):
        try:
            dt = term.values.dtype  # can .values be expensive?
        except AttributeError:
            dt = type(term)

        # leave float32 terms alone; upcast everything else as before
        if dt != np.float32:
            _cast_inplace([term], np.float_)

The downside is that if someone does 2 + df, they'll probably still end up upcasting it. Even so, this proposal is better than what we have today.

I might rewrite the above using filter, but for now I just wanted to discuss the general approach.
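To make the effect of the proposed rule concrete, here is a standalone sketch that applies it to plain NumPy arrays. `cast_terms_for_truediv` is a hypothetical helper written for this illustration; it is not the pandas implementation and does not use `com.flatten` or `_cast_inplace`. It assumes every term exposes a `dtype` attribute or can be converted with `np.asarray`.

```python
import numpy as np


def cast_terms_for_truediv(terms):
    """Hypothetical helper: upcast every term to float64 except float32 ones."""
    result = []
    for term in terms:
        dt = getattr(term, "dtype", None)
        if dt is not None and dt == np.float32:
            # Leave float32 operands untouched.
            result.append(term)
        else:
            # Everything else is upcast to float64, as in the current code.
            result.append(np.asarray(term, dtype=np.float64))
    return result


a32 = np.arange(1, 4, dtype=np.float32)
i64 = np.arange(1, 4, dtype=np.int64)

# float32 / float32: neither operand is cast, so the result stays float32.
lhs, rhs = cast_terms_for_truediv([a32, a32])
same_dtype = (lhs / rhs).dtype
print(same_dtype)

# float32 / int64: the int64 operand is upcast, so the result is float64.
lhs, rhs = cast_terms_for_truediv([a32, i64])
mixed_dtype = (lhs / rhs).dtype
print(mixed_dtype)
```

The scalar case mentioned above (such as `2 + df`) is deliberately left out of this sketch, since its result dtype depends on how NumPy promotes scalars.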

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(3, dtype=np.float32))
print('normal', (df/df).values.dtype)
print('pd_eval', pd.eval('df/df').values.dtype)
assert ((df/df).dtypes == pd.eval('df/df').dtypes).all()

Expected Output

normal float32
pd_eval float32

output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-4-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 8.0.2
setuptools: 19.6.2
Cython: 0.23.4
numpy: 1.10.4
scipy: None
statsmodels: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: 0.9.2
apiclient: 1.4.2
sqlalchemy: 1.0.9
pymysql: 0.6.7.None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
Jinja2: None
