Pandas isin with empty dataframe produces epoch value on datetime type instead of boolean · Issue #15473 · pandas-dev/pandas
I've noticed that calling isin on a DataFrame that contains a datetime column, with an empty DataFrame as the operand, produces epoch datetime values (i.e. 1970-01-01) instead of False. This does not look correct.
The following code demonstrates this:
import pandas as pd

data = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994', '2014-05-02 18:47:05.178768']}
data2 = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994']}

# Frame under test: three datetime rows
df = pd.DataFrame(data, columns=['date'])
df['date'] = pd.to_datetime(df['date'])

# Non-empty operand: matches the first two rows
df2 = pd.DataFrame(data2, columns=['date'])
df2['date'] = pd.to_datetime(df2['date'])

# Empty operands: one with a 'date' column, one with no columns at all
df3 = pd.DataFrame([], columns=['date'])
df4 = pd.DataFrame()

print(df.isin(df2))
print(df.isin(df3))
print(df.isin(df4))
This outputs:
date
0 True
1 True
2 False
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
I would expect a column of False values instead of 1970-01-01. With pandas 0.16.2 and numpy 1.9.2, df.isin(df3) produces the expected output:
date
0 False
1 False
2 False
But df.isin(df4) still returns the 1970-01-01 values shown above.
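Until this is resolved, one possible workaround is to special-case an empty operand before calling isin. The sketch below is only an illustration: `isin_or_false` is a hypothetical helper name, not part of pandas, and it assumes that an empty DataFrame operand should always yield an all-False boolean frame with the same index and columns as the caller.

```python
import pandas as pd


def isin_or_false(df, other):
    """Hypothetical helper (not a pandas API): behaves like df.isin(other),
    but returns an all-False boolean frame when `other` is an empty DataFrame."""
    if isinstance(other, pd.DataFrame) and other.empty:
        # Assumption: nothing can be "in" an empty frame, so every cell is False.
        return pd.DataFrame(False, index=df.index, columns=df.columns)
    return df.isin(other)


df = pd.DataFrame({'date': pd.to_datetime([
    '2014-05-01 18:47:05.069722',
    '2014-05-01 18:47:05.119994',
    '2014-05-02 18:47:05.178768',
])})
df2 = pd.DataFrame({'date': pd.to_datetime([
    '2014-05-01 18:47:05.069722',
    '2014-05-01 18:47:05.119994',
])})

result_match = isin_or_false(df, df2)                                     # falls through to df.isin
result_empty_cols = isin_or_false(df, pd.DataFrame([], columns=['date']))  # short-circuits
result_empty = isin_or_false(df, pd.DataFrame())                           # short-circuits

print(result_match)
print(result_empty_cols)
print(result_empty)
```

The guard only covers DataFrame operands; lists, dicts, and Series are passed straight through to isin unchanged.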
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-279.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 2.2
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: 5.2.2
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None