pd.merge fails on datetime columns with tzinfo · Issue #11405 · pandas-dev/pandas

Since pandas-0.17, a merge on a datetime column fails if the datetimes are tz-aware; see the example below. Possibly related to #9663?

import pandas as pd
from datetime import datetime
from dateutil.tz import gettz
import sys
import traceback as tbm
# works: merge on tz-naive datetimes
a = pd.DataFrame({'created' : [datetime(2015,10,10), 
                               datetime(2015,10,20)], 
                  'count' : [1,2]})
b = pd.DataFrame({'created' : [datetime(2015,10,10), 
                               datetime(2015,10,20)], 
                  'count' : [1,2]})
pd.merge(a, b, how='outer')
# doesn't work: merge on tz-aware datetimes (used to work on pandas-0.16.2)
try:
    utc = gettz('UTC')
    a = pd.DataFrame({'created' : [datetime(2015,10,10, tzinfo=utc), 
                                   datetime(2015,10,20, tzinfo=utc)], 
                      'count' : [1,2]})
    b = pd.DataFrame({'created' : [datetime(2015,10,10, tzinfo=utc), 
                                   datetime(2015,10,20, tzinfo=utc)], 
                      'count' : [1,2]})
    pd.merge(a, b, how='outer')
except Exception as e:
    print "Yeah, doesn't work: %s" % e
    _, _, tb = sys.exc_info()
    # n-th traceback entry as (file, line number, function, source text)
    stack = lambda n : tbm.extract_tb(tb, 99)[n][0:]
    print "called from", stack(0)
    print "failing statement", stack(-1)
Yeah, doesn't work: type object argument after * must be a sequence, not itertools.imap
called from ('<ipython-input-194-3c3669b26a55>', 23, '<module>', u"pd.merge(a, b, how='outer')")
failing statement ('/.../local/lib/python2.7/site-packages/pandas/tools/merge.py', 516, '_get_join_indexers', 'llab, rlab, shape = map(list, zip( * map(fkeys, left_keys, right_keys)))')

The culprit seems to be the call to _factorize_keys, though I couldn't quite figure out what goes wrong.
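
One possible reading of the error message, offered as an assumption rather than a diagnosis: the failing statement star-unpacks a lazy iterator, and on some Python versions (2.7.6 appears to be one of them) a TypeError raised while that iterator is being consumed can be reported as "argument after * must be a sequence" instead of with its original message. If that is what happens here, the real error would be raised inside the per-key helper and then hidden. The sketch below reproduces only the shape of the failing statement with the standard library; factorize_pair, join_labels and join_labels_eager are hypothetical names, not pandas functions.

```python
# Hypothetical stand-in for a per-key helper that rejects its input.
# It is NOT pandas' _factorize_keys, only an illustration.
def factorize_pair(left, right):
    if left is None or right is None:
        raise TypeError("unsupported key type")
    return [left], [right], 1


def join_labels(left_keys, right_keys):
    # Same shape as the failing statement: star-unpack a lazy mapping.
    # On Python 3, map() is lazy (like itertools.imap on Python 2), so
    # factorize_pair only runs while zip(*...) consumes the iterator.
    lazy = map(factorize_pair, left_keys, right_keys)
    return [list(t) for t in zip(*lazy)]


def join_labels_eager(left_keys, right_keys):
    # Building the list first means any error is raised by factorize_pair
    # before star-unpacking is involved, so its message cannot be replaced.
    results = [factorize_pair(l, r) for l, r in zip(left_keys, right_keys)]
    return [list(t) for t in zip(*results)]


if __name__ == "__main__":
    for fn in (join_labels, join_labels_eager):
        try:
            fn([None], ["x"])
        except TypeError as exc:
            # The message printed for join_labels depends on the Python version.
            print("%s: %s" % (fn.__name__, exc))
```

Running the eager variant against the real inputs (for example, by calling the helper directly on the two key columns) would be one way to see the underlying exception rather than the star-unpacking message.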

$ python --version
Python 2.7.6
$ pip freeze | grep pandas
pandas==0.17.0