series finalized not correctly called in merge? · Issue #6923 · pandas-dev/pandas
I got some help from Jeff on Stack Overflow, but either I'm misunderstanding the way __finalize__ works, or there's a bug in how it's called. My intent is to preserve series metadata after two dataframes are merged, and I believe __finalize__ should be able to handle this.
I define a couple of dataframes, and assign metadata values to all the series:
import numpy as np
import pandas as pd
np.random.seed(10)
df1 = pd.DataFrame(np.random.randint(0, 4, (3, 2)), columns=['a', 'b'])
df2 = pd.DataFrame(np.random.randint(0, 4, (3, 2)), columns=['c', 'd'])
df1
a b
0 1 1
1 0 3
2 0 1
df2
c d
0 3 0
1 1 1
2 0 1
Then I assign the metadata field filename to each series:
pd.Series._metadata = ['name', 'filename']
for c1 in df1:
    df1[c1].filename = 'fname1.csv'
for c2 in df2:
    df2[c2].filename = 'fname2.csv'
Now I define __finalize__ for series, which as I understand it can propagate metadata from one series to another, for example when I merge. But when I define a __finalize__ that prints the metadata I've already assigned, the metadata appears to be gone by the time __finalize__ is called.
def finalize_ser(self, other, method=None, **kwargs):
    # Show what each side carries when __finalize__ is invoked.
    print('Self meta: {}'.format(getattr(self, 'filename', None)))
    print('Other meta: {}'.format(getattr(other, 'filename', None)))
    # Copy every declared metadata field from other onto self.
    for name in self._metadata:
        object.__setattr__(self, name, getattr(other, name, ''))
    return self

pd.Series.__finalize__ = finalize_ser
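For reference, this is the propagation contract as I understand it, sketched in plain Python rather than pandas. `TinySeries`, `scaled`, and `rebuild_from_values` are hypothetical names made up for illustration and are not pandas internals. The idea: an operation that calls `__finalize__` on its result with the source object carries the metadata over, while one that builds a new object from raw values without that call drops it. Whether merge does the latter is only my guess.

```python
class TinySeries:
    """Toy stand-in for a Series; not how pandas is implemented."""

    _metadata = ['name', 'filename']

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name

    def __finalize__(self, other, method=None, **kwargs):
        # Copy every declared metadata field from `other` onto `self`.
        for field in self._metadata:
            object.__setattr__(self, field, getattr(other, field, None))
        return self

    def scaled(self, factor):
        # Well-behaved operation: finalizes the result against its source.
        result = TinySeries([v * factor for v in self.values])
        return result.__finalize__(self, method='scaled')


def rebuild_from_values(series):
    # Builds the result from raw values only; __finalize__ never sees the
    # source, so fields such as `filename` are not carried over.
    return TinySeries(series.values, name=series.name)


s = TinySeries([1, 0, 0], name='a')
s.filename = 'fname1.csv'

kept = s.scaled(2)
lost = rebuild_from_values(s)

print(kept.filename)                    # fname1.csv
print(getattr(lost, 'filename', None))  # None
```

In this toy model the field survives only on the path that passes the source object to `__finalize__`, which is the behaviour I expected from merge.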
When I call merge, I never see the correct metadata printed:
df1.merge(df2, left_on=['a'], right_on=['c'], how='inner')
Self meta: None
Other meta: None
Self meta: None
Other meta: None
Self meta: None
Other meta: None
Self meta: None
Other meta: None
Out[5]:
a b c d
0 1 1 1 1
1 0 3 0 1
2 0 1 0 1
It appears the metadata is lost before it gets to the __finalize__ call, though it's still present on the original series (mgd below is the result of the merge call above):
df1.a.filename # => 'fname1.csv'
mgd.a.filename # => AttributeError
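A quick way to check which columns still carry the field is a small duck-typed helper. `metadata_report` is a hypothetical name, not a pandas function. It only assumes that iterating the object yields column labels and that indexing by a label returns the column, which holds for a DataFrame and, as in the stand-in below, for a plain dict.

```python
from types import SimpleNamespace


def metadata_report(frame, field):
    """Map each column label to its value of `field`, or None if unset."""
    return {label: getattr(frame[label], field, None) for label in frame}


# Stand-in data so the sketch runs without pandas:
# column 'a' has the field set, column 'c' does not.
columns = {
    'a': SimpleNamespace(filename='fname1.csv'),
    'c': SimpleNamespace(),
}
report = metadata_report(columns, 'filename')
print(report)  # {'a': 'fname1.csv', 'c': None}
```

Calling it as `metadata_report(df1, 'filename')` and `metadata_report(mgd, 'filename')` is how I would compare the original and merged frames side by side.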
Is this expected or is there a bug?