DataFrame.apply not working with datetimes · Issue #6125 · pandas-dev/pandas
When you use apply on a DataFrame that contains datetimes, the result is unexpected. With a DataFrame containing just integers and strings, apply returns the market names as expected:

    import pandas as pd

    positions = pd.DataFrame([[1, 'ABC', 50],
                              [1, 'YUM', 20],
                              [1, 'DEF', 20],
                              [2, 'ABC', 50],
                              [2, 'YUM', 20],
                              [2, 'DEF', 20]],
                             columns=['a', 'market', 'position'])
    positions.apply(lambda r: r['market'], axis=1)
    Out[210]:
    0    ABC
    1    YUM
    2    DEF
    3    ABC
    4    YUM
    5    DEF
    dtype: object
If we replace the data in column 'a' with datetimes, then we get the wrong result - the first value in the market column is repeated:
    import datetime

    positions = pd.DataFrame([[datetime.datetime(2013, 1, 1), 'ABC', 50],
                              [datetime.datetime(2013, 1, 1), 'YUM', 20],
                              [datetime.datetime(2013, 1, 1), 'DEF', 20],
                              [datetime.datetime(2013, 1, 2), 'ABC', 50],
                              [datetime.datetime(2013, 1, 2), 'YUM', 20],
                              [datetime.datetime(2013, 1, 2), 'DEF', 20]],
                             columns=['a', 'market', 'position'])
    positions.apply(lambda r: r['market'], axis=1)
    Out[213]:
    0    ABC
    1    ABC
    2    ABC
    3    ABC
    4    ABC
    5    ABC
    dtype: object
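The comparison described above can be written as a self-checking snippet. This is only a minimal sketch: the names `expected` and `matches` are introduced here for illustration and are not part of the original report. On a pandas build without the bug, the result of apply should equal the market column row for row; on an affected build, `matches` would be False.

```python
import datetime

import pandas as pd

# Same data as in the report: datetimes in column 'a'.
positions = pd.DataFrame(
    [[datetime.datetime(2013, 1, 1), 'ABC', 50],
     [datetime.datetime(2013, 1, 1), 'YUM', 20],
     [datetime.datetime(2013, 1, 1), 'DEF', 20],
     [datetime.datetime(2013, 1, 2), 'ABC', 50],
     [datetime.datetime(2013, 1, 2), 'YUM', 20],
     [datetime.datetime(2013, 1, 2), 'DEF', 20]],
    columns=['a', 'market', 'position'])

result = positions.apply(lambda r: r['market'], axis=1)

# Expected behaviour: apply hands back the market value of each row,
# so the result should match the 'market' column element by element.
expected = positions['market']
matches = list(result) == list(expected)
print(matches)
```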
If you replace the lambda function with a function which prints the object passed in, then you can see that you only ever receive the first row of the dataframe:
    def print_input(r):
        print(r)
        return 1

    positions.apply(print_input, axis=1)
    a           2013-01-01 00:00:00
    market                      ABC
    position                     50
    Name: 0, dtype: object
    a           2013-01-01 00:00:00
    market                      ABC
    position                     50
    Name: 1, dtype: object
    a           2013-01-01 00:00:00
    market                      ABC
    position                     50
    Name: 2, dtype: object
    a           2013-01-01 00:00:00
    market                      ABC
    position                     50
    Name: 3, dtype: object
    a           2013-01-01 00:00:00
    market                      ABC
    position                     50
    Name: 4, dtype: object
    a           2013-01-01 00:00:00
    market                      ABC
    position                     50
    Name: 5, dtype: object
    Out[215]:
    0    1
    1    1
    2    1
    3    1
    4    1
    5    1
    dtype: int64
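Instead of reading the printed rows by eye, the same diagnosis can be done by recording what the function receives. This is a rough sketch under assumptions: `record_input` and the `seen` dictionary are hypothetical helpers, not part of the original report. Keying by the row name means a row that happens to be passed in more than once is only counted once. On a build without the bug, each row name should map to its own market value.

```python
import datetime

import pandas as pd

positions = pd.DataFrame(
    [[datetime.datetime(2013, 1, 1), 'ABC', 50],
     [datetime.datetime(2013, 1, 1), 'YUM', 20],
     [datetime.datetime(2013, 1, 1), 'DEF', 20],
     [datetime.datetime(2013, 1, 2), 'ABC', 50],
     [datetime.datetime(2013, 1, 2), 'YUM', 20],
     [datetime.datetime(2013, 1, 2), 'DEF', 20]],
    columns=['a', 'market', 'position'])

# Hypothetical helper: maps each row label to the market value
# that the applied function actually saw for that row.
seen = {}

def record_input(r):
    seen[r.name] = r['market']
    return 1

out = positions.apply(record_input, axis=1)

# Compare what the function saw against the data in the frame.
received = [seen[i] for i in range(len(positions))]
print(received == list(positions['market']))
```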
This is new on master; I did not see it in pandas 0.11.0 or 0.13.0.