Series.map should return default dictionary values rather than NaN

collections.Counter and collections.defaultdict both supply default values for missing keys. However, pandas.Series.map does not respect these defaults and instead returns missing values (NaN).
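For reference, here is a minimal standard-library sketch of where those defaults come from. It only shows plain Python behaviour and makes no claim about how pandas performs the lookup internally; the names `counter` and `fallback` are illustrative.

```python
from collections import Counter, defaultdict

counter = Counter()
counter[1] += 1

# Illustrative defaultdict whose default value is the string "unknown".
fallback = defaultdict(lambda: "unknown", {1: "one"})

# .get() and membership tests do not trigger the default.
counter_via_get = counter.get(2)      # None
fallback_via_get = fallback.get(2)    # None
key_present = 2 in counter            # False

# Indexing goes through __getitem__, which falls back to __missing__.
counter_via_index = counter[2]        # 0
fallback_via_index = fallback[2]      # "unknown" (this also inserts key 2)
```

So a default is only produced when the mapping is indexed with `mapping[key]`.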

The issue is illustrated below:

import pandas
from collections import Counter, defaultdict

input = pandas.Series(range(5))

# Counter returns 0 for any key that has not been counted
counter = Counter()
counter[1] += 1

output = input.map(counter)
expected = input.map(lambda x: counter[x])

pandas.DataFrame({
    'input': input,
    'output': output,
    'expected': expected,
})

Here's the output:

   expected  input  output
0         0      0     NaN
1         1      1     1.0
2         0      2     NaN
3         0      3     NaN
4         0      4     NaN

The workaround is rather easy (lambda x: dictionary[x]), so the change shouldn't be too hard to implement. Are people on board with the change? Is there a performance concern with looking up each key independently?
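A minimal sketch of the workaround, wrapped in a helper for convenience. `map_with_defaults` is a hypothetical name and not part of pandas; it just indexes the mapping once per element so that the mapping's own defaults apply.

```python
from collections import Counter, defaultdict

import pandas as pd


def map_with_defaults(series, mapping):
    """Hypothetical helper: look up each value with mapping[key]
    so that Counter/defaultdict defaults are used for missing keys."""
    return series.map(lambda key: mapping[key])


values = pd.Series(range(5))

counter = Counter()
counter[1] += 1
counted = map_with_defaults(values, counter)

# Illustrative defaultdict with a string default.
labels = defaultdict(lambda: "unknown", {1: "one"})
labelled = map_with_defaults(values, labels)
```

One thing to keep in mind: indexing a defaultdict inserts every missing key it is asked for, so `labels` grows as a side effect, whereas Counter returns 0 without storing the key.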