DataFrame.replace(dict) has weird behaviour in some cases · Issue #5338 · pandas-dev/pandas
import pandas as pd

df = pd.DataFrame({"color":[1,2,3,4]})
print df
   color
0      1
1      2
2      3
3      4

print df.replace({"color":{"1":"2","3":"4",}}) # works but shouldn't?
   color
0      2
1      2
2      4
3      4

print df.replace({"color":{"1":"2","2":"3","3":"4","4":"5"}}) # strange
   color
0      2
1      4
2      3
3      5

print df.replace({"color":{1:"2",2:"3",3:"4",4:"5"}}) # works by replacing each cell once
   color
0      2
1      3
2      4
3      5
df = pd.DataFrame({"color":["1","2","3","4"]})
print df
   color
0      1
1      2
2      3
3      4

print df.replace({"color":{"1":"2","3":"4",}}) # works
   color
0      2
1      2
2      4
3      4

print df.replace({"color":{"1":"2","2":"3","3":"4","4":"5"}}) # works not
   color
0      3
1      3
2      5
3      5

print df.replace({"color":{1:"2",2:"3",3:"4",4:"5"}}) # works as expected: shouldn't replace anything!
   color
0      1
1      2
2      3
3      4
So, my expected behaviour would be:
- don't replace a cell if the type of the cell does not match the type of the key (as is the case when a string cell is replaced via an int key)
- if a value of a cell is replaced, the cell shouldn't be replaced a second time in the same replace call
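
To make the two expectations concrete, here is a minimal pure-Python sketch of the semantics I have in mind. It is not how pandas implements `replace`; the helper name `replace_once` is hypothetical and only illustrates "match on type and value, replace each cell at most once".

```python
def replace_once(values, mapping):
    """Return a new list where each cell is replaced at most once.

    A cell is replaced only if a key with the same type AND the same
    value exists in the mapping (hypothetical helper, not a pandas API).
    """
    # Index the mapping by (type, value) so that 1 and "1" (and 1.0, True)
    # are treated as different keys.
    typed = {(type(key), key): new for key, new in mapping.items()}
    # Every cell is looked up exactly once against the ORIGINAL value,
    # so a replacement result is never fed back into the mapping.
    return [typed.get((type(cell), cell), cell) for cell in values]


ints = [1, 2, 3, 4]
strs = ["1", "2", "3", "4"]

# String keys must not touch int cells.
print(replace_once(ints, {"1": "2", "3": "4"}))
# Chained keys must not cascade: "1" becomes "2" and stays "2".
print(replace_once(strs, {"1": "2", "2": "3", "3": "4", "4": "5"}))
```

With these rules the first call leaves the int column untouched, and the second maps each string cell exactly one step forward.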
I found the problem when I tried to map string values to colors and got blown-up color values: for example, {"3":"#123456","4":"#000000"} converted "3" into "#123#00000056" instead of "#123456".
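
The blown-up value is what you would get if the replacements were applied one after another as substring substitutions, each on the result of the previous one. That mechanism is my assumption, not something confirmed here; the sketch below only reproduces the symptom in plain Python, and `chained_replace` is a hypothetical name.

```python
def chained_replace(value, mapping):
    """Apply every old -> new pair in turn as a substring replacement.

    Assumed mechanism for illustration only: each step operates on the
    output of the previous step, so earlier results can be rewritten.
    """
    for old, new in mapping.items():  # dicts keep insertion order (Python 3.7+)
        value = value.replace(old, new)
    return value


colors = {"3": "#123456", "4": "#000000"}

# Step 1: "3" becomes "#123456".
# Step 2: the "4" inside "#123456" is then replaced by "#000000".
print(chained_replace("3", colors))  # -> #123#00000056
```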
Edit: inserted the string-cell cases and my expected behaviour, and deleted the initial comments that had these examples.