Series.str.casefold · Issue #25405 · pandas-dev/pandas

Series.str.lower implements str.lower, as expected, and there are corresponding Series methods for the other string casing methods that date back to Python 2. However, Python 3's str.casefold is missing. Case folding makes string equality and other caseless comparisons more reliable, because it handles a greater variety of characters, as specified by the Unicode Standard; for example, it maps the German "ß" to "ss", which str.lower leaves unchanged. It would be nice to have Series.str.casefold in pandas.
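To illustrate the difference, here is a minimal plain-Python sketch comparing str.lower with str.casefold. The sample words are chosen only for illustration and are not from any real dataset.

```python
# Two spellings of the same German word, used only as sample data.
words = ["Straße", "STRASSE"]

# Lowercasing keeps "ß" as-is, so the two spellings still differ.
lowered = [w.lower() for w in words]

# Case folding maps "ß" to "ss", so the two spellings become identical.
folded = [w.casefold() for w in words]

lower_equal = lowered[0] == lowered[1]
folded_equal = folded[0] == folded[1]

print(lowered, lower_equal)
print(folded, folded_equal)
```

A caseless comparison built on str.lower would treat these two spellings as different strings, while one built on str.casefold would treat them as equal.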

The current workaround is more verbose and slower than Series.str.lower:

pd.Series(s.casefold() if isinstance(s, str) else s for s in series)

Further, this alternative encourages a frustrating mistake: forgetting to keep the original Series' index. The new Series then gets a default integer index, which causes misalignment if it needs to be inserted into the same DataFrame as the original.
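The index pitfall can be sketched as follows. This is only an illustration under stated assumptions: the helper name casefold_series, the column names, and the sample data are all hypothetical, and Series.map is just one possible way to keep the index.

```python
import pandas as pd


def casefold_series(series):
    """Hypothetical helper: casefold string elements, pass others through.

    Series.map returns a new Series that keeps the index of the input.
    """
    return series.map(lambda s: s.casefold() if isinstance(s, str) else s)


# Sample frame with a non-default index and a missing value (illustrative only).
df = pd.DataFrame({"name": ["Straße", None, "ÉCOLE"]}, index=[10, 20, 30])

# Generator-based workaround: the result gets a fresh 0..n-1 index.
naive = pd.Series(s.casefold() if isinstance(s, str) else s for s in df["name"])

# Assignment aligns on index labels; none of 0, 1, 2 match 10, 20, 30,
# so every value in the new column ends up missing.
df["naive"] = naive

# Index-preserving variant: the labels match, so the values land in the right rows.
df["folded"] = casefold_series(df["name"])

print(df)
```

Passing index=series.index to the pd.Series constructor in the generator-based version would also avoid the misalignment, but it is easy to forget, which is the mistake described above.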

Apologies for double-posting. I used the wrong account for #25404.