PythonParser::_check_thousands appears broken · Issue #4596 · pandas-dev/pandas
Description
This code appears broken:
def _check_thousands(self, lines):
    if self.thousands is None:
        return lines
    nonnum = re.compile('[^-^0-9^%s^.]+' % self.thousands)
    ret = []
    for l in lines:
        rl = []
        for x in l:
            if (not isinstance(x, compat.string_types) or
                    self.thousands not in x or
                    nonnum.search(x.strip())):
                rl.append(x)
            else:
                rl.append(x.replace(',', ''))
        ret.append(rl)
    return ret
It looks like the thousands argument to the class is used to check whether a value is "non numeric", but a hard-coded comma is then used when actually performing the cleaning, so any separator other than ',' is never removed.
In addition to fixing this, I would recommend factoring out this method so that it can be used elsewhere.
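A minimal sketch of what the fix and the factoring-out could look like, not the actual pandas change. The standalone function name strip_thousands is hypothetical, and two details are assumptions beyond what is described above: the separator is passed through re.escape so that characters such as '.' are treated literally, and the extra '^' characters inside the character class are dropped (in the original pattern they make a literal '^' count as numeric). It also uses str in place of compat.string_types, assuming Python 3.

```python
import re


def strip_thousands(lines, thousands=None):
    """Remove the thousands separator from numeric-looking strings.

    Hypothetical standalone version of _check_thousands; `lines` is a
    list of rows, each row a list of cell values.
    """
    if thousands is None:
        return lines

    # Matches any run of characters that cannot appear in a number
    # written with this separator (sign, digits, decimal point, separator).
    nonnum = re.compile(r"[^-0-9.%s]+" % re.escape(thousands))

    ret = []
    for row in lines:
        cleaned = []
        for x in row:
            if (not isinstance(x, str)
                    or thousands not in x
                    or nonnum.search(x.strip())):
                # Not a string, no separator present, or not numeric: keep as is.
                cleaned.append(x)
            else:
                # Use the configured separator instead of a hard-coded comma.
                cleaned.append(x.replace(thousands, ""))
        ret.append(cleaned)
    return ret


rows = [["1.234.567", "a.b", 5, "1,234"]]
print(strip_thousands(rows, thousands="."))
```

With thousands='.', the original method would leave "1.234.567" untouched because it only ever replaces ','. In this sketch the separator is removed, while the non-numeric "a.b", the integer 5, and "1,234" (which contains no '.') are passed through unchanged.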