BUG: read_csv float_precision="round_trip" parser does not handle initial/trailing spaces · Issue #43713 · pandas-dev/pandas
import io
import numpy as np
import pandas as pd
DATA = """
id\tnum\n
1\t1.2 \n
1\t 2.1\n
2\t 1 \n
2\t 1.2 \n
"""
df = pd.read_csv(
    io.StringIO(DATA),
    float_precision="round_trip",
    skipinitialspace=True,
    sep="\t",
    header=0,
    dtype={1: np.float64},
)
print(df)
read_csv(..., float_precision="round_trip")
does not parse fields with initial/trailing spaces and raises an error:
Traceback (most recent call last):
File "pandas/_libs/parsers.pyx", line 1141, in pandas._libs.parsers.TextReader._convert_tokens
false_values=false_values)
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ales/.config/JetBrains/PyCharmCE2021.2/scratches/scratch_2.py", line 14, in <module>
df = pd.read_csv(
File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 686, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 458, in _read
data = parser.read(nrows)
File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 1196, in read
ret = self._engine.read(nrows)
File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 2155, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
cdef:
File "pandas/_libs/parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
else:
File "pandas/_libs/parsers.pyx", line 941, in pandas._libs.parsers.TextReader._read_rows
self.parser.line_fields[i] + \
File "pandas/_libs/parsers.pyx", line 1073, in pandas._libs.parsers.TextReader._convert_column_data
col_res, na_count = self._convert_with_dtype(
File "pandas/_libs/parsers.pyx", line 1149, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: cannot safely convert passed user dtype of float64 for object dtyped data in column 1
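On affected versions, one possible workaround is to read the column as text, strip the whitespace, and convert it with Python's own `float()`, which is correctly rounded. This is only a sketch: the helper name `read_with_stripped_floats` is hypothetical, the sample is written without the blank lines of the reproducer, and it assumes the column has no missing values.

```python
import io

import pandas as pd

# Same tab-separated sample as in the report, with stray spaces around the numbers.
DATA = "id\tnum\n1\t1.2 \n1\t 2.1\n2\t 1 \n2\t 1.2 \n"


def read_with_stripped_floats(text, column="num"):
    """Hypothetical helper: read `column` as text, strip it, then convert to float."""
    # Reading as str sidesteps the float parser entirely, so no cast error is raised.
    df = pd.read_csv(io.StringIO(text), sep="\t", dtype={column: str})
    # float() is applied per value after stripping; assumes no missing entries.
    df[column] = df[column].str.strip().map(float).astype("float64")
    return df


df = read_with_stripped_floats(DATA)
print(df)
```

This trades parsing speed for robustness, since the conversion runs in Python rather than in the C tokenizer.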
float_precision="round_trip"
should work the same as other float parsers.
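A minimal sketch for checking this expectation is to run the same input through each `float_precision` setting and collect either the parsed values or the error. The helper name `compare_float_parsers` is hypothetical, and the column is addressed by name rather than by position. On an affected version the "round_trip" entry would presumably hold the ValueError shown above, while the other entries hold the parsed floats.

```python
import io

import pandas as pd

# Sample from the report: values padded with leading and/or trailing spaces.
DATA = "id\tnum\n1\t1.2 \n1\t 2.1\n2\t 1 \n2\t 1.2 \n"


def compare_float_parsers(text):
    """Hypothetical helper: parse `text` once per float_precision setting."""
    results = {}
    for precision in (None, "high", "round_trip"):
        try:
            df = pd.read_csv(
                io.StringIO(text),
                sep="\t",
                skipinitialspace=True,
                dtype={"num": "float64"},
                float_precision=precision,
            )
            results[precision] = df["num"].tolist()
        except ValueError as exc:
            # Record the failure instead of aborting, so the settings can be compared.
            results[precision] = f"{type(exc).__name__}: {exc}"
    return results


results = compare_float_parsers(DATA)
for precision, outcome in results.items():
    print(repr(precision), "->", outcome)
```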