BUG: read_csv float_precision="round_trip" parser does not handle initial/trailing spaces · Issue #43713 · pandas-dev/pandas

import io

import numpy as np
import pandas as pd

DATA = """
id\tnum\n 1\t1.2 \n 1\t 2.1\n 2\t 1 \n 2\t 1.2 \n """

df = pd.read_csv(
    io.StringIO(DATA),
    float_precision="round_trip",
    skipinitialspace=True,
    sep="\t",
    header=0,
    dtype={1: np.float64},
)
print(df)

read_csv(..., float_precision="round_trip") does not parse fields with initial/trailing spaces and raises an error:

Traceback (most recent call last):
  File "pandas/_libs/parsers.pyx", line 1141, in pandas._libs.parsers.TextReader._convert_tokens
    false_values=false_values)
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ales/.config/JetBrains/PyCharmCE2021.2/scratches/scratch_2.py", line 14, in <module>
    df = pd.read_csv(
  File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 458, in _read
    data = parser.read(nrows)
  File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 1196, in read
    ret = self._engine.read(nrows)
  File "/home/ales/.envs/orange3/lib/python3.9/site-packages/pandas/io/parsers.py", line 2155, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
    cdef:
  File "pandas/_libs/parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
    else:
  File "pandas/_libs/parsers.pyx", line 941, in pandas._libs.parsers.TextReader._read_rows
    self.parser.line_fields[i] + \
  File "pandas/_libs/parsers.pyx", line 1073, in pandas._libs.parsers.TextReader._convert_column_data
    col_res, na_count = self._convert_with_dtype(
  File "pandas/_libs/parsers.pyx", line 1149, in pandas._libs.parsers.TextReader._convert_tokens
    
ValueError: cannot safely convert passed user dtype of float64 for object dtyped data in column 1

float_precision="round_trip" should work the same as other float parsers.
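Until the parser handles this, one possible workaround is sketched below. It is not part of the original report: it assumes it is acceptable to read the affected column as strings, strip the whitespace in Python, and convert with the built-in float(), which is itself a round-trip-accurate conversion. The DATA string reuses the rows from the example above, and the column name "num" comes from its header.

```python
import io

import pandas as pd

# Same rows as in the report: values carry initial and/or trailing spaces.
DATA = "id\tnum\n 1\t1.2 \n 1\t 2.1\n 2\t 1 \n 2\t 1.2 \n"

# Read the float column as text so the C float parser is bypassed entirely.
df = pd.read_csv(
    io.StringIO(DATA),
    sep="\t",
    header=0,
    skipinitialspace=True,
    dtype={"num": str},
)

# Strip surrounding whitespace, then convert with Python's float().
df["num"] = df["num"].str.strip().map(float).astype("float64")
print(df)
```

This trades parsing speed for robustness, so it is probably only worth doing for columns known to contain padded values.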