SUB-character in a csv causes read_csv() with C-Engine to detect EOF · Issue #16893 · pandas-dev/pandas
Problem description
If a string in a csv contains a SUB character (ASCII 0x1A, Ctrl-Z), read_csv() with the default C engine raises
ParserError: Error tokenizing data. C error: EOF inside string starting at line 0
The Python engine reads the same file fine.
It seems I can't include example data containing a SUB character in this post, so I pasted an example line on pastebin instead:
https://pastebin.com/x6QPY4Hf
Just paste the line into a csv file and try to read it with read_csv().
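For anyone who cannot fetch the pastebin line, the following is a minimal reproduction sketch. The sample row, the file name sub_char.csv, and the helper names are hypothetical stand-ins, not the original data. It writes a quoted field containing a SUB character, parses it with the standard-library csv module as a baseline, and then tries both pandas engines if pandas is installed. The result for the C engine may differ by pandas version and platform, so the script only reports what it sees.

```python
import csv
import os
import tempfile

SUB = "\x1a"  # ASCII 26, the SUB (Ctrl-Z) control character


def write_sample_csv(path):
    """Write a two-column CSV whose quoted field contains a SUB character."""
    # Hypothetical sample row; this is not the line from the pastebin link.
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write("id,text\n")
        f.write('1,"before' + SUB + 'after"\n')


def read_with_stdlib(path):
    """Parse the file with the standard-library csv module as a baseline."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


def try_pandas_engines(path):
    """Try both read_csv engines and record what each one does."""
    try:
        import pandas as pd
    except ImportError:
        return {}  # pandas not installed; nothing to compare
    results = {}
    for engine in ("c", "python"):
        try:
            df = pd.read_csv(path, engine=engine)
            results[engine] = "ok, shape={}".format(df.shape)
        except Exception as exc:  # e.g. pandas.errors.ParserError
            results[engine] = "{}: {}".format(type(exc).__name__, exc)
    return results


tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sub_char.csv")
write_sample_csv(sample_path)

rows = read_with_stdlib(sample_path)
print(rows)
print(try_pandas_engines(sample_path))
```

If the C engine entry shows a ParserError while the Python engine entry shows "ok", the behaviour described above is reproduced on that setup.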
I don't know whether this behaviour is expected, since this character is indeed used as an EOF marker in certain cases. However, I see little sense in interpreting a SUB character as EOF in the middle of a csv file.
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
pandas: 0.20.2