[Python-Dev] [Python-checkins] cpython: #15927: Fix cvs.reader parsing of escaped \r\n with quoting off.
Kristján Valur Jónsson kristjan at ccpgames.com
Wed Mar 20 04:16:53 CET 2013
The compiler complains about this line:
    if (c == '\n' | c=='\r') {
Perhaps you wanted the Boolean operator || rather than the bitwise |?
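
For context: in C, == binds more tightly than |, so the flagged condition parses as (c == '\n') | (c == '\r'). The sketch below models both spellings in Python (with explicit int() calls and parentheses, since Python's own precedence differs) to illustrate that they yield the same value for every character. The function names are made up for illustration. The practical difference in C is that | always evaluates both operands, while || short-circuits and states the intent.

```python
def bitwise_form(c):
    # Models C's `c == '\n' | c == '\r'`: each comparison yields 0 or 1,
    # then the two ints are OR-ed bit by bit.
    return int(c == "\n") | int(c == "\r")


def logical_form(c):
    # Models C's `c == '\n' || c == '\r'`: stops at the first true test.
    return int(c == "\n" or c == "\r")


# Compare the two forms over every 8-bit character value.
mismatches = [i for i in range(256)
              if bitwise_form(chr(i)) != logical_form(chr(i))]
print(mismatches)  # prints []
```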
-----Original Message-----
From: Python-checkins [mailto:python-checkins-bounces+kristjan=ccpgames.com at python.org] On Behalf Of r.david.murray
Sent: 19. mars 2013 19:42
To: python-checkins at python.org
Subject: [Python-checkins] cpython: #15927: Fix cvs.reader parsing of escaped \r\n with quoting off.
http://hg.python.org/cpython/rev/940748853712
changeset:   82815:940748853712
parent:      82811:684b75600fa9
user:        R David Murray <rdmurray at bitdance.com>
date:        Tue Mar 19 22:41:47 2013 -0400
summary:
  #15927: Fix cvs.reader parsing of escaped \r\n with quoting off.
This fix means that such values are correctly roundtripped, since cvs.writer already does the correct escaping.
Patch by Michael Johnson.
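
The round trip described in the commit message can be sketched roughly as follows. This mirrors the test added in the changeset; the roundtrip helper name is only illustrative. On an interpreter that includes this change the comparison should come out equal.

```python
import csv
from tempfile import TemporaryFile

rows = [["a\nb", "b"], ["c", "x\r\nd"]]


def roundtrip(rows):
    # newline="" keeps the file layer from translating line endings.
    with TemporaryFile("w+", newline="") as fileobj:
        writer = csv.writer(fileobj, quoting=csv.QUOTE_NONE, escapechar="\\")
        writer.writerows(rows)
        fileobj.seek(0)
        reader = csv.reader(fileobj, quoting=csv.QUOTE_NONE, escapechar="\\")
        return list(reader)


result = roundtrip(rows)
print(result == rows)
```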
files:
  Lib/test/test_csv.py |   9 +++++++++
  Misc/ACKS            |   1 +
  Misc/NEWS            |   3 +++
  Modules/_csv.c       |  13 ++++++++++++-
  4 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/Lib/test/test_csv.py b/Lib/test/test_csv.py
--- a/Lib/test/test_csv.py
+++ b/Lib/test/test_csv.py
@@ -308,6 +308,15 @@
             for i, row in enumerate(csv.reader(fileobj)):
                 self.assertEqual(row, rows[i])
 
+    def test_roundtrip_escaped_unquoted_newlines(self):
+        with TemporaryFile("w+", newline='') as fileobj:
+            writer = csv.writer(fileobj,quoting=csv.QUOTE_NONE,escapechar="\\")
+            rows = [['a\nb','b'],['c','x\r\nd']]
+            writer.writerows(rows)
+            fileobj.seek(0)
+            for i, row in enumerate(csv.reader(fileobj,quoting=csv.QUOTE_NONE,escapechar="\\")):
+                self.assertEqual(row,rows[i])
+
 class TestDialectRegistry(unittest.TestCase):
     def test_registry_badargs(self):
         self.assertRaises(TypeError, csv.list_dialects, None)
diff --git a/Misc/ACKS b/Misc/ACKS
--- a/Misc/ACKS
+++ b/Misc/ACKS
@@ -591,6 +591,7 @@
 Fredrik Johansson
 Gregory K. Johnson
 Kent Johnson
+Michael Johnson
 Simon Johnston
 Matt Joiner
 Thomas Jollans
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -289,6 +289,9 @@
 Library
+- Issue #15927: CVS now correctly parses escaped newlines and carriage
+  when parsing with quoting turned off.
+
 - Issue #17467: add readline and readlines support to mock_open in
   unittest.mock.
diff --git a/Modules/_csv.c b/Modules/_csv.c
--- a/Modules/_csv.c
+++ b/Modules/_csv.c
@@ -51,7 +51,7 @@
 typedef enum {
     START_RECORD, START_FIELD, ESCAPED_CHAR, IN_FIELD,
     IN_QUOTED_FIELD, ESCAPE_IN_QUOTED_FIELD, QUOTE_IN_QUOTED_FIELD,
-    EAT_CRNL
+    EAT_CRNL,AFTER_ESCAPED_CRNL
 } ParserState;
 
 typedef enum {
@@ -644,6 +644,12 @@
         break;
 
     case ESCAPED_CHAR:
+        if (c == '\n' | c=='\r') {
+            if (parse_add_char(self, c) < 0)
+                return -1;
+            self->state = AFTER_ESCAPED_CRNL;
+            break;
+        }
         if (c == '\0')
             c = '\n';
         if (parse_add_char(self, c) < 0)
@@ -651,6 +657,11 @@
         self->state = IN_FIELD;
         break;
 
+    case AFTER_ESCAPED_CRNL:
+        if (c == '\0')
+            break;
+        /*fallthru*/
+
     case IN_FIELD:
         /* in unquoted field */
         if (c == '\n' || c == '\r' || c == '\0') {

--
Repository URL: http://hg.python.org/cpython
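
To make the role of the new AFTER_ESCAPED_CRNL state easier to follow, here is a toy Python model of the unquoted-field path touched by the diff. It is a simplified sketch, not the real _csv.c parser: it merges the start-of-record and start-of-field states into IN_FIELD, ignores quoting entirely, and uses None in place of the '\0' end-of-line marker. The parse_unquoted name is made up for illustration.

```python
EOL = None  # stands in for the '\0' end-of-line marker used in the C code


def parse_unquoted(lines, delimiter=",", escapechar="\\"):
    rows, fields, field = [], [], []
    state = "IN_FIELD"
    for line in lines:
        for c in list(line) + [EOL]:
            if state == "EAT_CRNL":
                # Skip the rest of the terminator; EOL starts a new record.
                if c is EOL:
                    state = "IN_FIELD"
                continue
            if state == "ESCAPED_CHAR":
                if c in ("\n", "\r"):
                    # Escaped line break: keep it and remember we saw it.
                    field.append(c)
                    state = "AFTER_ESCAPED_CRNL"
                else:
                    field.append("\n" if c is EOL else c)
                    state = "IN_FIELD"
                continue
            if state == "AFTER_ESCAPED_CRNL" and c is EOL:
                # The line ended right after an escaped break: the record
                # continues on the next line instead of ending here.
                continue
            # IN_FIELD, or fall through from AFTER_ESCAPED_CRNL.
            if c is EOL or c in ("\n", "\r"):
                fields.append("".join(field))
                rows.append(fields)
                fields, field = [], []
                state = "IN_FIELD" if c is EOL else "EAT_CRNL"
            elif c == escapechar:
                state = "ESCAPED_CHAR"
            elif c == delimiter:
                fields.append("".join(field))
                field = []
                state = "IN_FIELD"
            else:
                field.append(c)
                state = "IN_FIELD"
    return rows


# The escaped output for the two rows used in the changeset's test.
text = "a\\\nb,b\r\nc,x\\\r\\\nd\r\n"
parsed = parse_unquoted(text.splitlines(keepends=True))
print(parsed)
```

Without the extra state, the end-of-line marker that follows an escaped line break would be handled by the IN_FIELD branch and would end the record early, which is the behaviour the changeset sets out to fix.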