Issue 2280: parser module chokes on unusual characters

Created on 2008-03-12 18:33 by dbinger, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
parsermodule.patch | dbinger, 2008-03-12 18:33 | patch with unit test and proposed change
Messages (3)
msg63482 - Author: David Binger (dbinger) Date: 2008-03-12 18:33
This is with the current revision of py3k: 61353. parser.suite('"\u1234"') fails with a TypeError. Changing the argument format from "s" to "s#" works around this problem, and I added a unit test for it.

With the "s#" change in place, the same test exposes a second bug: a string literal containing \u1234 is mangled by sequence2st(). The last section of the patch seems to correct this second bug.

(I think getargs.c's handling of "s" has a problem with a unicode string containing a character whose encoding is longer than 1 byte. Its test for null bytes at the end does not work correctly.)
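As a minimal sketch of the size mismatch the parenthetical points at, the Python snippet below compares the character count and the UTF-8 byte count of the literal from the report. It only illustrates why a length check that confuses the two could misfire. It does not reproduce the actual logic in getargs.c, and the helper name has_embedded_null is hypothetical.

```python
# The source string from the report: a quote, U+1234, a quote.
source = '"\u1234"'

# UTF-8 is the encoding assumed here for the C-level buffer.
encoded = source.encode("utf-8")

char_count = len(source)   # length in code points
byte_count = len(encoded)  # length in UTF-8 bytes


def has_embedded_null(data: bytes) -> bool:
    """Hypothetical stand-in for a null-byte check: a C string must not
    contain a zero byte before its end."""
    return b"\x00" in data


print(char_count, byte_count, has_embedded_null(encoded))
```

The two lengths differ because U+1234 needs three bytes in UTF-8, so any check that compares a C string length in bytes against a length in characters would reject this input even though it contains no null byte.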
msg69564 - Author: Kuba Fast (kfast) Date: 2008-07-11 21:35
I get no problem in 3.0b1. Should this be closed?
>>> parser.suite('"\u1234"')
<parser.st object at 0xb7ceebd0>
msg69578 - Author: David Binger (dbinger) Date: 2008-07-12 03:07
On Jul 11, 2008, at 5:35 PM, Kuba Fast wrote:
> I get no problem in 3.0b1. Should this be closed?
I think so. It looks like this has been fixed. Thanks.
History
Date | User | Action | Args
2022-04-11 14:56:31 | admin | set | github: 46533
2008-07-12 18:38:47 | benjamin.peterson | set | status: open -> closed; resolution: out of date
2008-07-12 03:07:55 | dbinger | set | messages: +
2008-07-11 21:35:49 | kfast | set | nosy: + kfast; messages: +
2008-03-12 18:33:04 | dbinger | create |