[Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
Xavier Morel catch-all at masklinn.net
Thu Mar 7 11:31:03 CET 2013
- Previous message: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
- Next message: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 2013-03-07, at 11:08 , Matej Cepl wrote:
> On 2013-03-06, 18:34 GMT, Victor Stinner wrote:
>> In short, Unicode was rewritten in Python 3.3 for the PEP 393. It's not
>> surprising that minor details like singleton differ. You should not use
>> "is" to compare strings in Python, or your program will fail on other
>> Python implementations (like PyPy, IronPython, or Jython) or even on a
>> different CPython version.
>
> I am sorry, I don't understand what you are saying. Even though
> this has been changed to
> https://github.com/mcepl/html2text/blob/fixtests/html2text.py#L90
> the tests still fail.
>
> But, Amaury is right: the function doesn't make much sense.
> However, ... when I have “fixed it” from
> https://github.com/mcepl/html2text/blob/master/html2text.py#L95
>
>     def onlywhite(line):
>         """Return true if the line does only consist of whitespace characters."""
>         for c in line:
>             if c is not ' ' and c is not '  ':
>                 return c is ' '
>         return line
>
> to
> https://github.com/mcepl/html2text/blob/fixtests/html2text.py#L90
>
>     def onlywhite(line):
>         """Return true if the line does only consist of whitespace characters."""
>         for c in line:
>             if c != ' ' and c != '  ':
>                 return c == ' '
>         return line
The second test looks like some kind of corruption: it's supposedly iterating over the characters of a line, yet it tests for two spaces? Is it possible that the original was a literal tab embedded in the source code (instead of '\t') and that it got broken at some point?
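To illustrate the point about the second test: iterating over a str yields one-character strings, so comparing each one against a two-character literal can never match. A minimal sketch follows; both helper names are hypothetical, and the tab in `is_space_or_tab` is only a guess at the original intent:

```python
def second_test_can_match(line):
    """Return True if any single character of line equals the two-space literal."""
    # Each c has length 1, so it can never equal a string of length 2.
    return any(c == '  ' for c in line)


def is_space_or_tab(c):
    """Assumed intent of the original check: accept a space or a tab."""
    return c == ' ' or c == '\t'


print(second_test_can_match('      '))  # False, even for a line made of spaces
print(is_space_or_tab('\t'))            # True
```

If the tab guess is right, the two comparisons in the loop reduce to `is_space_or_tab(c)`.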
According to its name and docstring, the implementation of this function
should really be replaced by `return line and line.isspace()` (the first
part handles the case of an empty line: in the current implementation the
line is returned directly if no non-space character is found, which is
falsy for an empty line, and ''.isspace() -> False). Does that fix the
failing tests?
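A minimal sketch of that suggestion, keeping the `onlywhite` name from html2text. Note that `str.isspace()` accepts any whitespace character (tabs and newlines included), which is broader than the space-only test in the current code, so this may or may not match what the tests expect:

```python
def onlywhite(line):
    """Return a true value if line consists only of whitespace characters."""
    # `line and ...` short-circuits on the empty string, which is falsy,
    # so an empty line yields '' rather than a bool.
    return line and line.isspace()


print(bool(onlywhite('   ')))   # True
print(bool(onlywhite(' a ')))   # False
print(bool(onlywhite('')))      # False
```

The return value is only meant to be used in a boolean context, as in the original function, which also returned the line itself rather than True.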