[Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
Amaury Forgeot d'Arc amauryfa at gmail.com
Wed Mar 6 15:18:30 CET 2013
Hi,
2013/3/6 Matěj Cepl <mcepl at redhat.com>
> On 2013-02-26, 16:25 GMT, Terry Reedy wrote:
> > On 2/21/2013 4:22 PM, Matej Cepl wrote:
> >> as my method to commemorate Aaron Swartz, I have decided to port his
> >> html2text to work fully with the latest python 3.3. After some time
> >> dealing with various bugs, I have now in my repo
> >> https://github.com/mcepl/html2text (branch python3) working solution
> >> which works all the way to python 3.2 (inclusive;
> >> https://travis-ci.org/mcepl/html2text). However, the last problem
> >> remains. This
> >>
> >> Run this command:
> >> ls -l *.html
> >> ?
> >>
> >> should lead to
> >>
> >>   * Run this command:
> >>
> >>         ls -l *.html
> >>
> >>   * ?
> >>
> >> but it doesn’t. It leads to this (with python 3.3 only)
> >>
> >>   * Run this command:
> >>         ls -l *.html
> >>
> >>   * ?
> >>
> >> Does anybody know about something which changed in modules re or
> >> http://docs.python.org/3.3/whatsnew/changelog.html between 3.2 and
> >> 3.3, which could influence this script?
> >
> > Search the changelog or 3.3 misc/News for items affecting those two
> > modules. There are at least 4.
> > http://docs.python.org/3.3/whatsnew/changelog.html
> >
> > It is faintly possible that the switch from narrow/wide builds to
> > unified builds somehow affected that. Have you tested with 2.7/3.2 on
> > both narrow and wide unicode builds?
>
> So, in the end, I went the long way and bisected cpython to find the
> commit which broke my tests, and it seems that the culprit is
> http://hg.python.org/cpython/rev/123f2dc08b3e so it is clearly
> something Unicode related. Unfortunately, it really doesn't tell me
> what exactly is broken (is it a known regression?) and whether there
> is a known workaround. Could anybody suggest a way to find bugs on
> http://bugs.python.org related to some particular commit? (A plain
> search for 123f2dc0 didn’t find anything.)
I strongly suspect an incorrect usage of the "is" operator:
https://github.com/mcepl/html2text/blob/master/html2text.py#L95
Identity of strings is not guaranteed: two equal strings need not be the same object.
Does anything change if you use "==" instead?
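To illustrate the difference, here is a minimal sketch. It is not taken from html2text.py; the function and variable names are hypothetical and only stand in for a comparison of a parsed string against a constant. The result of "is" depends on whether the interpreter happens to reuse the same string object, which can differ between versions and builds, while "==" compares the characters.

```python
def matches_by_identity(value, expected):
    # Fragile: true only if both names refer to the very same object.
    return value is expected


def matches_by_equality(value, expected):
    # Robust: compares the contents, whichever objects hold them.
    return value == expected


expected = "li"
# Build an equal string at run time, much as a parser would when it
# assembles a tag name from its input. (Hypothetical stand-in data.)
value = "".join(["l", "i"])

equal_by_value = matches_by_equality(value, expected)
same_object = matches_by_identity(value, expected)  # implementation-dependent

print("== gives", equal_by_value)
print("is gives", same_object)
```

The first result is always True. The second should not be relied on either way, which is why a comparison written with "is" can appear to work on one Python version and break on another.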
-- 
Amaury Forgeot d'Arc