Issue 3745: _sha256 et al. encode to UTF-8 by default

Created on 2008-09-01 09:27 by hagen, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (23)
msg72220 - (view) Author: Hagen Fürstenau (hagen) Date: 2008-09-01 09:27
Whereas openssl-based _hashlib refuses to accept unencoded strings:

>>> _hashlib.openssl_sha256("\xff")
Traceback (most recent call last):
  File "", line 1, in
TypeError: object supporting the buffer API required

the _sha256 version encodes to UTF-8 by default:

>>> _sha256.sha256("\xff").digest() == _sha256.sha256("\xff".encode("utf-8")).digest()
True

I think refusing is better, but at least the behaviour should be consistent. Same for the other algorithms in hashlib.
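For comparison, a minimal sketch of the "refuse unencoded strings" behaviour as seen through the public hashlib API. It assumes a current Python 3 interpreter, where str input raises TypeError; the variable names are illustrative only and not taken from the report.

```python
import hashlib

data = "\xff"

# Assumption: Python 3 semantics, where hash constructors refuse str input.
try:
    hashlib.sha256(data)
    rejected = False
except TypeError:
    rejected = True

# The caller has to choose an encoding, and the choice changes the digest:
# "\xff" is two bytes in UTF-8 but a single byte in Latin-1.
utf8_digest = hashlib.sha256(data.encode("utf-8")).hexdigest()
latin1_digest = hashlib.sha256(data.encode("latin-1")).hexdigest()

print("str rejected:", rejected)
print("utf-8  :", utf8_digest)
print("latin-1:", latin1_digest)
```

The differing digests show why an implicit encoding is risky: the hash silently depends on a choice the caller never made.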
msg73550 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-09-22 01:01
agreed. most platforms should be using the openssl version, i will update the non-openssl implementations to behave the same. I don't think this is worth being a release blocker. I'll do it for 3.0.1.
msg79112 - (view) Author: Hagen Fürstenau (hagen) Date: 2009-01-05 08:48
Seems that this problem is being taken care of in issue #4751.
msg79115 - (view) Author: Lukas Lueg (ebfe) Date: 2009-01-05 09:47
solved in #4818 and #4821
msg81726 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-02-12 07:36
Fixed in py3k branch r69524. needs porting to release30-maint. possibly also release26-maint and trunk.
msg81738 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-12 11:18
I don't think backporting to 2.6 is fine, people may be relying on the current behaviour. As for 3.0.1, you'd better be quick, it's scheduled for tomorrow.
msg81739 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-12 11:22
Wooops, my mouse clicked on Remove!? I removed Message73550, sorry gregory. Here was the content of the message: --- agreed. most platforms should be using the openssl version, i will update the non-openssl implementations to behave the same. I don't think this is worth being a release blocker. I'll do it for 3.0.1. ---
msg81740 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-12 11:25
I agree with pitrou: leave python 2.6 unchanged, but please backport to 3.0.1 ;-)
msg81741 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-12 11:32
gpolo gave me the solution to restore a deleted message: http://bugs.python.org/issueXXXX?@action=edit&@add@messages=MSGNUM
msg81820 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-02-12 21:15
fixed in release30-maint r69555. sounds like it's out of the question for 2.6. i will backport it to trunk.
msg81858 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-02-13 03:01
fixed in trunk r69561.
msg96431 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-12-15 11:22
Gregory, this patch should not have been backported to Python 2.7. See issue Could you please revert the change on trunk ? Thanks. A much better solution would be to issue a -3 warning in case a Unicode object is passed to the hash functions. However, this is major work to get right, since the "s#" parser marker also accepts buffer interfaces.
msg96501 - (view) Author: Roumen Petrov (rpetrov) * Date: 2009-12-16 23:41
What about inconsistent module build - as is reported some platform build sha256 module that support unicode but most it is not build if openssl is version 0.8+. Same for sha512 module. If unicode for hashlib is not acceptable for trunk than why is not build always sha{256|512} without to check for openssl version number ?
msg96934 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-12-28 02:13
lemburg - see which issue #? Anyways perhaps the right thing to do instead of trunk r65961 would have been to change the s# to an s*. Undoing it will be more painful now as several changes have gone in since that require undoing and possibly redoing differently.
msg96935 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-12-28 02:25
rpetrov - I couldn't really understand your message so I'm not sure if I'm answering the right things: yes both the openssl and non-openssl modules need to behave identically. the reason openssl is used when possible is that its optimized hash functions are several times faster than the plain C versions in the individual modules.
msg96936 - (view) Author: Karen Tracey (kmtracey) Date: 2009-12-28 03:04
I think the missing issue reference is to this thread on python-dev: http://mail.python.org/pipermail/python-dev/2009-December/094574.html
msg96991 - (view) Author: Roumen Petrov (rpetrov) * Date: 2009-12-29 10:35
gregory - refer to setup.py logic to build modules
msg96995 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-12-29 13:37
Gregory P. Smith wrote:
> Gregory P. Smith <greg@krypto.org> added the comment:
>
> lemburg - see which issue #?

Sorry, the message got truncated for some reason. I was referring to http://bugs.python.org/issue3745

This was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2009-December/094593.html

> Anyways perhaps the right thing to do instead of trunk r65961 would have
> been to change the s# to an s*.

That would have worked as well.

> Undoing it will be more painful now as several changes have gone in since
> that require undoing and possibly redoing differently.

Using s* should pretty much avoid the need to use GET_BUFFER_VIEW_OR_ERROUT(), so if you want to keep the other changes, removing the use of the macro should be fairly straight-forward, unless I'm missing something.
msg97151 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-01-02 22:33
trunk r77252 switches python 2.7 to use 's*' for argument parsing. unicodes can be hashed (encoded to the system default encoding by s*) again. This change has been blocked from being merged into py3k unless someone decides we actually want this magic unicode encoding behavior to exist there as well. setup.py has also been updated to compile all versions of the hash algorithm modules when Py_DEBUG is defined. I'll update tests run on all implementations next so that it is easier for developers to maintain identical behavior across all implementations without needing to explicitly remember to reconfigure their setup and test those.
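The idea of running the same checks against every implementation could be sketched as follows for a Python 3 interpreter. This is an illustration under assumptions, not the test that was committed: the helper names `available_sha256_constructors` and `check_consistency` are hypothetical, and the private module names (`_hashlib`, `_sha256`, `_sha2`) differ between CPython versions and builds, so each is imported defensively.

```python
import hashlib
import importlib


def available_sha256_constructors():
    """Collect every sha256 constructor this interpreter exposes."""
    constructors = {"hashlib.sha256": hashlib.sha256}
    # Assumption: these private modules may or may not exist in a given build.
    candidates = [
        ("_hashlib", "openssl_sha256"),
        ("_sha256", "sha256"),
        ("_sha2", "sha256"),
    ]
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        constructor = getattr(module, attr, None)
        if constructor is not None:
            constructors[f"{module_name}.{attr}"] = constructor
    return constructors


def check_consistency(payload=b"abc"):
    """Return per-implementation digests and whether each one rejects str."""
    constructors = available_sha256_constructors()
    digests = {}
    rejects_str = {}
    for name, constructor in constructors.items():
        digests[name] = constructor(payload).hexdigest()
        try:
            constructor("abc")
            rejects_str[name] = False
        except TypeError:
            rejects_str[name] = True
    return digests, rejects_str


if __name__ == "__main__":
    digests, rejects_str = check_consistency()
    for name in digests:
        print(name, digests[name][:16], "rejects str:", rejects_str[name])
```

Collecting the constructors in one place means a behavioural difference shows up as a mismatch in a single run, instead of requiring a rebuild with a different configuration.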
msg97152 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-01-02 22:38
In order to get a -3 PyErr_WarnPy3k warning for unicode being passed to hashlib objects (a nice idea) I suggest creating an additional 's*' like thing ('s3' perhaps?) in Python/getargs.c for that purpose rather than modifying all of the hashlib modules to accept an O, type check it and warn, and then re-parse it as an s* (that'd be a lot of tedious code duplication).
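The "type check, warn, then parse" sequence can be illustrated in pure Python. This is an analogy only, assuming Python 3: the actual proposal concerns a C-level format code in Python/getargs.c, `sha256_with_warning` is a hypothetical helper, and DeprecationWarning stands in for the -3 warning.

```python
import hashlib
import sys
import warnings


def sha256_with_warning(data):
    """Hypothetical helper: hash bytes directly; warn about and encode str."""
    if isinstance(data, str):
        warnings.warn(
            "passing str to a hash function; encode it explicitly",
            DeprecationWarning,
            stacklevel=2,
        )
        # Mirrors the implicit "default encoding" step discussed above.
        data = data.encode(sys.getdefaultencoding())
    return hashlib.sha256(data)


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        digest = sha256_with_warning("abc").hexdigest()
    print(digest, "warnings issued:", len(caught))
```

Putting the check in one shared helper, as the message suggests doing with a single format code, avoids repeating the same warn-and-encode logic in every hash module.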
msg97153 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-01-03 00:48
I believe everything in here has been addressed. Please open new issues with details for anything that isn't quite right.
msg97202 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-01-04 11:43
Gregory P. Smith wrote:
> Gregory P. Smith <greg@krypto.org> added the comment:
>
> trunk r77252 switches python 2.7 to use 's*' for argument parsing. unicodes can be hashed (encoded to the system default encoding by s*) again.
>
> This change has been blocked from being merged into py3k unless someone decides we actually want this magic unicode encoding behavior to exist there as well.

Thanks for updating the implementation.
msg97203 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-01-04 11:49
Gregory P. Smith wrote:
> Gregory P. Smith <greg@krypto.org> added the comment:
>
> In order to get a -3 PyErr_WarnPy3k warning for unicode being passed to hashlib objects (a nice idea) I suggest creating an additional 's*' like thing ('s3' perhaps?) in Python/getargs.c for that purpose rather than modifying all of the hashlib modules to accept an O, type check it and warn, and then re-parse it as an s* (that'd be a lot of tedious code duplication).

Good idea. We're likely going to need this in more places, so I'm +1 on adding an "s3" parser marker.
History
Date User Action Args
2022-04-11 14:56:38 admin set github: 47995
2010-01-04 11:49:27 lemburg set messages: +
2010-01-04 11:43:58 lemburg set messages: +
2010-01-03 00:48:11 gregory.p.smith set status: open -> closed; resolution: fixed; messages: +
2010-01-02 22:38:53 gregory.p.smith set messages: +
2010-01-02 22:33:31 gregory.p.smith set messages: +
2009-12-29 13:37:13 lemburg set messages: +
2009-12-29 10:35:09 rpetrov set messages: +
2009-12-28 03:04:50 kmtracey set messages: +
2009-12-28 02:25:52 gregory.p.smith set messages: +
2009-12-28 02:13:53 gregory.p.smith set messages: +; versions: - Python 3.0, Python 3.1
2009-12-16 23:41:39 rpetrov set nosy: + rpetrov; messages: +
2009-12-15 14:44:52 kmtracey set nosy: + kmtracey
2009-12-15 11:22:55 lemburg set status: closed -> open; nosy: + lemburg; messages: +; resolution: fixed -> (no value)
2009-02-13 03:01:29 gregory.p.smith set status: open -> closed; messages: +; resolution: fixed; components: + Extension Modules, - Library (Lib); versions: + Python 3.0, Python 3.1
2009-02-12 21:16:00 gregory.p.smith set keywords: + 26backport; messages: +; versions: - Python 2.6, Python 3.0
2009-02-12 11:32:34 vstinner set messages: +
2009-02-12 11:31:46 vstinner set messages: +
2009-02-12 11:25:24 vstinner set messages: +
2009-02-12 11:22:04 vstinner set nosy: + vstinner; messages: +
2009-02-12 11:21:05 vstinner set messages: -
2009-02-12 11:18:03 pitrou set nosy: + pitrou; messages: +
2009-02-12 07:36:56 gregory.p.smith set priority: normal; messages: +; versions: + Python 2.6, Python 2.7
2009-01-05 09:47:36 ebfe set nosy: + ebfe; messages: +
2009-01-05 08:48:17 hagen set messages: +
2008-09-22 01:01:44 gregory.p.smith set assignee: gregory.p.smith; messages: +; nosy: + gregory.p.smith
2008-09-01 09:27:05 hagen create