[Python-Dev] license issues with profiler.py and md5.h/md5c.c (original) (raw)
Donovan Baarda abo at minkirri.apana.org.au
Fri Feb 18 05:09:51 CET 2005
- Previous message: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c
- Next message: [Python-Dev] license issues with profiler.py and md5.h/md5c.c
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, 2005-02-16 at 22:53 -0800, Gregory P. Smith wrote:
fyi - i've updated the python sha1/md5 openssl patch. it now replaces the entire sha and md5 modules with a generic hashes module that gives access to all of the hash algorithms supported by OpenSSL (including appropriate legacy interface wrappers and falling back to the old code when compiled without openssl).
https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&groupid=5470&atid=305470 I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind.
I just had a quick look, and have these comments (psedo patch review?). Apologies for the noise on the list...
DESCRIPTION
This patch keeps the current md5c.c, md5module.c files and adds the following; _hashopenssl.c, hashes.py, md5.py, sha.py.
The old md5 and sha extension modules get replaced by hashes.py, md5.py, and sha.py python modules that leverage off _hash (openssl) or _md5 and _sha (no openssl) extension modules.
The new _hash extension module "wraps" the high level openssl EVP interface, which uses a string parameter to indicate what type of message digest algorithm to use. The advantage of this is it makes all openssl supported digests available, and if openssl adds more, we get them for free. A disadvantage of this is it is an abstraction level above the actual md5 and sha implementations, and this may add overheads. These overheads are probably negligible compared to the actual implementation speedups.
The new _md5 and _sha extension modules are simply re-named versions of the old md5 and sha modules.
The hashes.py module acts as an import wrapper for _hash, and falls back to using _md5 and _sha modules if _hash is not available. It provides an EVP style API (string hash name parameter), that supports only md5 and sha hashes if openssl is not available.
The new md5.py and sha.py modules simply use hash.py.
COMMENTS
The introduction of a "hashes" module with a new API that supports many different digests (provided openssl is available) is extending Python, not just "fixing the licenses" of md5 and sha modules.
If all we wanted to do was fix the md5 module, a simpler solution would be to change the md5c.c API to match openssl's implementation, and make md5module.c use it, conditionally compiling against md5c.c or linking against openssl in setup.py. A similar approach could be used for sha, but would require stripping the sha implementation out of shamodule.c
I am mildly of concerned about the namespace/filespace clutter introduced by this implementation... it feels unnecessary, as does the tangled dependencies between them. With openssl, hashes.py duplicates the functionality of _hash. Without openssl, md5.py and sha.py duplicate _md5 and _sha, via a roundabout route through hash.py.
The python wrappers seem overly complicated, with things like
def new(name, string=None): if string: return _hash.new(name) else: return _hash.new.(name,string)
being common where the following would suffice;
def new(name,string=""): return _hash.new(name,string)
I think this is because _hash.new() uses an optional string parameter, but I have a feeling a C update with a zero length string is faster than this Python if. If it was a concern, the C implementation could check the value of the string length before calling update.
Given the convenience methods for different hashes in hashes.py (which incidentally look like they are only available when _hash is not available... something else that needs fixing), the md5.py module could be simply coded as;
from hashes import md5 new = md5
Despite all these nit-picks, it looks pretty good. It is orders of magnitude better than any of the other non-existent solutions, including the one I didn't code :-)
-- Donovan Baarda <abo at minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/
- Previous message: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c
- Next message: [Python-Dev] license issues with profiler.py and md5.h/md5c.c
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]