[Python-Dev] Make str/bytes hash algorithm pluggable?

Christian Heimes christian at python.org
Thu Oct 3 20:42:28 CEST 2013


Hi,

some of you may have seen that I'm working on a PEP for a new hash API and new algorithms for hashing of bytes and str. The PEP has three major aspects. It introduces DJB's SipHash as a secure hash algorithm, changes the hash API to process blocks of data instead of characters, and adds an API to make the algorithm pluggable. A draft is already available [1].

I have now received some negative feedback on the 'pluggable' aspect of the new PEP on Twitter [2]. I'd like to get feedback from you before I finalize the PEP.

The PEP proposes a pluggable hash API for a couple of reasons. I'd like to give users of Python a chance to replace a secure hash algorithm with a faster hash algorithm. SipHash is about as fast as FNV for common cases, because our implementation of FNV processes only 8 to 32 bits per cycle instead of 32 or 64. I haven't actually benchmarked how a faster hash algorithm affects a real program, though ...
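To illustrate the bits-per-cycle point, here is a minimal sketch of a byte-at-a-time FNV loop. It assumes the textbook 64-bit FNV-1a constants and is not CPython's own modified FNV variant; the function name fnv1a_64 is made up for this example. Each multiplication consumes just one byte (8 bits) of input, which is why a hash that reads whole 32- or 64-bit blocks can keep up despite doing more work per step.

```python
# Textbook 64-bit FNV-1a constants (not CPython's modified variant).
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Hash `data` one byte per loop iteration (illustrative name)."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte                        # mix in 8 bits of input
        h = (h * FNV64_PRIME) & MASK64   # one multiply per byte consumed
    return h


print(hex(fnv1a_64(b"a")))  # 0xaf63dc4c8601ec8c
```

The loop runs len(data) times, so an 8-byte input costs eight multiplications, whereas a block-oriented algorithm would read the same input as a single 64-bit word.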

I'd also like to make it easier to replace the hash algorithm with a different one in case a vulnerability is found. With the new API, vendors and embedders have an easy and clean way to use their own hash implementation or an optimized version that is more suitable for their platform, too. For example, a mobile phone vendor could provide an optimized implementation with ARM NEON intrinsics.

At which level should Python support a pluggable hash algorithm?

  1. Compile time option: The hash code is compiled into Python's core. Embedders have to recompile Python with different options to replace the function.

  2. Library option: A hash algorithm can be added and one available hash algorithm can be set before Py_Initialize() is called for the first time. This approach gives embedders the chance to set their own algorithm without recompiling Python.

  3. Startup options: Like 2) plus an additional environment variable and command line argument to select an algorithm. With a startup option users can select a different algorithm themselves.
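
The three levels above can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions, not the API from the PEP draft: the class HashRegistry, its methods, the environment variable name EXAMPLE_HASH_ALGORITHM and the two placeholder hash functions are all hypothetical. The default passed to the constructor stands in for the compiled-in algorithm (option 1), add() and select() model the library option (2), and initialize() reads the environment as in the startup option (3).

```python
class HashRegistry:
    """Toy model of a pluggable hash setup (every name is hypothetical)."""

    ENV_VAR = "EXAMPLE_HASH_ALGORITHM"  # made-up variable name

    def __init__(self, default_name, default_func):
        # Option 1: the default is fixed when the "interpreter" is built.
        self._funcs = {default_name: default_func}
        self._active = default_name
        self._initialized = False

    def _require_uninitialized(self):
        if self._initialized:
            raise RuntimeError("hash algorithm is fixed after initialization")

    def add(self, name, func):
        # Option 2: embedders register their own implementation.
        self._require_uninitialized()
        self._funcs[name] = func

    def select(self, name):
        # Option 2: pick one registered algorithm before start-up.
        self._require_uninitialized()
        if name not in self._funcs:
            raise KeyError(name)
        self._active = name

    def initialize(self, environ):
        # Option 3: the user may override the choice via the environment.
        choice = environ.get(self.ENV_VAR)
        if choice:
            self.select(choice)
        self._initialized = True

    def hash(self, data):
        return self._funcs[self._active](data)


# Placeholder functions, NOT real hash algorithms.
def toy_default(data):
    return sum(data) & 0xFFFFFFFF


def toy_fast(data):
    return len(data)


registry = HashRegistry("default", toy_default)
registry.add("fast", toy_fast)
registry.initialize({"EXAMPLE_HASH_ALGORITHM": "fast"})
print(registry.hash(b"abc"))  # 3
```

Once initialize() has run, the selection is frozen, which mirrors the requirement that the algorithm be chosen before Py_Initialize() is called: already computed hash values would become inconsistent if the function changed afterwards.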

Christian

[1] http://www.python.org/dev/peps/pep-0456/
[2] https://twitter.com/EDEADLK/status/385572395777818624


