[Python-Dev] Status of the fix for the hash collision vulnerability (original) (raw)

Nick Coghlan ncoghlan at gmail.com
Sun Jan 15 06:11:44 CET 2012

Previous message: [Python-Dev] Status of the fix for the hash collision vulnerability
Next message: [Python-Dev] Status of the fix for the hash collision vulnerability
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Sun, Jan 15, 2012 at 2:42 PM, Steven D'Aprano <steve at pearwood.info> wrote:

Victor Stinner wrote:

- Marc Andre Lemburg proposes to fix the vulnerability directly in dict (for any key type). The patch raises an exception if a lookup causes more than 1000 collisions.

Am I missing something? How does this fix the vulnerability? It seems to me that the only thing this does is turn one sort of DOS attack into another sort of DOS attack: hostile users will just cause hash collisions until an exception is raised and the application falls over. Catching these exceptions, and recovering from them (how?), would be the responsibility of the application author. Given that developers are unlikely to ever see 1000 collisions by accident, or even realise that it could happen, I don't expect that many people will do that -- until they personally get bitten.

As I understand it, the way the attack works is that a single malicious request from the attacker can DoS the server by eating CPU resources while evaluating a massive collision chain induced in a dict by attacker supplied data. Explicitly truncating the collision chain boots them out almost immediately (likely with a 500 response for an internal server error), so they no longer affect other events, threads and processes on the same machine.

In some ways, the idea is analogous to the way we implement explicit recursion limiting in an attempt to avoid actually blowing the C stack

we take a hard-to-detect-and-hard-to-handle situation (i.e. blowing the C stack or malicious generation of long collision chains in a dict) and replace it with something that is easy to detect and can be handled by normal exception processing (i.e. a recursion depth exception or one reporting an excessive number of slot collisions in a dict lookup).

That then makes the default dict implementation safe from this kind of attack by default, and use cases that are getting that many collisions legitimately can be handled in one of two ways:

switch to a more appropriate data type (if you're getting that many collisions with benign data, a dict is probably the wrong container to be using)
offer a mechanism (command line switch or environment variable) to turn the collision limiting off

Now, where you can still potentially run into problems is if a single shared dict is used to store both benign and malicious data - if the malicious data makes it into the destination dict before the exception finally gets triggered, and then benign data also happens to trigger the same collision chain, then yes, the entire app may fall over. However, such an app would have been crippled by the original DoS anyway, since its performance would have been gutted - the collision chain limiting just means it will trigger exceptions for the cases that would been insanely slow.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

Previous message: [Python-Dev] Status of the fix for the hash collision vulnerability
Next message: [Python-Dev] Status of the fix for the hash collision vulnerability
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the Python-Dev mailing list