Message 152732 - Python tracker (original) (raw)

It is an issue. Not only in terms of startup time, but also because randomization per default makes Python behave in non-deterministc ways - which is not what you want from a programming language or interpreter (unless you explicitly tell it to behave like that).

That's debatable. For example id() is fairly unpredictable accross runs (except for statically-allocated instances).

I think it would be much better to just let the user define a hash seed using environment variables for Python to use and then forget about how this variable value is determined. If it's not set, Python uses 0 as seed, thereby disabling the seeding logic.

This approach would have Python behave in a deterministic way per default and still allow users who wish to use a different seed, set this to a different value - even on a case by case basis.

If you absolutely want to add a feature to have the seed set randomly, you could make a seed value of -1 trigger the use of a random number source as seed.

Having both may indeed be a good idea.

I also still firmly believe that the collision counting scheme should be made available via an environment variable as well. The user could then set the variable to e.g. 1000 to have it enabled with limit 1000, or leave it undefined to disable the collision counting.

With those two tools, users could then choose the method they find most attractive for their purposes.

It's not about being attractive, it's about fixing the security problem. The simple collision counting approach leaves a gaping hole open, as demonstrated by Frank.