[Python-Dev] Decimal <-> float comparisons in py3k. (original) (raw)

Bob Ippolito bob at redivi.com
Tue Mar 23 16:43:11 CET 2010


On Sat, Mar 20, 2010 at 4:38 PM, Mark Dickinson <dickinsm at gmail.com> wrote:

On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum <guido at python.org> wrote:

I propose to reduce all hashes to the hash of a normalized fraction, which we can define as a combination of the hashes for the numerator and the denominator. Then all we have to do is figure fairly efficient ways to convert floats and decimals to normalized fractions (not necessarily Fractions). I may be naive but this seems doable: for a float, the denominator is always a power of 2 and removing factors of 2 from the denominator is easy (just right-shift until the last bit is zero). For Decimal, the unnormalized denominator is always a power of 10, and the normalization is a bit messier, but doesn't seem excessively so. The resulting numerator and denominator may be large numbers, but for typical use of Decimal and float they will rarely be excessively large, and I'm not too worried about slowing things down when they are (everything slows down when you're using really large integers anyway). I am worried about slowing things down for large Decimals:  if you can't put Decimal('1e1234567') into a dict or set without waiting for an hour for the hash computation to complete (because it's busy computing 10**1234567), I consider that a problem. But it's solvable!  I've just put a patch on the bug tracker: http://bugs.python.org/issue8188 It demonstrates how hashes can be implemented efficiently and compatibly for all numeric types, even large Decimals like the above. It needs a little tidying up, but it works.

I was interested in how the implementation worked yesterday, especially given the lack of explanation in the margins of numeric_hash3.patch. numeric_hash4.patch has much better comments, but I didn't see this patch until after I had sufficiently deciphered the previous patch and wrote most of this: http://bob.pythonmac.org/archives/2010/03/23/py3k-unified-numeric-hash/

I'm not really qualified to review the patch, what little formal math training I had has atrophied quite a bit over the years, but as far as I can tell it seems to work. The results also seem to match the Python implementations that I created.

-bob



More information about the Python-Dev mailing list