[Python-Dev] Expert floats

Tim Peters tim.one at comcast.net
Tue Mar 30 20:43:56 EST 2004


[Ping]

> All right. Maybe we can make some progress.

Probably not -- we have indeed been thru all of this before.

> I agree that round-to-12 was a real problem. But i think we are talking about two different use cases: compiling to disk and displaying on screen.

Those are two of the use cases, yes.

> I think we can satisfy both desires.

> If i understand you right, your primary aim is to make sure the marshalled form of any floating-point number yields the closest possible binary approximation to the machine value on the original platform,

I want marshaling of fp numbers to give exact (not approximate) round-trip equality on a single box, and across all boxes supporting the 754 standard where C maps "double" to a 754 double. I also want marshaling to preserve as much accuracy as possible across boxes with different fp formats, although that may not be practical. Strings have nothing to do with that, except for the historical accident that marshal happens to use decimal strings. Changing repr(float) to produce 17 digits went a very long way toward achieving all that at the time, with minimal code changes. The consequences of that change I really like didn't become apparent for years.

> even when that representation is used on a different platform. (Is that correct? Perhaps it's best if you clarify -- exactly what is the invariant you want to maintain, and what changes [in platform or otherwise] do you want the representation to withstand?)

As above. Beyond exact equality across suitable 754 boxes, we'd have to agree on a parameterized model of fp, and explain "as much accuracy as possible" in terms of that. But you don't care, so I won't bother.

> That doesn't have to be part of repr()'s contract. (In fact, i would argue that already repr() makes no such promise.)

It doesn't, but the docs do say:

    If at all possible, this [repr's result] should look like a valid
    Python expression that could be used to recreate an object with the
    same value (given an appropriate environment).

This is possible for repr(float), and is currently true for repr(float) (on 754-conforming boxes).

> repr() is about providing a representation for humans.

I think the docs are quite clear that this function belongs to str():

    .... the ``informal'' string representation of an object.  This
    differs from __repr__() in that it does not have to be a valid
    Python expression: a more convenient or concise representation
    may be used instead. The return value must be a string object.

> Can we agree on maximal precision for marshalling,

I don't want to use strings at all for marshalling. So long as we do, 17 is already correct for that purpose (< 17 doesn't preserve equality, > 17 can't be relied on across 754-conforming C libraries).
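The 16-versus-17 digit point can be checked directly. What follows is a minimal sketch, not part of the original message: the sample value 0.1 + 0.2 is picked only for illustration, and the code assumes Python 3 on a box where float is a 754 double.

```python
# Illustration only: compare 16- and 17-significant-digit strings for one
# sample double. Assumes Python floats are IEEE 754 doubles.
x = 0.1 + 0.2

s16 = '%.16g' % x   # 16 significant digits
s17 = '%.17g' % x   # 17 significant digits

print(s16, float(s16) == x)   # 0.3 False
print(s17, float(s17) == x)   # 0.30000000000000004 True
```

For this value the 16-digit string collapses to '0.3', which reads back as a neighbouring double, while the 17-digit string reads back as the original.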

> and shortest-accurate precision for repr, so we can both be happy?

As I said before (again and again and again), I'm the one who has fielded most newbie questions about fp since Python's beginning, and I'm very happy with the results of changing repr() to produce 17 digits. They get a little shock at the start now, but potentially save themselves from catastrophe by being forced to grow some necessary caution about fp results early.

So, no, we're not going to agree on this. My answer for newbies who don't know and don't care (and who are determined never to know or care) has always been to move to a Decimal module. That's less surprising than binary fp in several ways, and 2.4 will have it.

> (By shortest-accurate i mean "the shortest representation that converts to the same machine number". I believe this is exactly what Andrew described as Scheme's method.

Yes, except that Scheme also requires that this string be correctly rounded to however many digits are produced. A string s just satisfying

eval(s) == some_float

needn't necessarily be correctly rounded to s's precision.
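A small sketch of that distinction, added for illustration and assuming Python 3 with 754 doubles. The value and the hand-picked string `s_other` are hypothetical examples, not taken from the thread: a string can evaluate to the right float without being the correctly rounded decimal at its own precision.

```python
# Illustration only: two 17-digit strings that both read back as the same
# double, only one of which is correctly rounded to 17 digits.
x = 0.1 + 0.2

s_rounded = '%.17g' % x           # correctly rounded to 17 significant digits
s_other = '0.30000000000000005'   # hand-picked: same length, last digit differs

print(s_rounded)                  # 0.30000000000000004
print(float(s_other) == x)        # True
print(s_other == s_rounded)       # False
```

Both strings satisfy the round-trip test, so that test alone does not pin down which digits get printed.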

> If you are very concerned about this being a complex and/or slow operation, a fine compromise would be a "try-12" algorithm: if %.12g is accurate then use it, and otherwise use %.17g. This is simple, easy to implement, produces reasonable results in most cases, and has a small bound on CPU cost.)

Also unique to Python.

> def try12(x):
>     rep = '%.12g' % x
>     if float(rep) == x:
>         return rep
>     return '%.17g' % x
>
> def shortestaccurate(x):
>     for places in range(17):
>         fmt = '%.' + str(places) + 'g'
>         rep = fmt % x
>         if float(rep) == x:
>             return rep
>     return '%.17g' % x
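To see how the two proposed functions behave, here is a hedged usage sketch. The functions are restated from the message so the block runs on its own; the three sample values are chosen only for illustration, and Python 3 with 754 doubles is assumed.

```python
# Restated from the message so this block is self-contained.
def try12(x):
    rep = '%.12g' % x
    if float(rep) == x:
        return rep
    return '%.17g' % x

def shortestaccurate(x):
    for places in range(17):
        rep = ('%.' + str(places) + 'g') % x
        if float(rep) == x:
            return rep
    return '%.17g' % x

# Sample inputs (illustrative): needs 1 digit, needs 17, needs 13.
samples = (0.1, 0.1 + 0.2, 0.1234567890123)
results = [(try12(x), shortestaccurate(x)) for x in samples]

for x, (a, b) in zip(samples, results):
    # Both candidates must read back as the original double.
    assert float(a) == x and float(b) == x
    print(a, b)
```

For 0.1 both functions return '0.1', and for 0.1 + 0.2 both fall through to the 17-digit form. For 0.1234567890123, `%.12g` does not round-trip, so try12 falls back to `%.17g`, while shortestaccurate stops at 13 places and returns '0.1234567890123'.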

Our disagreement is more fundamental than that. Then again, it always has been <smile!>.


