bltinmodule.c

Terry Reedy tjreedy at udel.edu
Fri Jun 27 00:00:23 CEST 2008

Raymond Hettinger wrote:

From: "Guido van Rossum" <guido at python.org>

Let's step back and discuss the API some more.

- Do we need all three? I think so -- see the reasons below.

I would prefer 1, see below.

Of course, my first choice was not on your list. To me, the one obvious way to convert a number to an eval-able string in a different base is to use bin(), oct(), or hex(). But that appears to be off the table for reasons that I've read but that don't make any sense to me.

Let me try. I am one of those who prefer smaller to bigger for the core language to make it easier to learn and teach. But, to me, there is a deeper consideration that applies here. A Python interpreter, human or mechanical, must do exact integer arithmetic. But a Python interpreter does not have to convert float literals to fixed-size binary and does not have to do float arithmetic with binary representations that are usually approximations. (Indeed, human interpreters do neither, which is why they are often surprised at CPython's float output, and which is why this function will be useful.) If built-in functions are part of the language definition, as Guido just clarified, their definition and justification should not depend on the float implementation.

It seems simple enough, extendable enough, and clean enough for bin/oct/hex to use __index__ if present and __float__ if not.

To me, a binary representation, in whatever base, of a Decimal is senseless. The point of this issue is to reveal the exact binary bit pattern of float instances.

- If so, why not .tobase(N)? (Even if N is restricted to 2, 8 and 16.) I don't think it's user-friendly to have the float-to-bin API fail to parallel the int-to-bin API. IMO, it should be done the same way in both places.

I would like to turn this around. I think that 3 nearly identical built-ins is 2 too many. I am going to propose on the Py3 list that bin, oct, and hex be condensed to one function, bin(integer, base), with base restricted to 2, 8, or 16 and defaulting to 2, for 3.1 if not 3.0. Base 8 and 16 are, to me, compressed binary.
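A minimal sketch of the single-function idea, for illustration only. The name to_base is hypothetical (chosen so it does not shadow the existing bin built-in), and it simply dispatches to the three existing built-ins rather than replacing them. The output shown assumes Python 3 prefixes.

```python
import functools

def to_base(n, base=2):
    """Hypothetical single entry point for bin/oct/hex (illustration only)."""
    converters = {2: bin, 8: oct, 16: hex}
    if base not in converters:
        raise ValueError("base must be 2, 8, or 16")
    # The built-ins already accept any object that defines __index__.
    return converters[base](n)

print(to_base(10))        # default base is 2
print(to_base(255, 16))

# Someone with scattered base-16 calls can bind the base once.
to_hex = functools.partial(to_base, base=16)
print(to_hex(4096))
```

With base 2 as the default, the common call needs no second argument, and functools.partial covers the repeated-base case raised in the quoted objection.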

Three methods is definitely too many for a somewhat subsidiary function. So, I would like to see float.bin([base=2])

I don't find it attractive in appearance. Any use case I can imagine involves multiple calls using the same base and I would likely end up using functools.partial or somesuch to factor out the repeated use of the same variable.

Make the base that naive users want to see the default. I believe this to be 2. Numerical analysts who want base 16 can deal with partial if they really have scattered calls (as opposed to a few within loops) and cannot deal with typing '16' over and over.

>>> bin(.6)
'0b10011001100110011001100110011001100110011001100110011 * 2.0**-53'

... Both of those bits of analysis become awkward with the tobase() method:

>>> (.6).tobase(2)

Eliminate the unneeded parentheses and default value, and this is

.6.bin() which is just one extra char.
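For readers who want to reproduce the string quoted above, here is a rough sketch. The function name float_bin is made up for this example, and the code assumes IEEE 754 doubles with a 53-bit significand; it is not the patch attached to the issue.

```python
import math

def float_bin(x):
    """Return a finite float as 'integer * 2.0**exponent' (illustrative sketch)."""
    # frexp gives x == m * 2**e with 0.5 <= abs(m) < 1 (or m == 0.0 for zero).
    m, e = math.frexp(x)
    # Scaling by 2**53 is exact and yields an integer-valued float,
    # assuming a 53-bit significand. inf and nan are not handled here.
    n = int(m * 2**53)
    return '%s * 2.0**%d' % (bin(n), e - 53)

s = float_bin(.6)
print(s)
print(eval(s) == .6)   # the string evaluates back to the same float
```

The eval() check works because the integer has at most 53 bits and multiplying by a power of two is exact.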

- What should the output format be? I know you originally favored 0b10101.010101 etc. Now that it's not overloaded on the bin/oct/hex builtins, the constraint that it needs to be an eval()-able expression may be dropped (unless you see a use case for that too). The other guys convinced me that round tripping was important and that there is a good use case for being able to read/write precisely specified floats in a platform-independent manner.

Definitely. The paper I referenced in the issue discussion (http://bugs.python.org/issue3008), which has been mentioned a few times here, is http://hal.archives-ouvertes.fr/docs/00/28/14/29/PDF/floating-point-article.pdf

Also, my original idea didn't scale well without exponential notation -- i.e. bin(125E-100) would have a heckofa lot of leading zeroes. Terry and Mark also pointed out that hex with exponential notation was the normal notation used in papers on floating point arithmetic. Lastly, once I changed over to the new way, it dramatically simplified the implementation.

I originally thought I preferred the 'hexponential' notation that uses P for power instead of E for exponential. But with multiple bases, the redundancy of repeating the bases is ok, and being able to eval() without changing the parser is a plus. But I would prefer losing the spaces around the ** operator.
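For comparison, the P-style notation is what the float.hex() and float.fromhex() methods of later Python releases use. The sketch below only demonstrates the round trip with those methods; the helper name roundtrip_hex is invented for the example.

```python
def roundtrip_hex(x):
    """Write a float in hex-with-exponent form and read it back (illustration)."""
    s = x.hex()                 # e.g. '0x1.3333333333333p-1' for 0.6
    return s, float.fromhex(s)

s, y = roundtrip_hex(0.6)
print(s)
print(y == 0.6)   # the value survives the text round trip exactly
```

Unlike the '* 2.0**' form, this string cannot be passed to eval(), so reading it back needs a dedicated parser such as float.fromhex().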

Terry Jan Reedy
