[Python-Dev] Expert floats

Michael Chermside mcherm at mcherm.com
Wed Apr 7 10:51:49 EDT 2004


Ka-Ping writes:

How did the class react to floating-point? Seeing behaviour like this:

    >>> 3.3
    3.2999999999999998
    >>>

confused and frightened them, and continues to confuse and frighten almost everyone i teach.

As well it should, until they understand it. I'll come back to this point in a moment.

Everything in Python -- everything in computers, in fact -- is a model. We don't expect the model to be perfectly accurate or to be completely free of limitations. IEEE 754 happens to be the prevalent model for the real number line. We don't print every string with a message after it saying "WARNING: MAXIMUM LENGTH 4294967296", and we shouldn't do the same for floats.

Now here is where I disagree with you quite strongly. I am an experienced programmer, and I DO expect my computer programs to be "perfectly accurate". If I calculate 217 + 349, I expect to get 566 each and every time. If it sometimes came out to be 567 instead, I would throw a fit!

I also expect my program to be free of limitations. For instance, if I calculate 2147483640 + 9, I EXPECT to get 2147483649. In Python, I get 2147483649L instead... a minor wart that I can easily live with. If I am coding in C, I will instead get -2147483647, which is WAY off. I UNDERSTAND why this happens in C, and I am forced to live with it, but I don't LIKE or EXPECT it, and to be honest, trying to work around this limitation gives me headaches some days!
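To make the contrast concrete, here is a minimal Python sketch that is not part of the original message. Python integers grow as needed, while `ctypes.c_int32` is used here only as a stand-in for a 32-bit C `int` (an assumption about the C platform): it keeps the low 32 bits and so wraps around.

```python
import ctypes

a, b = 2147483640, 9

# Python integers are arbitrary precision, so the sum is simply correct.
python_sum = a + b

# ctypes.c_int32 does no overflow checking: it keeps the low 32 bits and
# reinterprets them as a signed value, mimicking two's-complement wraparound.
# (Signed overflow is formally undefined behaviour in C; this only models
# what common 32-bit implementations do.)
wrapped_sum = ctypes.c_int32(a + b).value

print(python_sum)
print(wrapped_sum)
```

The second value is negative because the true sum is larger than the biggest value a signed 32-bit integer can hold.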

For another example, I realize that if I work with a large enough data set it won't fit into my machine's memory. But I would be very disappointed if someone designed a programming language that limited data sizes because of this. When I have a problem, I expect to be able to go out and buy more memory (or upgrade my machine, my OS, or whatever component is the limiting factor) and have things "just work".

And it's not just me... computer newbies have an even STRONGER belief that computers are "always accurate" than I do! They don't understand the degree to which human error and bugs can influence results, and most people figure that if the computer calculates some numbers, its answer is guaranteed to be right.

So how can I stand to use floating point numbers? Well, the answer is that I am aware of the limitations of floating point calculations. I would never hesitate to use floating point to plot a graph, and I would never choose to use floating point to perform financial calculations. I'm not sufficiently well-versed in the details of floating point behavior to always know when floating point would be acceptable to use, so I take a conservative approach and use it only when I know that it is safe to use.

And I believe that this is exactly what your students ought to do.
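As an illustration of that conservative approach, the sketch below compares binary floats with Python's `decimal` module for money-like amounts. The `decimal` module is not mentioned in the message; using it for currency is an assumption made here for illustration, and the amounts are made up.

```python
from decimal import Decimal

# Binary floats cannot represent 0.10 or 0.20 exactly, so their sum is
# not the same float as the one nearest to 0.30.
float_total = 0.10 + 0.20
float_matches = (float_total == 0.30)

# Decimal values built from strings hold the decimal digits exactly.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_matches = (decimal_total == Decimal("0.30"))

print(float_matches, decimal_matches)
```

Building each `Decimal` from a string matters: constructing it from a float would carry the binary rounding error along with it.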

Not everyone runs into floating-point corner cases. In fact, very few people do. I have never encountered such a problem in my entire history of using Python. And if you surveyed the user community, i'm sure you would find that only a small minority cares enough about the 17th decimal place for the discrepancy to be an issue.

That's simply untrue. Intel conducted (unintentionally) an experiment on just this question in 1994: try googling for "pentium bug". The conditions under which that bug affected results were far more obscure than the "corner cases" for floating point. And yet there was a huge hue and cry, and Intel was forced to replace tens of thousands of CPUs. Clearly, people DO care about tiny discrepancies.

So, to come back to my point at the beginning, let me tell you what I say when teaching Python to beginners and they complain about the odd display of floating point in Python. I tell them that when computers work with decimals, they store things internally in binary (everyone "knows" that computers work only in 1s and 0s), and this causes tiny rounding errors. I tell them that Python makes a point of SHOWING you those rounding errors, because it doesn't want to lie to you.
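A short sketch of what is going on, assuming IEEE 754 double precision. Formatting with 17 significant digits gives the 3.2999999999999998 seen in the interpreter session quoted earlier. More recent Python versions print the shortest string that round-trips, so `repr(3.3)` shows `3.3` there, but the stored value is the same.

```python
from decimal import Decimal

x = 3.3

# 17 significant digits are enough to identify any double uniquely.
seventeen_digits = format(x, ".17g")

# Decimal(x) converts the stored binary value exactly, with no rounding.
exact_value = Decimal(x)

print(seventeen_digits)
print(exact_value)
print(exact_value == Decimal("3.3"))
```

The last line prints `False`: the stored double is slightly below 3.3, which is the rounding error the longer display exposes.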

Then (depending on my audience), I usually follow up with a quick demo. I launch Microsoft Excel, and in the top cell I enter "-1". Then I enter "=1/9" (don't forget the "=" or Excel will treat it as a date) in each of the next nine cells. In the next cell I type "=sum(A1:A10)", but before pressing enter, I ask the class what the sum is. As soon as they're convinced it will be zero, I display the result. After that demo, people are usually much happier about Python, which "looks funny" but "doesn't lie about its own mistakes".
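The same experiment can be tried in Python. This is a rough sketch rather than a reproduction of Excel's behaviour: it assumes IEEE 754 doubles and plain left-to-right addition, and the `cells` list is a hypothetical stand-in for A1:A10.

```python
from fractions import Fraction

# Hypothetical stand-in for cells A1:A10: -1 followed by nine copies of 1/9.
cells = [-1.0] + [1 / 9] * 9

# Add the values one at a time, left to right, in binary floating point.
float_total = 0.0
for value in cells:
    float_total += value

# The same sum with exact rational arithmetic, for comparison.
exact_total = sum([Fraction(-1)] + [Fraction(1, 9)] * 9)

print(float_total)   # tiny, but not zero
print(exact_total)   # exactly 0
```

The explicit loop is deliberate: it keeps the addition order fixed, so the result does not depend on how a particular `sum()` implementation accumulates floats.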

I firmly believe that ANYONE (even beginners) who uses floating point needs to understand that it's not exact. They don't need to know the details, but they need to grasp the concept that a computer is NOT an Oracle, and this demonstration helps.

-- Michael Chermside


