Issue 7117: Backport py3k float repr to trunk

Created on 2009-10-13 08:30 by mark.dickinson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
round_fixup.patch mark.dickinson, 2009-11-03 16:15 Backport of py3k round to trunk.
Messages (33)
msg93918 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 08:30
See the thread starting at:

http://mail.python.org/pipermail/python-dev/2009-October/092958.html

Eric suggested that we don't need a separate branch for this; sounds fine to me. It should still be possible to do the backport in stages, though. Something like the following?

(1) Check in David Gay's code plus necessary build changes, configuration steps, etc; conversions still use the old code.
(2) Switch to using the new code for float -> string (str, repr, float formatting) and string -> float conversions (float, complex constructors, numeric literals in Python code). [Substeps?]
(3) Fix up builtin round function to use the new code.
(4) Make any necessary fixes to the documentation. (Raymond, I assume you'll take care of the whatsnew changes when the time comes?)

(1), (3) and (4) should be straightforward. (2) is where most of the work is, I think. I think it should be possible to do the stage (2) work in pieces without breaking too much.
msg94414 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 13:41
Some key revision numbers from the py3k short float repr, for reference:

r71663: include Gay's code, build and configure fixes
r71723: backout SSE2 detection code added in r71663
r71665: rewrite of float formatting code to use Gay's code

Backported most of r71663 and r71723 to trunk in:

r75651: Add dtoa.c, dtoa.h, update license.
r75658: configuration changes - detect float endianness, add functions to get and set x87 control word, and determine when short float repr can be used.

Significant changes from r71663 not yet included:

* Misc/NEWS update
* Lib/test/formatfloat_testcases.txt needs updating to match py3k.
msg94417 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 14:06
r75666: Add sys.float_repr_style attribute.
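A small illustration of how the new attribute can be inspected from Python. This is only a sketch; the variable names are made up for the example, and the two possible values ('short' and 'legacy') are those documented for sys.float_repr_style.

```python
import sys

# 'short' means repr() produces the shortest string that round-trips;
# 'legacy' means the older behaviour based on 17 significant digits.
style = sys.float_repr_style

if style == "short":
    description = "repr(0.1) is %r" % 0.1
else:
    description = "legacy repr in use: repr(0.1) is %r" % 0.1

print(style)
print(description)
```

Either way, float(repr(x)) should give back x; the attribute only says which string form repr chooses.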
msg94428 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 16:02
r75672: temporarily disable the short float repr while we're putting the pieces in place. When testing, the disablement can be disabled (hah) by defining the PY_SHORT_FLOAT_REPR preprocessor symbol, e.g. (on Unix) with

    CC='gcc -DPY_SHORT_FLOAT_REPR' ./configure && make
msg94430 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-24 17:04
I think the next step on my side is to remove _PyOS_double_to_string, and make all of the internal code call PyOS_double_to_string. The distinction is that _PyOS_double_to_string gets passed a buffer and length, but PyOS_double_to_string returns allocated memory that must be freed. David Gay's code (_Py_dg_dtoa) returns allocated memory, so that's the easiest interface to propagate internally. That's the approach we used in the py3k branch. I'll start work on it. So Mark's work should be mostly config stuff and hooking up Gay's code to PyOS_double_to_string. I think it will basically match the py3k version. The existing _PyOS_double_to_string will become the basis for the fallback code for use when PY_NO_SHORT_FLOAT_REPR is defined (and it will then be renamed PyOS_double_to_string and have its signature changed to match).
msg94431 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 18:15
One issue occurs to me: should the backport change the behaviour of the round function? In py3k, round consistently uses round-half-to-even for halfway cases. In trunk, round semi-consistently uses round-half-away-from-zero (and this is documented). E.g., round(1.25, 1) will give 1.2 in py3k and (usually) 1.3 in trunk.

I definitely want to use Gay's code for round in 2.7, since having round work sensibly is part of the motivation for the backport in the first place. But this naturally leads to a round-half-to-even version of round, since the Python-adapted version of Gay's code isn't capable of doing anything other than round-half-to-even at the moment.

Options:

(1) change round in 2.7 to do round-half-to-even. This is easy, natural, and means that round will agree with float formatting (which does round-half-to-even in both py3k and trunk). But it may break existing applications. However: (a) those applications would need fixing anyway to work with py3k, and (b) I have little sympathy for people depending on behaviour of rounding of *binary* floats for halfway *decimal* cases. (Decimal is another matter, of course: there it's perfectly reasonable to expect guaranteed rounding behaviour.) It's more complicated than that, though, since if rounding becomes round-half-to-even for floats, it should also change for integers, Fractions, etc.

(2) have round stick with round-half-away-from-zero. This may be awkward to implement (though I have some half-formed ideas about how to make it work), and would lead to round occasionally not agreeing with float formatting. For example:

    >>> '{0:.1f}'.format(1.25)
    '1.2'
    >>> round(1.25, 1)
    1.3
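To make the two tie-breaking rules concrete, here is a minimal sketch using the decimal module, where the rounding mode can be chosen explicitly. The value 1.25 is exactly representable in binary, so it is a genuine halfway case. The variable names are illustrative only; ROUND_HALF_UP is decimal's name for round-half-away-from-zero.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

x = Decimal("1.25")      # exact halfway case between 1.2 and 1.3
step = Decimal("0.1")    # round to one decimal place

# Tie goes to the neighbour whose last digit is even (py3k round, float formatting).
half_even = x.quantize(step, rounding=ROUND_HALF_EVEN)

# Tie goes away from zero (the documented trunk behaviour of round).
half_away = x.quantize(step, rounding=ROUND_HALF_UP)

# Float formatting uses round-half-to-even.
formatted = "{0:.1f}".format(1.25)

print(half_even, half_away, formatted)
```

Under option (1) round would agree with `formatted`; under option (2) it would agree with `half_away`.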
msg94433 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-24 18:29
Adding tim_one as nosy. He'll no doubt have an opinion on rounding. And hopefully he'll share it!
msg94436 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-24 21:17
Another thing to consider is that in py3k we removed all conversions of 'f' to 'g', such as this one from Objects/unicodeobject.c:

    if (type == 'f' && fabs(x) >= 1e50) type = 'g';

Should we also do that as part of this exercise? Or should it be another issue, or not done at all?
msg94447 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-10-25 05:56
+1 on backporting the 'f' and 'g' work also. We will be well served by getting the two code bases in sync with one another. Eliminating obscure differences makes it easier to port code from 2.x to 3.x.
msg94491 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 16:03
r75720: Backport py3k version of pystrtod.c to trunk. There are still some (necessary) differences between the two versions, which should become unnecessary once everything else is hooked up. The differences should be re-examined later.
msg94494 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 17:01
[Eric, on removing f to g conversions]
> Should we also do that as part of this exercise? Or should it be another
> issue, or not done at all?

I'd definitely like to remove the f to g conversion in trunk. I don't see any great need to open a separate issue for that. (Was there one already for the py3k removal?)
msg94495 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 17:03
Found it: issue #5859 was opened for the removal of the f -> g conversion in py3k. We could just add a note to that issue.
msg94510 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 21:19
r75730: backport pystrtod.h
r75731: Fix floatobject.c to use PyOS_string_to_double.
msg94528 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 22:30
r75739: Fix complexobject.c to use PyOS_string_to_double.
msg94550 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 11:37
r75743: Fix cPickle.c to use PyOS_string_to_double.
msg94551 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 12:13
r75745: Fix stropmodule.c to use PyOS_string_to_double.
msg94572 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 18:37
r75824: Fix ast.c to use PyOS_string_to_double.
msg94575 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 19:43
r75846: Fix marshal.c to use PyOS_string_to_double.
msg94615 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-28 08:46
r75913: Fix _json.c to use PyOS_string_to_double. Change made after consulting with Bob Ippolito. This completes the removal of calls to PyOS_ascii_strtod.
msg94655 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-29 10:17
The next job is to deprecate PyOS_ascii_atof and PyOS_ascii_strtod, I think. I'll get to work on that.
msg94747 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-31 09:44
r75979: Deprecate PyOS_ascii_atof and PyOS_ascii_strtod; document PyOS_double_to_string.
msg94862 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-03 16:15
Here's a patch for correctly-rounded round in trunk. This patch doesn't change the rounding behaviour between 2.6 and 2.7: it's still doing round-half-away-from-zero instead of round-half-even. It was necessary to detect and treat halfway cases specially to make this work. Removing this special case code would be easy, so we can decide later whether it's worth changing round to do round-half-to-even for 2.7. I want to let this sit for a couple of days before I apply it.
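The patch itself is C. As a rough Python illustration of what "correctly rounded" round-half-away-from-zero means, the sketch below works from the exact binary value of the float, so only true halfway cases are treated as ties. This is not the method used in round_fixup.patch; round_half_away and _CTX are hypothetical names invented for the example.

```python
import math
from decimal import Context, Decimal, ROUND_HALF_UP

# Generous precision so quantize() has room for the exact expansion of a double.
_CTX = Context(prec=2000)


def round_half_away(x, ndigits=0):
    """Hypothetical helper: round x to ndigits places, ties away from zero."""
    if math.isinf(x) or math.isnan(x):
        return x
    exact = Decimal(x)                    # exact value of the binary float
    step = Decimal(1).scaleb(-ndigits)    # 10 ** -ndigits, as a Decimal
    rounded = exact.quantize(step, rounding=ROUND_HALF_UP, context=_CTX)
    return float(rounded)                 # correctly rounded back to a float


print(round_half_away(1.25, 1))    # true halfway case, goes away from zero
print(round_half_away(2.675, 2))   # stored value is slightly below 2.675
```

The second call shows why halfway cases need care: 2.675 is not exactly representable, and the nearest double lies just below the midpoint, so a correctly rounded result is 2.67 rather than 2.68.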
msg95444 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-18 19:35
r76373: Backport round.
msg95507 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-19 18:44
Short float repr is now enabled in r76379. Misc/NEWS entries added/updated in r76411.
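A quick way to sanity-check the round-trip property that the short repr is meant to preserve. This is a sketch only; round_trips and the sample values are chosen for illustration, not taken from the test suite.

```python
import sys


def round_trips(x):
    """True if converting x to its repr and back gives the same float."""
    return float(repr(x)) == x


samples = [0.1, 1.1, 1e-5, 2.0 / 3.0, 1e22, 5e-324]
results = [(repr(x), round_trips(x)) for x in samples]

for text, ok in results:
    print(text, ok)

print("float_repr_style:", sys.float_repr_style)
```

With the short repr enabled, repr(1.1) is '1.1' rather than '1.1000000000000001', and the round trip still holds.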
msg95644 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-23 18:48
r76465 removes the fixed-length buffer for formatting floats, hence removes the restriction on the precision. This should make removal of the %f -> %g switch straightforward.
msg95658 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-23 20:56
r76474: Remove %f -> %g switch.
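The visible effect of this change can be checked from Python. The sketch below assumes an interpreter that includes the change (2.7 or 3.x); on older versions '%f' applied to values of 1e50 and above silently fell back to '%g' style output.

```python
big = 1e50

# With the %f -> %g switch removed, '%f' always yields fixed-point notation,
# however large the value is.
text = "%f" % big
integer_part, fraction_part = text.split(".")

print(text)
print(len(integer_part), fraction_part)
```

The digits printed are those of the exact binary value nearest to 1e50, so the integer part is not simply a 1 followed by zeros.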
msg95700 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-24 21:46
I think we're pretty much done here. I'd still like to produce a more complete set of float formatting test cases at some point (for both trunk and py3k), but that's a separate activity. Eric, Raymond: can you spot anything we've missed?
msg95703 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-11-24 22:46
Thanks for tackling the last few bits, Mark. I think we're done, although I admit I haven't verified what state the documentation is in. I suggest we close this issue and if any problems occur open them as new issues.
msg95714 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-25 10:41
Thanks, Eric. The only remaining documentation issues I'm aware of are in Doc/tutorial/floatingpoint.rst. I think Raymond is going to update this to match the py3k version. I'll call this done, then! Thanks for all your help.
msg139402 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2011-06-29 09:48
Wondered if you guys had heard of some recent advances in the state of the art in this field. I'm sure you have, but thought I'd link it here anyway. Quote taken from this article (which links to relevant papers):

http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/

In 2010, Florian Loitsch published a wonderful paper in PLDI, "Printing floating-point numbers quickly and accurately with integers", which represents the biggest step in this field in 20 years: he mostly figured out how to use machine integers to perform accurate rendering! Why do I say "mostly"? Because although Loitsch's "Grisu3" algorithm is very fast, it gives up on about 0.5% of numbers, in which case you have to fall back to Dragon4 or a derivative.

If you're a language runtime author, the Grisu algorithms are a big deal: Grisu3 is about 5 times faster than the algorithm used by printf in GNU libc, for instance. A few language implementors have already taken note: Google hired Loitsch, and the Grisu family acts as the default rendering algorithms in both the V8 and Mozilla Javascript engines (replacing David Gay's 17-year-old dtoa code). Loitsch has kindly released implementations of his Grisu algorithms as a library named double-conversion.
msg139428 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-06-29 15:23
Hadn't seen that. Interesting!
msg139444 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-06-29 18:22
Thanks for the link :-)
msg139467 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-06-30 08:19
I've filed to track this last idea.
History
Date User Action Args
2022-04-11 14:56:53 admin set github: 51366
2011-06-30 11:17:01 vstinner set nosy: + vstinner
2011-06-30 08:19:33 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: +
2011-06-29 18:22:50 rhettinger set messages: +
2011-06-29 15:23:24 mark.dickinson set messages: +
2011-06-29 09:48:42 michael.foord set nosy: + michael.foord; messages: +
2009-11-25 10:41:11 mark.dickinson set status: open -> closed; resolution: accepted; messages: +; stage: resolved
2009-11-24 22:46:41 eric.smith set messages: +
2009-11-24 21:46:20 mark.dickinson set messages: +
2009-11-23 20:56:26 mark.dickinson set messages: +
2009-11-23 18:48:26 mark.dickinson set messages: +
2009-11-19 18:44:41 mark.dickinson set messages: +
2009-11-18 19:35:41 mark.dickinson set messages: +
2009-11-03 16:15:16 mark.dickinson set files: + round_fixup.patch; keywords: + patch; messages: +
2009-10-31 09:44:47 mark.dickinson set messages: +
2009-10-29 10:17:42 mark.dickinson set messages: +
2009-10-28 08:46:29 eric.smith set messages: +
2009-10-27 19:43:22 eric.smith set messages: +
2009-10-27 18:37:04 eric.smith set messages: +
2009-10-27 12:13:34 eric.smith set messages: +
2009-10-27 11:37:44 eric.smith set messages: +
2009-10-26 22:30:02 mark.dickinson set messages: +
2009-10-26 21:19:17 mark.dickinson set messages: +
2009-10-26 17:03:15 mark.dickinson set messages: +
2009-10-26 17:01:09 mark.dickinson set messages: +
2009-10-26 16:03:25 mark.dickinson set messages: +
2009-10-25 05:56:18 rhettinger set messages: +
2009-10-24 21:17:23 eric.smith set messages: +
2009-10-24 18:29:43 eric.smith set nosy: + tim.peters; messages: +
2009-10-24 18:15:40 mark.dickinson set messages: +
2009-10-24 17:04:39 eric.smith set messages: +
2009-10-24 16:02:51 mark.dickinson set messages: +
2009-10-24 14:06:27 mark.dickinson set messages: +
2009-10-24 13:41:04 mark.dickinson set messages: +
2009-10-13 08:30:25 mark.dickinson create