Issue 36493: Add math.midpoint(a,b) function

Created on 2019-03-31 13:10 by scoder, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (7)
msg339257 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-03-31 13:10
I recently read a paper¹ about the difficulty of calculating the most exact midpoint between two floating point values in the face of overflow, underflow and rounding errors. The intention is to ensure that midpoint(a,b) is actually within the interval [a,b], and is the floating point number in that interval that is closest to the real midpoint, i.e. (a+b)/2. It feels like a function that should be part of Python's math module.

¹ https://hal.archives-ouvertes.fr/file/index/docid/576641/filename/computing-midpoint.pdf

The author proposes the following implementation (pages 20/21):

    midpoint(a,b) =
        a if a == b else                          # covers midpoint(inf, inf)
        0 if a == -b else                         # covers midpoint(-inf, +inf)
        -realmax if a == -inf else                # min. double value
        +realmax if b == +inf else                # max. double value
        round_to_nearest_even((a - a/2) + b/2)

I guess NaNs should simply pass through as in other functions, i.e. midpoint(a,nan) == midpoint(nan,b) == nan.

The behaviour for [a,inf] is decidedly up for discussion. There are certainly cases where midpoint(a,+inf) would best return +inf, but +realmax as an actual finite value also seems reasonable. OTOH, it's easy for users to promote inf->realmax or realmax->inf or even inf->a themselves, just like it's easy to check for +/-inf before calling the function. It just takes a bit longer to do these checks on the user side. There could also be a "mode" argument that makes it return one of: a or b (i.e. the finite bound), +/-realmax, or +/-inf in the two half-infinity cases.

What do you think about this addition?
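[Editor's note: for concreteness, here is a rough Python transcription of the paper's pseudocode above. The function name, the a <= b precondition and the NaN handling are one reading of the proposal, not an official implementation:]

    import math
    import sys

    def midpoint(a, b):
        # Sketch only: assumes a <= b and the default IEEE 754
        # round-to-nearest-even mode used by CPython floats.
        if math.isnan(a) or math.isnan(b):
            return math.nan              # NaNs pass through
        if a == b:                       # covers midpoint(inf, inf)
            return a
        if a == -b:                      # covers midpoint(-inf, +inf)
            return 0.0
        if a == -math.inf:               # clamp to the minimum double
            return -sys.float_info.max
        if b == math.inf:                # clamp to the maximum double
            return sys.float_info.max
        return (a - a/2) + b/2           # avoids the overflow of a + b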
msg339275 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2019-03-31 20:22
What's the argument for midpoint(inf, -inf) == 0? This feels wrong to me. I'm certainly not an expert on math, but I still remember that there are different kinds of infinities. For example, 'n**n' approaches infinity 'faster' than '-(2**n)'.
msg339281 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-04-01 07:03
Yes, I'd definitely expect `midpoint(inf, -inf)` to be `nan`.
msg339282 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-04-01 07:12
Special cases aside, I think this is a YAGNI. The "obvious" formulas `(a + b)/2` and `0.5 * (a + b)` _do_ do exactly the right thing (including giving a perfectly correctly-rounded answer with round-ties-to-even on a typical IEEE 754-using machine) provided that subnormals and values very close to the upper limit are avoided. If you're doing floating-point arithmetic with values of size > 1e300, you've probably already got significant issues.

I could see specialist uses for this, e.g., in a general-purpose bisection algorithm, but I'm not convinced it's worth adding something to the math library just for that.
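[Editor's note: a quick interactive illustration of the boundary Mark describes; the values are chosen here to trigger the overflow, they do not come from the thread:]

    >>> a = b = 1e308
    >>> (a + b) / 2          # naive formula overflows near the top of the range
    inf
    >>> (a - a/2) + b/2      # the paper's rearrangement stays finite
    1e+308
    >>> (3.0 + 4.0) / 2      # away from the extremes the naive formula is exact
    3.5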
msg339285 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2019-04-01 08:02
I'm inclined to agree with Mark - the wholly naive algorithm is already "perfect" away from extreme (whether large or small) magnitudes, and can't be beat for speed either. This kind of thing gets much more attractive in IEEE single precision, where getting near the extremes can be common in some apps.

That said, at a higher level there's no end to FP functions that _could_ be implemented to be more accurate in some (even many) cases. That it's possible isn't enough, on its own, to justify all the eternal costs of supplying it. Significant real-life use cases make for far more compelling justification.

For new FP stuff, the first thing I ask is "does numpy/scipy/mpmath supply it?". I don't know in this case, but if the answer is "no", then it's a pretty safe bet that number-crunching Python users have insignificant real use for it. If the answer is "yes", then the question changes to whether Pythoneers other than those _already_ using the major number-crunching Python packages have significant real uses for it.

On that basis, e.g., accepting the whole `statistics` module was an easy call (many real uses outside of just specialists), but adding a function to do LU decomposition of a matrix is an easy reject (exceedingly useful, in fact, but those who need it almost certainly already use the Python packages that already supply it).
msg339304 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-01 19:41
I buy the YAGNI argument and won't fight for this. Thanks for your input.
msg339349 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-04-02 16:08
I see this has been rejected for the math module, but I wonder whether it should be considered for Decimal? I recall Mark Dickinson demonstrating this lovely little bit of behaviour:

    py> from decimal import getcontext, Decimal
    py> getcontext().prec = 3
    py> x = Decimal('0.516')
    py> y = Decimal('0.518')
    py> (x + y) / 2
    Decimal('0.515')

To prove it's not a fluke, or an artifact of tiny numbers:

    py> getcontext().prec = 28
    py> a = Decimal('0.10000000000000000009267827205')
    py> b = Decimal('0.10000000000000000009267827207')
    py> a <= (a + b)/2 <= b
    False
    py> a = Decimal('710000000000000000009267827205')
    py> b = Decimal('710000000000000000009267827207')
    py> a <= (a + b)/2 <= b
    False

Thoughts?
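[Editor's note: the failure above comes from rounding twice, once in the sum and once in the division. One possible workaround, not discussed in the thread, is to do the arithmetic with a couple of guard digits and round only once at the end; the name dec_midpoint is hypothetical, and the containment guarantee assumes a and b are themselves representable at the current precision - which the 29-digit operands in the second example above are not:]

    from decimal import Decimal, getcontext, localcontext

    def dec_midpoint(a, b):
        # Hypothetical helper, not part of the decimal module: compute
        # (a + b) / 2 with two extra digits of precision, then round
        # once on the way out. Rounding is monotonic, so if a and b are
        # representable at the ambient precision the result stays in [a, b].
        with localcontext() as ctx:
            ctx.prec += 2
            m = (a + b) / 2
        return +m    # unary plus rounds to the ambient context's precision

With the first example above this returns the expected value:

    py> getcontext().prec = 3
    py> dec_midpoint(Decimal('0.516'), Decimal('0.518'))
    Decimal('0.517')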
History
Date User Action Args
2022-04-11 14:59:13 admin set github: 80674
2019-04-02 16:08:42 steven.daprano set nosy: + steven.daprano; messages: +
2019-04-01 19:41:14 scoder set status: open -> closed; resolution: rejected; messages: +; stage: resolved
2019-04-01 08:02:44 tim.peters set messages: +
2019-04-01 07:13:32 mark.dickinson set nosy: + tim.peters
2019-04-01 07:12:31 mark.dickinson set messages: +
2019-04-01 07:03:08 mark.dickinson set messages: +
2019-03-31 20:22:44 christian.heimes set nosy: + christian.heimesmessages: +
2019-03-31 13:10:06 scoder create