Issue 28327: statistics.geometric_mean gives incorrect results for mixed int/float inputs

Created on 2016-10-01 12:18 by mark.dickinson, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)

msg277808 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2016-10-01 12:18

The following calculations should all be giving the same result:

>>> import statistics
>>> statistics.geometric_mean([2, 3, 5, 7])
3.80675409583932
>>> statistics.geometric_mean([2, 3, 5, 7.0])
1.6265765616977859
>>> statistics.geometric_mean([2, 3, 5.0, 7.0])
2.4322992790977875
>>> statistics.geometric_mean([2, 3.0, 5.0, 7.0])
3.201085872943679
>>> statistics.geometric_mean([2.0, 3.0, 5.0, 7.0])
3.80675409583932

(Correct result is 3.80675409583932.)
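As an independent cross-check of the expected value, here is a minimal sketch that computes the geometric mean as exp(mean(log(x))). The helper name log_geometric_mean is hypothetical and this is not the code path used by statistics._product; it only illustrates that the answer should not depend on whether the inputs are ints or floats.

```python
import math

def log_geometric_mean(data):
    """Hypothetical helper: geometric mean via exp(mean(log(x))).

    Assumes every value is a positive int or float; math.log raises
    ValueError for zero or negative inputs.
    """
    values = list(data)
    if not values:
        raise ValueError("need at least one data point")
    # fsum limits accumulated rounding error in the sum of logarithms.
    return math.exp(math.fsum(math.log(x) for x in values) / len(values))

samples = [
    [2, 3, 5, 7],
    [2, 3, 5, 7.0],
    [2, 3, 5.0, 7.0],
    [2, 3.0, 5.0, 7.0],
    [2.0, 3.0, 5.0, 7.0],
]
for sample in samples:
    # Every line should print a value close to 210 ** 0.25.
    print(sample, log_geometric_mean(sample))
```

All five inputs go through the same float arithmetic here, so the results agree with each other and with 210 ** 0.25 to within rounding error.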

The culprit is this line in statistics._product:

mant, scale = 1, 0  #math.frexp(prod)  # FIXME

... and indeed, we should be starting from prod rather than 1 here. But simply using math.frexp has potential for failure if the accumulated integer product overflows a float.
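The overflow concern can be illustrated with a short sketch. The function frexp_int below is a hypothetical helper, not the _frexp_gen from the attached patches; it assumes that splitting a large integer with int.bit_length and a right shift is an acceptable way to get a frexp-style (mantissa, exponent) pair without first converting the whole integer to a float.

```python
import math

def frexp_int(n):
    """Hypothetical frexp-like split for ints of any size.

    Returns (mant, exp) with n approximately equal to mant * 2**exp and
    0.5 <= abs(mant) < 1 (or (0.0, 0) for n == 0). Bits below the top 64
    are truncated, so the result is an approximation for very large n.
    """
    if n == 0:
        return 0.0, 0
    sign = -1.0 if n < 0 else 1.0
    magnitude = abs(n)
    # Keep at most the top 64 bits so the float conversion cannot overflow.
    shift = max(magnitude.bit_length() - 64, 0)
    mant, exp = math.frexp(magnitude >> shift)
    return sign * mant, exp + shift

big = 10 ** 400  # far larger than the largest finite float
try:
    math.frexp(big)
except OverflowError as exc:
    print("math.frexp failed:", exc)

mant, exp = frexp_int(big)
print(mant, exp)
```

For small integers the helper agrees with math.frexp, while for integers beyond the float range it still returns a usable mantissa and exponent.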

msg277809 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2016-10-01 12:20

Here's a fix. I was planning to add tests, but as far as I can tell geometric_mean currently has no tests at all. Steve, is that correct? That seems like something that should be fixed before the 3.6 release.

msg277814 - (view)

Author: Steven D'Aprano (steven.daprano) * (Python committer)

Date: 2016-10-01 13:22

Looks good to me.

Thanks for catching this: I knew it was a bug, but then I ran into the issue that I could no longer build 3.6 before I could fix it, and between that and various issues in the real world I never got back to this.

msg277815 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2016-10-01 13:23

New patch, with a (very slightly) cleaner implementation of _frexp_gen.

msg399948 - (view)

Author: Irit Katriel (iritkatriel) * (Python committer)

Date: 2021-08-20 09:51

I can't reproduce this now:

>>> statistics.geometric_mean([2, 3, 5, 7])
3.80675409583932
>>> statistics.geometric_mean([2, 3, 5, 7.0])
3.80675409583932
>>> statistics.geometric_mean([2, 3, 5.0, 7.0])
3.80675409583932
>>> statistics.geometric_mean([2, 3.0, 5.0, 7.0])
3.80675409583932
>>> statistics.geometric_mean([2.0, 3.0, 5.0, 7.0])
3.80675409583932

The current geometric_mean was added in PR12638. Is this issue about a previous version?

msg399951 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2021-08-20 10:02

> Is this issue about a previous version?

Yep. Sorry for failing to close this earlier.

History

Date | User | Action | Args
2022-04-11 14:58:37 | admin | set | github: 72514
2021-08-20 10:02:48 | mark.dickinson | set | status: open -> closed; resolution: out of date; messages: +; stage: resolved
2021-08-20 09:51:12 | iritkatriel | set | nosy: + iritkatriel; messages: +
2016-10-04 16:33:07 | steven.daprano | set | versions: - Python 3.6
2016-10-01 13:23:01 | mark.dickinson | set | files: + geometric_mean_int_float_v2.patch; messages: +
2016-10-01 13:22:25 | steven.daprano | set | messages: +
2016-10-01 12:20:11 | mark.dickinson | set | files: + geometric_mean_int_float.patch; keywords: + patch; messages: +
2016-10-01 12:18:35 | mark.dickinson | create |