Issue 5377: Strange behavior when performing int on a Decimal made from -sys.maxint-1

Created on 2009-02-26 20:24 by debedb, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name          Uploaded                      Description
negmaxintbug.py    debedb, 2009-02-26 20:24      Test Case
force_int-4.patch  vstinner, 2009-06-08 23:42
Messages (21)
msg82773 - (view) Author: Gregory Golberg (debedb) Date: 2009-02-26 20:24
On some Python builds (2.5.2 and 2.6.1) the following program:

    import sys
    from decimal import Decimal

    def show(n):
        print type(n)
        d = Decimal(str(n))
        i = int(d)
        t = type(i)
        print t
        i2 = int(i)
        t2 = type(i2)
        print t2

    n = - sys.maxint - 1
    show(n)

prints

    <type 'int'>
    <type 'long'>
    <type 'int'>

While on 2.4 and 2.5.1 it prints:

    <type 'int'>
    <type 'int'>
    <type 'int'>

This seems to happen only with -sys.maxint-1 number!

This has been tested with the following builds:

*** "Strange" result (with long): ***

    2.6.1 (r261:67515, Feb 26 2009, 12:21:28) [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)]
    2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
    2.5.2 (r252:60911, Jul 31 2008, 17:28:52) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)]
    2.5.2 and 2.6.1 on Windows Server 2003

*** "Expected" result (all int): ***

    2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
    2.5.1 (r251:54863, Oct 15 2007, 13:50:22) [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)]
    2.5.1 (r251:54863, Jul 31 2008, 23:17:40) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)]
    2.4.5 (#2, Aug 1 2008, 02:20:59) [GCC 4.3.1]
    2.4.5 (#1, Jul 22 2008, 08:30:02) [GCC 3.4.3 (csl-sol210-3_4-20050802)]
    2.4.3 (#1, Sep 21 2007, 20:05:43) [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)]
msg82787 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-26 23:32
For a Decimal object (d), int(d) calls d.__int__(). In your example, d has the attributes:

 * _sign=1 (negative)
 * _exp=0 (10^0=1)
 * _int='2147483648'

d.__int__() uses s*int(self._int)*10**self._exp <=> -(int('2147483648')). Since int('2147483648') creates a long, you finally get a long instead of an integer.

Workaround to get a small integer even with -2147483648: int(int(d)) ;-)

For me, it's not a bug because __int__() can return a long! The following code works in Python 2.5 and 2.6:

    class A:
        def __int__(self):
            return 10**20
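The sign/coefficient/exponent formula described in this message can be sketched in Python 3 using the public Decimal.as_tuple() method rather than the private _sign, _int and _exp attributes. This is only an illustration, not the decimal module's actual code: the helper name decimal_to_int is hypothetical, and the sketch assumes a non-negative exponent (an integral value).

```python
from decimal import Decimal


def decimal_to_int(d):
    """Rebuild an integer from a Decimal's sign, digits and exponent.

    Hypothetical helper for illustration; assumes exponent >= 0.
    """
    sign, digits, exponent = d.as_tuple()
    # The coefficient is stored without a sign, so it is always >= 0.
    coefficient = int("".join(str(digit) for digit in digits))
    magnitude = coefficient * 10 ** exponent
    # The sign is applied last, after the (positive) magnitude exists.
    return -magnitude if sign else magnitude


d = Decimal("-2147483648")
print(decimal_to_int(d) == int(d))  # True
```

The point to notice is the order of operations: the positive magnitude 2147483648 is built first and negated afterwards, which is why a 32-bit Python 2 build ended up with a long.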
msg82788 - (view) Author: Gregory Golberg (debedb) Date: 2009-02-26 23:38
Well, yes, the workaround works, but the question is why would the second int() return an int, if it's indeed a long? And why the difference in this behavior between 2.5.1 and 2.5.2?
msg82800 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-27 01:09
> the question is why would the second int() return an int,
> if it's indeed a long?

Python doesn't convert long to int even if the long can fit in an int. Example:

    >>> type(1)
    <type 'int'>
    >>> type(1L)
    <type 'long'>
    >>> type(1L+1)
    <type 'long'>
    >>> type(2)
    <type 'int'>

Even if 1L and 2L can fit in an int, Python keeps the long type.

> why the difference in this behavior between 2.5.1 and 2.5.2

No idea. You can simplify your test script with:

    # example with python 2.5.1 (32 bits CPU)
    >>> type(-int('2147483648'))
    <type 'long'>
    >>> sys.maxint

On a 64 bits CPU, sys.maxint is much bigger, so you don't have the problem with -2147483648 but with -9223372036854775808:

    # example with python 2.5.2 (*64 bits CPU*)
    >>> sys.maxint + 1
    9223372036854775808L
    >>> -int('9223372036854775808')
    -9223372036854775808L
    >>> int(-int('9223372036854775808'))
    -9223372036854775808
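Why only -sys.maxint-1 is affected comes down to the asymmetry of the signed range: the value itself fits, but its magnitude does not. A minimal Python 3 sketch of that asymmetry, assuming a 32-bit build where sys.maxint was 2**31 - 1 (the names INT32_MAX and fits_int32 are made up for this example):

```python
import struct

INT32_MAX = 2**31 - 1        # assumed value of sys.maxint on a 32-bit build
INT32_MIN = -INT32_MAX - 1   # the number from the issue title


def fits_int32(value):
    """Return True if value can be packed as a signed 32-bit integer."""
    try:
        struct.pack("<i", value)
    except struct.error:
        return False
    return True


magnitude = 2147483648           # what int('2147483648') has to hold
print(fits_int32(magnitude))     # False
print(fits_int32(-magnitude))    # True
```

Any other negative value in range has a magnitude that is also in range, so the intermediate result never needs to become a long.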
msg82803 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-27 01:13
Anyway, the behaviour is correct. But ok, it's "strange" because unexpected. You have to understand that the long=>int conversion is manual :-/ Decimal.__int__ could force "return int(result)" at the end to avoid the problem with -sys.maxint-1, but is it really important? I don't think so. Python3 doesn't have this problem ;-)
msg82829 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-02-27 11:38
Why do you care whether the result is an int or a long in this case? Does it affect any code that you know of in a meaningful way?

> And why the difference in this behavior between 2.5.1 and 2.5.2.

There were some fairly major changes (many bugfixes, new functions to comply with an updated specification, for example, pow, log and log10) to the decimal module between 2.5 and 2.6, and the majority of those changes were also backported to 2.5.2. This particular change was part of a set of changes that changed the internal representation of the coefficient of a Decimal instance from a tuple to a string, for speed reasons. See r59144.

As Victor says, this is trivial to fix; I'm not convinced that it's actually worth fixing, though. In Python 2.5, the difference between ints and longs should be almost invisible anyway. It's nice (for performance reasons) if small integers are represented as ints rather than longs. Since this one's only just a small integer, it's difficult to care much. :-)
msg82830 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-02-27 11:50
For anyone who does care about this, it should be noted that the Fraction type has similar issues. The following comes from Python 2.7 on a 64-bit machine:

    >>> int(Fraction(2**63-1))
    9223372036854775807L
    >>> int(2**63-1)
    9223372036854775807
msg82869 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2009-02-27 20:47
Unless there is a discrepancy between doc and behavior, this strikes me as an unspecified implementation detail. If so, it should be either closed or changed to a specific feature request.
msg82911 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-28 14:42
@tjreedy: Do you expect conversion to small int if __int__() result fits in a small int?

----
class A:
    def __int__(self):
        return 1L

x=int(A())
print repr(x), type(x)
----

Result with Python 2.5.1: 1L <type 'long'>
msg82913 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-02-28 14:55
The behaviour doesn't contradict the documentation, as far as I can tell, so I agree with Terry that this is not a bug.

If we want the result from the built-in int function to have type int whenever possible (that is, whenever the result is in the closed interval [-sys.maxint-1, sys.maxint]), it doesn't seem right that the burden for ensuring this should lie with individual __int__ methods: instead, the general machinery for implementing the built-in int function should check any result of type long to see if it fits in an int, and convert if so.

Is this desirable?
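The idea of doing the check once, in the machinery behind int(), rather than in every __int__ method can be modelled in Python 3. This is a rough sketch and not the CPython patch: Python 3 has no long type, so the Long class below is a made-up stand-in, and MAXINT, builtin_int, FakeDecimal and Huge are all hypothetical names.

```python
MAXINT = 2**31 - 1  # assumed 32-bit sys.maxint


class Long(int):
    """Stand-in for Python 2's long type (illustration only)."""


def builtin_int(obj):
    """Model of int(): call __int__, then narrow the result centrally."""
    result = obj.__int__()
    if isinstance(result, Long) and -MAXINT - 1 <= result <= MAXINT:
        return int(result)  # fits in the closed interval: return a plain int
    return result           # genuinely large: keep it as a Long


class FakeDecimal:
    def __int__(self):
        return Long(-2147483648)  # fits, but arrives as a "long"


class Huge:
    def __int__(self):
        return Long(10**20)       # does not fit


print(type(builtin_int(FakeDecimal())).__name__)  # int
print(type(builtin_int(Huge())).__name__)         # Long
```

With the narrowing step in one place, neither FakeDecimal nor any other class has to remember to call int() on its own result.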
msg84231 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-26 23:36
> The general machinery for implementing the built-in int function
> should check any result of type long to see if it fits in an int,
> and convert if so.

The attached patch tries to convert long to int, and so it fixes the initial problem: assert isinstance(int(Decimal(-sys.maxint-1)), int).

I used benchmark tools dedicated to testing integers:

Unpatched:
  pidigit.py: 4612.0 ms
  bench_int.py: 2743.5 ms
Patched:
  pidigit.py: 4623.8 ms (0.26% slower)
  bench_int.py: 2754.5 ms (0.40% slower)

So for intensive integer operations, the overhead is low. Using a more generic benchmark tool (pybench?), you might not be able to see the difference ;-)

I'm +0 for this patch because it fixes a very rare case: 1 case out of (sys.maxint + 1) × 2, i.e. about 0.00000002% with sys.maxint = 2^31 - 1.
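The "very rare case" figure can be checked with a few lines of Python 3. The sketch assumes a 32-bit build, where sys.maxint was 2**31 - 1; the variable names are chosen for this example only.

```python
maxint = 2**31 - 1                 # assumed 32-bit sys.maxint
total_values = (maxint + 1) * 2    # every value in [-maxint-1, maxint], i.e. 2**32
share_percent = 100.0 / total_values

print(f"{share_percent:.10f}%")    # 0.0000000233%
```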
msg84233 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-26 23:39
I added the two benchmark tools to my own public SVN:

 http://haypo.hachoir.org/trac/browser/misc/bench_int.py
   (improved version of the script attached to issue #4294)
 http://haypo.hachoir.org/trac/browser/misc/pidigits.py
   (improved version of the script attached to issue #5512)

If you know a better place for these benchmarks, feel free to reupload them somewhere else.
msg84297 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-03-28 04:22
Thanks for the patch, Victor. I think this is the right thing to do, though I'm still not sure why anyone would care about getting longs instead of ints back from int(x).

Comments and questions:

(0) Please could you add some tests!

(1) Shouldn't the first line you added include a check for res == NULL?

(2) It looks as though the patched code ends up calling PyLong_Check twice when __int__ returns a long. Can you find a clear rewrite that avoids this duplication?

By the way, I realized after posting my last comment that the issue with Fraction has nothing to do with extreme int values. For example, with the current trunk (not including Victor's patch):

    >>> int(Fraction(2L))
    2L
    >>> int(int(Fraction(2L)))
    2

I don't think this should be considered a bug in Fraction---I think Victor's solution of making the int() machinery always return int when possible is the right one here. The need to call int(int(x)) if you *really* want an int seems a little ugly.
msg84376 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-29 11:38
> I'm still not sure why anyone would care about getting longs
> instead of ints back from int(x)

It's strange that sometimes we need to write int(int(obj)) to get an integer :-/ I usually use int(x) to convert x to an integer (type 'int' and not 'long').

> (0) Please could you add some tests!

done

> (1) Shouldn't the first line you added include a check
> for res == NULL?

segfault... ooops :-) fixed

> (2) It looks as though the patched code ends up calling
> PyLong_Check twice (...)

done

See updated patch.
msg84377 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-29 11:42
(oops, my patch v2 includes an unrelated change)
msg84424 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-03-29 18:48
Thanks, Victor. A couple of things:

- I'm getting a test failure in test_class

- you should probably be using sys.maxint rather than sys.maxsize: the two aren't necessarily the same. (E.g., on 64-bit windows, I believe that sys.maxint is 2**31-1 while sys.maxsize is 2**63-1).

- This still doesn't fix the case of int(Fraction(2L)).
msg89125 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-06-08 23:42
> Thanks, Victor

You're welcome :-)

> - I'm getting a test failure in test_class

fixed

> - you should probably be using sys.maxint rather than sys.maxsize

done

> This still doesn't fix the case of int(Fraction(2L))

fixed: Fraction uses __trunc__ rather than __int__.

See updated patch: force_int-4.patch
msg93866 - (view) Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) * Date: 2009-10-11 18:04
PyPy is a bit of a special case, because it cares about the distinction of int and long in the translation toolchain. Nevertheless, this behavior has been annoying to us.
msg93867 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-11 18:07
Carl, thanks for that. I was just thinking about abandoning this issue as not worth fixing. I need to look at Victor's patch again, but I recall that there were still some issues: e.g., if the __int__ method of some class returns a bool, that still ends up getting returned as a bool rather than an int. Getting everything exactly right seemed fiddly enough to make it not worth the effort. Would the bool/int distinction matter to PyPy?
msg93869 - (view) Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) * Date: 2009-10-11 18:13
[...]
> Would the bool/int distinction matter to PyPy?

No, it's really mostly about longs and ints, because RPython does not have automatic overflowing of ints to longs (the goal is really to translate ints to C longs with normal C overflow behaviour). I would understand if you decide for wontfix, because you are not supposed to care about int/long and as I said, PyPy is a special case.

Thanks,
Carl Friedrich
msg103026 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-13 09:19
Closing: it's too late for Python 2.x.
History
Date User Action Args
2022-04-11 14:56:46 admin set github: 49627
2010-04-13 09:19:52 mark.dickinson set status: open -> closed; resolution: out of date; messages: +
2010-01-10 13:27:30 mark.dickinson set priority: low -> normal
2009-10-11 18:13:09 Carl.Friedrich.Bolz set messages: +; title: Strange behavior when performing int on a Decimal made from -sys.maxint-1 -> Strange behavior when performing int on a Decimal made from -sys.maxint-1
2009-10-11 18:08:00 mark.dickinson set messages: +
2009-10-11 18:04:28 Carl.Friedrich.Bolz set nosy: + Carl.Friedrich.Bolz; messages: +
2009-06-08 23:42:23 vstinner set files: - force_int-3.patch
2009-06-08 23:42:16 vstinner set files: + force_int-4.patch; messages: +
2009-03-29 18:48:58 mark.dickinson set messages: +
2009-03-29 11:42:19 vstinner set files: - force_int.patch
2009-03-29 11:42:14 vstinner set files: + force_int-3.patch; messages: +
2009-03-29 11:39:17 vstinner set files: - force_int-2.patch
2009-03-29 11:38:42 vstinner set files: + force_int-2.patch; messages: +
2009-03-28 12:22:59 mark.dickinson set components: + Interpreter Core, - Library (Lib)
2009-03-28 12:22:48 mark.dickinson set priority: low; assignee: mark.dickinson
2009-03-28 04:22:05 mark.dickinson set messages: +; stage: test needed
2009-03-26 23:39:11 vstinner set messages: +
2009-03-26 23:36:31 vstinner set files: + force_int.patch; keywords: + patch; messages: +
2009-02-28 14:56:34 mark.dickinson set type: behavior -> enhancement
2009-02-28 14:56:00 mark.dickinson set messages: +
2009-02-28 14:42:01 vstinner set messages: +
2009-02-27 20:47:27 terry.reedy set nosy: + terry.reedy; messages: +
2009-02-27 11:50:57 mark.dickinson set messages: +
2009-02-27 11:38:27 mark.dickinson set nosy: + rhettinger, mark.dickinson; messages: +; components: + Library (Lib), - Interpreter Core; versions: + Python 2.7, - Python 2.5
2009-02-27 07:15:51 theller set assignee: theller -> (no value)
2009-02-27 07:15:34 theller set nosy: - theller; components: - ctypes
2009-02-27 01:13:39 vstinner set messages: +
2009-02-27 01:13:22 vstinner set messages: -
2009-02-27 01:12:47 vstinner set messages: +
2009-02-27 01:09:36 vstinner set messages: +
2009-02-26 23:38:39 debedb set messages: +
2009-02-26 23:32:55 vstinner set nosy: + vstinner; messages: +
2009-02-26 20:24:27 debedb create