msg157726 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-07 10:55 |
I'm not sure __sizeof__ is implemented correctly:

>>> from decimal import Decimal
>>> import sys
>>> d = Decimal(123456789123456798123456789123456798123456789123456798)
>>> d
Decimal('123456789123456798123456789123456798123456789123456798')
>>> sys.getsizeof(d)
24

... looks too small. |
|
|
msg157729 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2012-04-07 11:29 |
It isn't implemented at all. The Python version also always returns 96, irrespective of the coefficient length. Well, arguably the coefficient is a separate object in the Python version:

>>> sys.getsizeof(d)
96
>>> sys.getsizeof(d._int)
212

For the C version I'll do the same as in longobject.c. |
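For comparison, a minimal sketch of the longobject.c behaviour referred to above: on CPython, the size reported for an int grows with its magnitude. The variable names are illustrative only, and the exact byte counts depend on platform and interpreter version.

```python
import sys

small = 1
large = 10**100

# On CPython the reported size of an int grows with the number of
# internal digits, so the second value is larger than the first.
print(sys.getsizeof(small), sys.getsizeof(large))
```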
|
|
msg157730 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2012-04-07 11:32 |
In full:

>>> d = Decimal(100000000000000000000000000000000000000000000000000000000000000000000)
>>> sys.getsizeof(d)
96
>>> sys.getsizeof(d._int)
212 |
|
|
msg157798 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-04-08 17:30 |
There are really two options:
a) if an object is a container, and the contained object is accessible to reflection (preferably through gc.get_referents), then the container shouldn't account for the size of the contained object.
b) if the contained object is not accessible (except for sys.getobjects() in a debug build), then the container should provide the total sum.

A memory debugger is supposed to find all objects (e.g. through gc.get_objects and gc.get_referents), eliminate duplicate references, and then apply sys.getsizeof to each object. This should then not leave out any memory, and not count any memory twice. |
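The memory-debugger procedure described above can be sketched roughly as follows. This is a minimal illustration, not an existing tool: the function name reachable_size and the use of explicit roots are assumptions made for the example (a real debugger would more likely start from gc.get_objects()).

```python
import gc
import sys


def reachable_size(roots):
    """Sum sys.getsizeof over every object reachable from roots, once each."""
    seen = {}  # id -> object; holding the object keeps its id unique
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue  # duplicate reference: already counted
        seen[id(obj)] = obj
        # Follow the references the object exposes through tp_traverse.
        pending.extend(gc.get_referents(obj))
    return sum(sys.getsizeof(obj) for obj in seen.values())


shared = "x" * 1000
a = [shared]
b = [shared]

# The shared string is referenced twice but counted only once.
print(reachable_size([a, b]))
```

This only gives a complete total if each container follows option a) or b) consistently: either it exposes what it contains, or it reports the contained memory itself.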
|
|
msg157857 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-04-09 16:14 |
In the C version of decimal, do distinct Decimal objects ever share coefficients? (This would be an obvious optimization for methods like Decimal.copy_negate; I don't know whether the C version applies such optimizations.) If there's potential for shared coefficients, that might make the "not count any memory twice" part tricky. |
|
|
msg157860 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-04-09 16:30 |
> In the C version of decimal, do distinct Decimal objects ever share
> coefficients? (This would be an obvious optimization for methods
> like Decimal.copy_negate; I don't know whether the C version
> applies such optimizations.) If there's potential for shared
> coefficients, that might make the "not count any memory twice" part
> tricky.

I know of three strategies to deal with such a case:
a) expose the inner objects, preferably through tp_traverse, and don't account for them in the container,
b) find a "canonical" owner of the contained objects, and only account for them along with the canonical container,
c) compute the number N of shared owners, and divide the object size by N. Due to rounding, this may be somewhat incorrect. |
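Strategy c) can be illustrated with a toy model. The classes SharedBuffer and Owner below are hypothetical and exist only for this sketch; they are not part of decimal or libmpdec. Each owner reports its own size plus an integer share of the shared buffer, so the sum over all owners can fall short of the true total by a few bytes.

```python
import sys


class SharedBuffer:
    """Hypothetical shared storage with an explicit owner count."""

    def __init__(self, nbytes):
        self.data = bytearray(nbytes)
        self.owners = 0


class Owner:
    """Hypothetical container that charges itself 1/N of the shared buffer."""

    def __init__(self, buf):
        self._buf = buf
        buf.owners += 1

    def __sizeof__(self):
        own = object.__sizeof__(self)
        # Integer division: the remainder is lost, hence the rounding error.
        return own + sys.getsizeof(self._buf.data) // self._buf.owners


buf = SharedBuffer(1000)
owners = [Owner(buf) for _ in range(3)]

reported = sum(o.__sizeof__() for o in owners)
exact = sum(object.__sizeof__(o) for o in owners) + sys.getsizeof(buf.data)
print(exact - reported)  # bytes lost to rounding, always fewer than N
```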
|
|
msg157861 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2012-04-09 16:54 |
Mark Dickinson <report@bugs.python.org> wrote:
> In the C version of decimal, do distinct Decimal objects ever share coefficients?

The coefficients are members of the mpd_t struct (libmpdec data type), and they are not exposed as Python objects or shared.

Cache locality is incredibly important: I have a patch that reserves a static coefficient of four words inside the PyDecObject. This patch speeds up _decimal by roughly another 30-40% for regularly sized decimals. If the decimal grows beyond that, libmpdec automatically switches to a dynamically allocated coefficient.

I think sharing would probably slow things down a bit. |
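A toy model of how a static four-word coefficient might affect size accounting, assuming the inline buffer is already included in the object's base size and that only a dynamically allocated coefficient adds to it. The function names, base_size, and the 8-byte word size are assumptions for illustration; the real libmpdec allocation may be larger than the number of words in use.

```python
STATIC_WORDS = 4  # inline coefficient capacity mentioned in the message above


def coefficient_storage(n_words, static_words=STATIC_WORDS):
    """Return where a coefficient of n_words words would live (toy model)."""
    return "static" if n_words <= static_words else "dynamic"


def modelled_size(base_size, n_words, word_size=8):
    """Model the total size: base object plus any dynamic coefficient."""
    if coefficient_storage(n_words) == "static":
        return base_size  # inline buffer is already part of base_size
    return base_size + n_words * word_size


print(modelled_size(100, 4))  # fits in the inline buffer
print(modelled_size(100, 5))  # spills to a separately allocated buffer
```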
|
|
msg157864 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-04-09 17:14 |
> and they are not exposed as Python objects or shared.

Okay, thanks. Sounds like this isn't an issue at the moment then. +1 for having getsizeof report the total size used. |
|
|
msg157886 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-04-09 19:33 |
New changeset 010aa5d955ac by Stefan Krah in branch 'default': Issue #14520: Add __sizeof__() method to the Decimal object. http://hg.python.org/cpython/rev/010aa5d955ac |
|
|
msg157889 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2012-04-09 19:51 |
Thanks for the explanations. The new __sizeof__() method should now report the exact memory usage. |
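A quick way to check the behaviour is sketched below. On a CPython build that uses the C implementation, the second value is expected to be larger than the first; the exact numbers depend on platform and version, and a pure-Python decimal may report the same size for both.

```python
import sys
from decimal import Decimal

small = Decimal(1)
large = Decimal(10**200)

# With the C implementation, a long coefficient is allocated separately
# and should be included in the size reported for the Decimal.
print(sys.getsizeof(small), sys.getsizeof(large))
```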
|
|