[Python-Dev] Issue 10194 - Adding a gc.remap() function
Peter Ingebretson pingebre at yahoo.com
Tue Oct 26 19:11:00 CEST 2010
- Previous message: [Python-Dev] Issue 10194 - Adding a gc.remap() function
- Next message: [Python-Dev] Issue 10194 - Adding a gc.remap() function
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
--- On Tue, 10/26/10, Hrvoje Niksic <hrvoje.niksic at avl.com> wrote:
What about objects that don't implement tp_traverse because they cannot take part in cycles?
A significant majority of objects that can hold references to other objects can take part in cycles and do implement tp_traverse. My original thought was that modifying any references not visible to the cyclic GC would be out of the scope of gc.remap.
Even adding a 'tp_extended_traverse' method might not help solve this problem because untracked objects are not in any generation list, so there is no general way to find all of them.
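To make the tracked/untracked distinction concrete, here is a small sketch using the existing gc.is_tracked() and gc.get_objects() functions. The sample values and variable names are only illustrative; the point is that atomic objects such as ints, strs and floats are not tracked, so they never show up when walking the collector's lists.

```python
import gc

# Containers are tracked by the cyclic GC; atomic objects are not.
samples = {"list": [], "int": 42, "str": "text", "float": 1.5}
tracked = {name: gc.is_tracked(obj) for name, obj in samples.items()}

# gc.get_objects() only returns tracked objects, so an untracked float
# cannot be found by walking it, even though it is alive and referenced.
marker = 1.5e300
found_marker = any(obj is marker for obj in gc.get_objects())
```

Any reference held only by an untracked object is therefore invisible to a remap that starts from the generation lists.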
Changing immutable objects such as tuples and frozensets doesn't exactly sound appealing.
My original Python-only approach cloned immutable objects that referenced objects that were to be remapped, and then added the old and new immutable object to the mapping. This worked well, although it was somewhat complicated because it had to happen in dependency order (e.g., to handle tuples of tuples in frozensets).
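A minimal sketch of that clone-in-dependency-order idea follows. It is not the code from the issue: the function name remap_immutable and the id()-keyed mapping are assumptions made for illustration, and it only handles plain tuples and frozensets.

```python
def remap_immutable(obj, mapping):
    """Return the replacement for obj, cloning tuples/frozensets as needed.

    mapping maps id(old_object) -> new_object (hypothetical layout) and is
    extended in place with every immutable container that had to be cloned.
    The caller must keep the old objects alive so their ids stay valid.
    """
    if id(obj) in mapping:
        return mapping[id(obj)]
    if isinstance(obj, (tuple, frozenset)):
        old_items = list(obj)
        # Children first: inner containers are cloned before outer ones.
        new_items = [remap_immutable(item, mapping) for item in old_items]
        if any(new is not old for new, old in zip(new_items, old_items)):
            clone = tuple(new_items) if isinstance(obj, tuple) else frozenset(new_items)
            mapping[id(obj)] = clone  # record old -> new for the container too
            return clone
    return obj


class Old:
    pass


class New:
    pass


old, new = Old(), New()
inner = (old, 1)
outer = frozenset([inner, "x"])
untouched = (1, 2)

mapping = {id(old): new}
result = remap_immutable(outer, mapping)
same = remap_immutable(untouched, mapping)
```

The recursion gives the dependency order for free: the tuple inside the frozenset is cloned and added to the mapping before the frozenset itself is rebuilt.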
I thought about keeping this, but I am now convinced that as long as you are doing something as drastic as changing references in the heap you may as well change immutable objects.
The main argument is that preserving immutable objects increases the
complexity of remapping and does not actually solve many problems.
The primary reason for objects to be immutable is so that their
comparison operators and hash value can remain consistent. Changing,
for example, the contents of a tuple that a dictionary key references
has the same effect as changing the identity of the tuple -- both
modify the hash value of the key and thus invalidate the dictionary.
The full reload process needs to rehash collections invalidated by
hash values changing, so we might as well modify the contents of tuples.
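The invalidation can be reproduced from pure Python with a key whose hash depends on a mutable field. The Key class below is a hypothetical stand-in for a tuple whose contents were remapped in place; the rebuild step is only a sketch of the rehashing a full reload would need.

```python
class Key:
    """Hashable wrapper whose hash depends on a field we later overwrite."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value


key = Key(1)
table = {key: "payload"}

# Stand-in for remapping a reference inside a tuple used as a dict key.
key.value = 2

# The entry is still stored under the old hash, so lookup misses it.
found_before = key in table

# Re-inserting every item recomputes each key's hash.
rebuilt = {k: v for k, v in table.items()}
found_after = key in rebuilt
```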
> the signature of visitproc has been modified to take (PyObject **)
> instead of (PyObject *) so that a visitor can modify fields
> visited with Py_VISIT.
This sounds like a bad idea -- visitproc is not limited to visiting struct members. Visited objects can be stored in data structures where their address cannot be directly obtained. If you want to go this route, rather create an extended visit procedure (visitchangeproc?) that accepts a function that can change the reference. A convenience function or macro could implement this for the common case of struct member or PyObject**.
This is a compelling argument. I considered adding an extended traverse / visit path, but decided against it after not finding any cases in the base distribution that required it. The disadvantage of creating an additional method is that C types will have yet another method to implement for the gc (tp_traverse, tp_clear, and now tp_traverse_modify(?)). On the other hand, you've convinced me that this is necessary in some cases, so it might as well be used in all of them. Jon Parise also pointed out in a private communication that this eliminates the minor performance impact on tp_traverse, which is another advantage over my change.
If a 'tp_traverse_modify' function were added, many types could replace their custom tp_clear function with a generic method built on visitchangeproc, which somewhat offsets the cost of adding another method.
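As a rough Python model of the idea (the real slots would be C, and neither tp_traverse_modify nor visitchangeproc exists in CPython), a modifying traverse hands each reference to a visitor and stores whatever the visitor returns. All names below are hypothetical.

```python
class Node:
    """Toy container standing in for a C type with two PyObject* fields."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def traverse_modify(self, visit):
        # Hypothetical analogue of tp_traverse_modify: the visitor returns
        # the value each reference should hold afterwards.
        self.left = visit(self.left)
        self.right = visit(self.right)


def remap(container, mapping):
    """Replace references found in mapping (id(old) -> new); keep the rest."""
    container.traverse_modify(lambda ref: mapping.get(id(ref), ref))


def generic_clear(container):
    """Generic tp_clear analogue: drop every reference the type exposes."""
    container.traverse_modify(lambda ref: None)


a, b, c = object(), object(), object()
node = Node(a, b)

remap(node, {id(a): c})
after_remap = (node.left, node.right)

generic_clear(node)
after_clear = (node.left, node.right)
```

Because the container itself performs the store, the same interface also works for types that keep references somewhere other than a plain struct member.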