> On Tue, Oct 11, 2016 at 8:14 AM, Larry Hastings wrote:
>> These hacks where we play games with the reference count are mostly removed in my branch.
>
> That's exactly what I would have said, because I was assuming that refcounts would be accurate. I'm not sure what you mean by "play games with",

By "playing games with reference counts", I mean code that purposely doesn't follow the rules of reference counting. Sadly, there are special cases that apparently \*are\* special enough to break the rules. Which made implementing "buffered reference counting" that much harder.
I currently know of two examples of this in CPython. In both instances, an object has a reference to another object, but \*deliberately\* does not increase the reference count of the object, in order to prevent keeping the other object alive. The implementation relies on the GIL to preserve correctness; without a GIL, it was much harder to ensure this code was correct. (And I'm still not 100% sure I've done it. More thinking needed.)
Those two examples are:
- PyWeakReference objects. The wr\_object pointer--the
"reference" held by the weak reference object--points to an
object, but does not increment the reference count. Worse yet,
as already observed, PyWeakref\_GetObject() and
PyWeakref\_GET\_OBJECT() don't increment the reference count, an
inconvenient API decision from my perspective.
- "Interned mortal" strings. When a string is both interned
\*and\* mortal, it's stored in the static "interned" dict in
unicodeobject.c--as both key and value--and then it's DECREF'd
twice so those two references don't count. When the string is
destroyed, unicode\_dealloc resurrects the string, reinstating
those two references, then removes it from the "interned" dict,
then destroys the string as normal.
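
The interned-mortal bookkeeping described above can be sketched as a toy model in Python. This is only an illustration, not the code in unicodeobject.c: `ToyString`, `intern_mortal`, `decref`, `dealloc` and `interning_demo` are hypothetical names, and the reference count is an ordinary integer maintained by hand.

```python
class ToyString:
    """Hypothetical stand-in for a string object with an explicit refcount."""

    def __init__(self, text):
        self.text = text
        self.refcount = 1      # the creator's reference
        self.freed = False

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        return isinstance(other, ToyString) and self.text == other.text


interned = {}   # toy version of the static "interned" dict


def intern_mortal(s):
    interned[s] = s        # stored as both key and value: two references
    s.refcount += 2
    s.refcount -= 2        # ...which are then deliberately not counted


def decref(s):
    s.refcount -= 1
    if s.refcount == 0:
        dealloc(s)


def dealloc(s):
    s.refcount = 2         # resurrect: reinstate the dict's two references
    del interned[s]        # removing the entry releases key and value
    s.refcount -= 2        # toy shortcut: adjusted by hand, so no re-entry
    s.freed = True         # then destroy as normal


def interning_demo():
    s = ToyString("spam")
    intern_mortal(s)
    counted = s.refcount   # the dict's two references are not included
    present = s in interned
    decref(s)              # the last counted reference goes away
    return counted, present, s.freed, len(interned)
```

The point of the sketch is that the dict entry never keeps the string alive: once the last counted reference is dropped, the string removes itself from the dict on its way out.
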
To support these, I've implemented what is effectively a
secondary, atomic-only reference count. It seems to work. (And
yes that means all objects are now 8 bytes bigger. Let me worry
about memory consumption later, m'kay?)
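
The weak reference case can be observed from Python itself. What follows is a minimal sketch that assumes CPython's usual behaviour of freeing an object as soon as its last strong reference goes away; `Node` and `weakref_demo` are names made up for the illustration.

```python
import sys
import weakref


class Node:
    """Plain class; its instances support weak references by default."""


def weakref_demo():
    target = Node()
    before = sys.getrefcount(target)
    ref = weakref.ref(target)        # points at target without owning it
    after = sys.getrefcount(target)
    alive = ref() is target          # calling the ref hands back the object
    del target                       # drop the only strong reference
    dead = ref() is None             # on CPython the object is gone already
    return before == after, alive, dead
```

At the Python level, calling `ref()` returns a new strong reference, so the caller cannot be left holding an object that disappears underneath it. The C-level accessors mentioned above return the pointer without incrementing the count, which is the part that is hard to make safe without a GIL.
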
Resurrecting objects also gave me a headache in the Gilectomy with this buffered reference counting scheme, but I think I have that figured out too. When you resurrect an object, it's generally because you're going to expose it to other subsystems that may incr / decr / otherwise inspect the reference count. Which means that code may buffer reference count changes. Which means you can't immediately destroy the object anymore. So: when you resurrect, you set the new reference count, you also set a flag saying "I've already been resurrected", you pass it in to that other code, you then drop your references with Py\_DECREF, and you exit. Your dealloc function will get called again later; you then see you've already done that first resurrection, and you destroy as normal. Curiously enough, the typeobject actually needs to do this twice: once for tp\_finalize, once for tp\_del. (Assuming I didn't completely misunderstand what the code was doing.)
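
The resurrect-once protocol can be sketched as a toy simulation in Python. This is not Gilectomy code: `ToyObject`, `RefcountBuffer`, `noisy_finalizer` and `resurrection_demo` are hypothetical names, and the "buffer" is just a queue of pending count changes that is applied later by `flush()`.

```python
from collections import deque


class ToyObject:
    """Hypothetical stand-in for a C object with an explicit refcount."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1
        self.resurrected = False   # the "I've already been resurrected" flag
        self.destroyed = False


class RefcountBuffer:
    """Toy buffer: refcount changes are queued and applied in flush()."""

    def __init__(self, finalizer):
        self.pending = deque()     # queued (object, delta) pairs
        self.finalizer = finalizer
        self.events = []           # trace of what happened, for inspection

    def incref(self, obj):
        self.pending.append((obj, +1))

    def decref(self, obj):
        self.pending.append((obj, -1))

    def flush(self):
        # Apply queued changes in order; dealloc may queue more work.
        while self.pending:
            obj, delta = self.pending.popleft()
            obj.refcount += delta
            if obj.refcount == 0:
                self.dealloc(obj)

    def dealloc(self, obj):
        if not obj.resurrected:
            # First pass: resurrect, set the flag, expose the object to
            # other code, drop our own reference, and exit.
            obj.refcount = 1
            obj.resurrected = True
            self.events.append("resurrect " + obj.name)
            self.finalizer(self, obj)   # may buffer its own incref/decref
            self.decref(obj)
            return
        # Second pass: the flag says the first resurrection already ran.
        obj.destroyed = True
        self.events.append("destroy " + obj.name)


def noisy_finalizer(buf, obj):
    # Stands in for other code touching the count during resurrection.
    buf.incref(obj)
    buf.decref(obj)


def resurrection_demo():
    buf = RefcountBuffer(noisy_finalizer)
    obj = ToyObject("x")
    buf.decref(obj)    # drop the last reference
    buf.flush()
    return buf.events, obj.destroyed
```

Because the finalizer's count changes are only queued, the first pass through `dealloc` cannot destroy the object on the spot. It has to return and let the buffered changes drain, and the flag is what tells the second pass that it is safe to destroy.
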
My struggles continue,
/arry