bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag by vstinner · Pull Request #25721 · python/cpython (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation31 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow
creating type instances: set tp_new to NULL and don't create the
"new" key in the type dictionary.
The flag is set automatically on static types if tp_base is NULL or
&PyBaseObject_Type and tp_new is NULL.
Use the flag on the following types:
- _curses.ncurses_version type
- _curses_panel.panel
- _tkinter.Tcl_Obj
- _tkinter.tkapp
- _tkinter.tktimertoken
- _xxsubinterpretersmodule.ChannelID
- sys.flags type
- sys.getwindowsversion() type
- sys.version_info type
Update MyStr example in the C API documentation to use
Py_TPFLAGS_DISALLOW_INSTANTIATION.
Add _PyStructSequence_InitType() function to create a structseq type
with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.
https://bugs.python.org/issue43916
This flag can be used outside CPython code base. For example, the pyrsistent project sets tp_new to NULL:
static PyObject* pyrsistent_pvectorc_moduleinit(void)
{
...
// Only allow creation/initialization through factory method pvec
PVectorType.tp_init = NULL;
PVectorType.tp_new = NULL;
if (PyType_Ready(&PVectorType) < 0) {
return NULL;
}
...
Py_INCREF(&PVectorType);
PyModule_AddObject(m, "PVector", (PyObject *)&PVectorType);
return m;
}
Nice! Do you want to update _hashopenssl.c
or should I update it later?
Nice! Do you want to update _hashopenssl.c or should I update it later?
I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.
Nice! Do you want to update _hashopenssl.c or should I update it later?
I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.
I can adapt #25653 to use this flag.
Nice! Do you want to update _hashopenssl.c or should I update it later?
I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.
Sounds good to me! I have created PR draft #25722.
About the flag name, another name could be Py_TPFLAGS_DISALLOW_NEW. Or a different name.
About the complexity of tp_new inheritance, here is an extract of tp_new_wrapper():
/* Check that the use doesn't do something silly and unsafe like
object.__new__(dict). To do this, we check that the
most derived base that's not a heap type is this type. */
staticbase = subtype;
while (staticbase && (staticbase->tp_new == slot_tp_new))
staticbase = staticbase->tp_base;
/* If staticbase is NULL now, it is a really weird type.
In the spirit of backwards compatibility (?), just shut up. */
if (staticbase && staticbase->tp_new != type->tp_new) {
PyErr_Format(PyExc_TypeError,
"%s.__new__(%s) is not safe, use %s.__new__()",
type->tp_name,
subtype->tp_name,
staticbase->tp_name);
return NULL;
}
Naming is hard. Both Py_TPFLAGS_DISABLE_NEW and Py_TPFLAGS_DISALLOW_NEW reflect how Python prevents creation instances, not what the flag actually accomplishes. How about something like Py_TPFLAGS_DISALLOW_INSTANTIATION?
I've adapted #25653 to use this flag. Regarding naming, I like @tiran's suggestion. I'll do a force push as soon as the name has landed :)
vstinner changed the title
bpo-43916: Add Py_TPFLAGS_DISABLE_NEW type flag bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag
I renamed the flag to Py_TPFLAGS_DISALLOW_INSTANTIATION. I also updated _xxsubinterpretersmodule.ChannelID to use the new flag.
"The creation of an instance is called instantiation." says Wikipedia: https://en.wikipedia.org/wiki/Instance_(computer_science)
I compared the usage of "disallow" versus "deny" in Include/ header files and in Doc/ documentation: "disallow" is more popular.
@tiran @erlend-aasland: my PR uses the flag in 9 types (and 1 documentation example). @erlend-aasland's PR disallow instanciation in many types. Do you think that these usages are common enough to justify to add a type flag?
This design doesn't prevent calling directly tp_new in C. I clarified the flag documentation:
/* Disallow creating instances of the type in Python. It remains possible to
* call directly the tp_new function in C. */
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a much more elegant solution, the only thing I am slightly wary of is to expose this publicly.
There is already a standard way to prevent creating a new instance: setting tp_new to NULL. No new type flag is needed. BTW, tp_new == NULL
is one of checks for pickleability of the type.
There is already a standard way to prevent creating a new instance: setting tp_new to NULL. No new type flag is needed. BTW, tp_new == NULL is one of checks for pickleability of the type.
If tp_new is set NULL after the type is created, creating an instance in Python fails with a weird error message. Example in Python 3.9:
$ python3.9
Python 3.9.4 (default, Apr 6 2021, 00:00:00)
>>> import _curses_panel
# Good
>>> _curses_panel.panel()
TypeError: cannot create '_curses_panel.panel' instances
# Weird error
>>> _curses_panel.panel.__new__(_curses_panel.panel)
TypeError: object.__new__(_curses_panel.panel) is not safe, use _curses_panel.panel.__new__()
>>> _curses_panel.panel.__new__ is object.__new__
True
The problem is that the the type __dict__['__new__']
exists (wrapper to object.new). I would prefer to request in the type spec that instantiation is disallowed. Copy of my previous comment, extract of tp_new_wrapper():
/* Check that the use doesn't do something silly and unsafe like
object.__new__(dict). To do this, we check that the
most derived base that's not a heap type is this type. */
staticbase = subtype;
while (staticbase && (staticbase->tp_new == slot_tp_new))
staticbase = staticbase->tp_base;
/* If staticbase is NULL now, it is a really weird type.
In the spirit of backwards compatibility (?), just shut up. */
if (staticbase && staticbase->tp_new != type->tp_new) {
PyErr_Format(PyExc_TypeError,
"%s.__new__(%s) is not safe, use %s.__new__()",
type->tp_name,
subtype->tp_name,
staticbase->tp_name);
return NULL;
}
Extract of inherit_special():
/* The condition below could use some explanation.
It appears that tp_new is not inherited for static types
whose base class is 'object'; this seems to be a precaution
so that old extension types don't suddenly become
callable (object.__new__ wouldn't insure the invariants
that the extension type's own factory function ensures).
Heap types, of course, are under our control, so they do
inherit tp_new; static extension types that specify some
other built-in type as the default also
inherit object.__new__. */
if (base != &PyBaseObject_Type ||
(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
if (type->tp_new == NULL)
type->tp_new = base->tp_new;
}
PyType_FromSpec() creates a type and then calls PyType_Ready() on it. PyType_Ready() calls inherit_special() which inherits tp_new (since it's a heap type), and then creates tp_dict["__new__"]
since tp_new is not NULL anymore.
@serhiy-storchaka: I updated the PR. The flag now explicitly sets tp_new to NULL. So code checking if tp_new is NULL (like pickle) is not affected.
Advantages of using the new flag compared to setting tp_new to NULL:
- PyType_FromSpec() doesn't support tp_new=NULL and it would be tricky to support that.
- It ensures that
__new__
is not created in the type dictionary. In Python,my_type.__new__
attribute doesn't exist. - The same flag can be used on static types, heap types and with _PyStructSequence_InitType().
- The flag is known when the type is created and so PyType_Ready() can handle corner cases. For example, it doesn't create
__new__
key in the type dictionary.
I mark this PR as a draft. I'm trying to modify PyType_FromSpec() to support {Py_tp_new, NULL},
slot, so we can compare the three options (Erlend's PR, this PR, and the future PyType_FromSpec PR).
Ok, there are now 3 API options to solve https://bugs.python.org/issue43916:
- A (PR bpo-43916: Add _PyType_DisabledNew to prevent new heap types from being created uninitialised #25653) sets tp_new to a new _PyType_DisabledNew() function
- B (this PR bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag #25721) adds a public Py_TPFLAGS_DISALLOW_INSTANTIATION flag
- C (PR [WIP] bpo-43916: PyType_FromSpec() supports Py_tp_new=NULL slot #25733) modifies PyType_FromSpec() to support
Py_tp_new=NULL
slot
There are 3 main ways to create types
- Static types
- A uses
tp_new = _PyType_DisabledNew
- B sets Py_TPFLAGS_DISALLOW_INSTANTIATION flag
- C uses regular
tp_new = NULL
- A uses
- Heap types: PyType_Spec struct + PyFrom_FromSpec()
- A uses
{Py_tp_new, _PyType_DisabledNew}
slot - B sets Py_TPFLAGS_DISALLOW_INSTANTIATION flag
- C uses a new
{Py_tp_new, NULL}
slot
- A uses
- structseq type: PyStructSequence_InitType2() or a new _PyStructSequence_InitType() function
- A doesn't address this case (keep the current code which sets tp_new and remove the
__new__
key afterwards) - B sets Py_TPFLAGS_DISALLOW_INSTANTIATION flag
- C uses a new internal function to pass disallow_instantiation=1
- A doesn't address this case (keep the current code which sets tp_new and remove the
A drawbacks:
- _PyType_DisabledNew() is only exposed in the internal C API and so needs to build more C extensions with the internal C API.
- Don't slove the structseq types case.
B drawbacks:
- Add a new public type flag.
C drawbacks:
{Py_tp_new, NULL}
is less explicit than_PyType_DisabledNew()
andPy_TPFLAGS_DISALLOW_INSTANTIATION
.- New internal API are more specific to this problem than B internal functions.
vstinner marked this pull request as ready for review
My preference is B, even if we can manually set tp_new
to NULL
it has a bit of a "hacky" feeling. I find the flag the most simple and elegant of all, but is true that still feels just a bit weird that a flag will disallow instantiation.
My preference order is B-> C -> A
My preference order is B-> C -> A
Mine too, FWIW.
My preference is B, even if we can manually set tp_new to NULL it has a bit of a "hacky" feeling. I find the flag the most simple and elegant of all, but is true that still feels just a bit weird that a flag will disallow instantiation.
In Python, there is a similar "hack" to mark a type as non hashable:
$ python3
Python 3.9.4 (default, Apr 6 2021, 00:00:00)
>>> class A:
... pass
>>> hash(A())
8731944244974
>>> class B:
... __hash__ = None
>>> hash(B())
TypeError: unhashable type: 'B'
In Python, setting __new__
to None
prevents instantiation, but with a surprising error message ;-)
$ python3
Python 3.9.4 (default, Apr 6 2021, 00:00:00)
>>> class A:
... __new__ = None
>>> A()
TypeError: 'NoneType' object is not callable
In Python, setting
__new__
toNone
prevents instantiation, but with a surprising error message ;-)
I wouldn't call that surprising TBH :)
Another slice of type inheritance (tp_new) complexity. Extract of update_one_slot() which is called by type_new() after PyType_Ready():
else if (Py_IS_TYPE(descr, &PyCFunction_Type) &&
PyCFunction_GET_FUNCTION(descr) ==
(PyCFunction)(void(*)(void))tp_new_wrapper &&
ptr == (void**)&type->tp_new)
{
/* The __new__ wrapper is not a wrapper descriptor,
so must be special-cased differently.
If we don't do this, creating an instance will
always use slot_tp_new which will look up
__new__ in the MRO which will call tp_new_wrapper
which will look through the base classes looking
for a static base and call its tp_new (usually
PyType_GenericNew), after performing various
sanity checks and constructing a new argument
list. Cut all that nonsense short -- this speeds
up instance creation tremendously. */
specific = (void *)type->tp_new;
/* XXX I'm not 100% sure that there isn't a hole
in this reasoning that requires additional
sanity checks. I'll buy the first person to
point out a bug in this reasoning a beer. */
}
Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow creating type instances: set tp_new to NULL and don't create the "new" key in the type dictionary.
The flag is set automatically on static types if tp_base is NULL or &PyBaseObject_Type and tp_new is NULL.
Use the flag on the following types:
- _curses.ncurses_version type
- _curses_panel.panel
- _tkinter.Tcl_Obj
- _tkinter.tkapp
- _tkinter.tktimertoken
- _xxsubinterpretersmodule.ChannelID
- sys.flags type
- sys.getwindowsversion() type
- sys.version_info type
Update MyStr example in the C API documentation to use Py_TPFLAGS_DISALLOW_INSTANTIATION.
Add _PyStructSequence_InitType() function to create a structseq type with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.
type_new() calls _PyType_CheckConsistency() at exit.
PR rebased to fix a conflict.
I took @pablogsal and @serhiy-storchaka concerns in account. Pablo would prefer to avoid adding a public type, Serhiy would prefer to modify PyType_FromSpec() to accept tp_new=NULL. I implemented Serhiy's idea to compare the API with this Py_TPFLAGS_DISALLOW_INSTANTIATION PR. Most people prefer the explicit Py_TPFLAGS_DISALLOW_INSTANTIATION flag which is less surprising than tp_new=NULL. I also listed other advantages of the Py_TPFLAGS_DISALLOW_INSTANTIATION flag in my previous comments.
Usually, I would prefer to get a clear consensus, but we are out of time: Python 3.10 beta1 (feature freeze) is next Wednesday, and I will be in holiday for 1 week starting tonight. If there is still a disagreement, we can still revisit the API or the implementation before Python 3.10 final release ("3.10.0 final: Monday, 2021-10-04" says PEP 619).
I merged my PR to unblock https://bugs.python.org/issue43916 release blocker issue (and there are other open release blocker issues to be solved before Wednesday).
Anyway, thanks everyone for this constructive and interesting discussion ;-)
The final merged change also takes in account your remarks. For example, the flag now sets tp_new to NULL, to take Serhiy's comment in account.
This was referenced
Apr 30, 2021