bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag by vstinner · Pull Request #25721 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation31 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

vstinner

Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow
creating type instances: set tp_new to NULL and don't create the
"new" key in the type dictionary.

The flag is set automatically on static types if tp_base is NULL or
&PyBaseObject_Type and tp_new is NULL.

Use the flag on the following types:

Update MyStr example in the C API documentation to use
Py_TPFLAGS_DISALLOW_INSTANTIATION.

Add _PyStructSequence_InitType() function to create a structseq type
with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.

https://bugs.python.org/issue43916

@vstinner

This flag can be used outside CPython code base. For example, the pyrsistent project sets tp_new to NULL:

static PyObject* pyrsistent_pvectorc_moduleinit(void)
{
    ...
  // Only allow creation/initialization through factory method pvec
  PVectorType.tp_init = NULL;
  PVectorType.tp_new = NULL;

  if (PyType_Ready(&PVectorType) < 0) {
    return NULL;
  }
  ...
  Py_INCREF(&PVectorType);
  PyModule_AddObject(m, "PVector", (PyObject *)&PVectorType);
  return m;
}

@vstinner

@tiran

Nice! Do you want to update _hashopenssl.c or should I update it later?

@vstinner

Nice! Do you want to update _hashopenssl.c or should I update it later?

I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.

@erlend-aasland

Nice! Do you want to update _hashopenssl.c or should I update it later?

I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.

I can adapt #25653 to use this flag.

erlend-aasland

@tiran

@tiran

Nice! Do you want to update _hashopenssl.c or should I update it later?

I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.

Sounds good to me! I have created PR draft #25722.

tiran

shreyanavigyan

@vstinner

About the flag name, another name could be Py_TPFLAGS_DISALLOW_NEW. Or a different name.

@vstinner

About the complexity of tp_new inheritance, here is an extract of tp_new_wrapper():

    /* Check that the use doesn't do something silly and unsafe like
       object.__new__(dict).  To do this, we check that the
       most derived base that's not a heap type is this type. */
    staticbase = subtype;
    while (staticbase && (staticbase->tp_new == slot_tp_new))
        staticbase = staticbase->tp_base;
    /* If staticbase is NULL now, it is a really weird type.
       In the spirit of backwards compatibility (?), just shut up. */
    if (staticbase && staticbase->tp_new != type->tp_new) {
        PyErr_Format(PyExc_TypeError,
                     "%s.__new__(%s) is not safe, use %s.__new__()",
                     type->tp_name,
                     subtype->tp_name,
                     staticbase->tp_name);
        return NULL;
    }

@tiran

Naming is hard. Both Py_TPFLAGS_DISABLE_NEW and Py_TPFLAGS_DISALLOW_NEW reflect how Python prevents creation instances, not what the flag actually accomplishes. How about something like Py_TPFLAGS_DISALLOW_INSTANTIATION?

@erlend-aasland

I've adapted #25653 to use this flag. Regarding naming, I like @tiran's suggestion. I'll do a force push as soon as the name has landed :)

@vstinner vstinner changed the titlebpo-43916: Add Py_TPFLAGS_DISABLE_NEW type flag bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag

Apr 29, 2021

@vstinner

I renamed the flag to Py_TPFLAGS_DISALLOW_INSTANTIATION. I also updated _xxsubinterpretersmodule.ChannelID to use the new flag.

"The creation of an instance is called instantiation." says Wikipedia: https://en.wikipedia.org/wiki/Instance_(computer_science)

I compared the usage of "disallow" versus "deny" in Include/ header files and in Doc/ documentation: "disallow" is more popular.

@vstinner

@vstinner

@tiran @erlend-aasland: my PR uses the flag in 9 types (and 1 documentation example). @erlend-aasland's PR disallow instanciation in many types. Do you think that these usages are common enough to justify to add a type flag?

@vstinner

This design doesn't prevent calling directly tp_new in C. I clarified the flag documentation:

/* Disallow creating instances of the type in Python. It remains possible to
 * call directly the tp_new function in C. */

pablogsal

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a much more elegant solution, the only thing I am slightly wary of is to expose this publicly.

@serhiy-storchaka

There is already a standard way to prevent creating a new instance: setting tp_new to NULL. No new type flag is needed. BTW, tp_new == NULL is one of checks for pickleability of the type.

@vstinner

@serhiy-storchaka:

There is already a standard way to prevent creating a new instance: setting tp_new to NULL. No new type flag is needed. BTW, tp_new == NULL is one of checks for pickleability of the type.

If tp_new is set NULL after the type is created, creating an instance in Python fails with a weird error message. Example in Python 3.9:

$ python3.9
Python 3.9.4 (default, Apr  6 2021, 00:00:00) 
>>> import _curses_panel

# Good
>>> _curses_panel.panel()
TypeError: cannot create '_curses_panel.panel' instances

# Weird error
>>> _curses_panel.panel.__new__(_curses_panel.panel)
TypeError: object.__new__(_curses_panel.panel) is not safe, use _curses_panel.panel.__new__()

>>> _curses_panel.panel.__new__ is object.__new__
True

The problem is that the the type __dict__['__new__'] exists (wrapper to object.new). I would prefer to request in the type spec that instantiation is disallowed. Copy of my previous comment, extract of tp_new_wrapper():

    /* Check that the use doesn't do something silly and unsafe like
       object.__new__(dict).  To do this, we check that the
       most derived base that's not a heap type is this type. */
    staticbase = subtype;
    while (staticbase && (staticbase->tp_new == slot_tp_new))
        staticbase = staticbase->tp_base;
    /* If staticbase is NULL now, it is a really weird type.
       In the spirit of backwards compatibility (?), just shut up. */
    if (staticbase && staticbase->tp_new != type->tp_new) {
        PyErr_Format(PyExc_TypeError,
                     "%s.__new__(%s) is not safe, use %s.__new__()",
                     type->tp_name,
                     subtype->tp_name,
                     staticbase->tp_name);
        return NULL;
    }

@vstinner

Extract of inherit_special():

        /* The condition below could use some explanation.
           It appears that tp_new is not inherited for static types
           whose base class is 'object'; this seems to be a precaution
           so that old extension types don't suddenly become
           callable (object.__new__ wouldn't insure the invariants
           that the extension type's own factory function ensures).
           Heap types, of course, are under our control, so they do
           inherit tp_new; static extension types that specify some
           other built-in type as the default also
           inherit object.__new__. */
        if (base != &PyBaseObject_Type ||
            (type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
            if (type->tp_new == NULL)
                type->tp_new = base->tp_new;
        }

PyType_FromSpec() creates a type and then calls PyType_Ready() on it. PyType_Ready() calls inherit_special() which inherits tp_new (since it's a heap type), and then creates tp_dict["__new__"] since tp_new is not NULL anymore.

@vstinner

@serhiy-storchaka: I updated the PR. The flag now explicitly sets tp_new to NULL. So code checking if tp_new is NULL (like pickle) is not affected.

Advantages of using the new flag compared to setting tp_new to NULL:

@vstinner

I mark this PR as a draft. I'm trying to modify PyType_FromSpec() to support {Py_tp_new, NULL}, slot, so we can compare the three options (Erlend's PR, this PR, and the future PyType_FromSpec PR).

@vstinner

Ok, there are now 3 API options to solve https://bugs.python.org/issue43916:

There are 3 main ways to create types

A drawbacks:

B drawbacks:

C drawbacks:

@vstinner vstinner marked this pull request as ready for review

April 29, 2021 22:06

@pablogsal

My preference is B, even if we can manually set tp_new to NULL it has a bit of a "hacky" feeling. I find the flag the most simple and elegant of all, but is true that still feels just a bit weird that a flag will disallow instantiation.

My preference order is B-> C -> A

@erlend-aasland

My preference order is B-> C -> A

Mine too, FWIW.

@vstinner

@pablogsal:

My preference is B, even if we can manually set tp_new to NULL it has a bit of a "hacky" feeling. I find the flag the most simple and elegant of all, but is true that still feels just a bit weird that a flag will disallow instantiation.

In Python, there is a similar "hack" to mark a type as non hashable:

$ python3
Python 3.9.4 (default, Apr  6 2021, 00:00:00) 
>>> class A:
...  pass

>>> hash(A())
8731944244974

>>> class B:
...  __hash__ = None

>>> hash(B())
TypeError: unhashable type: 'B'

@vstinner

In Python, setting __new__ to None prevents instantiation, but with a surprising error message ;-)

$ python3 
Python 3.9.4 (default, Apr  6 2021, 00:00:00) 
>>> class A:
...  __new__ = None

>>> A()
TypeError: 'NoneType' object is not callable

@pablogsal

In Python, setting __new__ to None prevents instantiation, but with a surprising error message ;-)

I wouldn't call that surprising TBH :)

@vstinner

Another slice of type inheritance (tp_new) complexity. Extract of update_one_slot() which is called by type_new() after PyType_Ready():

        else if (Py_IS_TYPE(descr, &PyCFunction_Type) &&
                 PyCFunction_GET_FUNCTION(descr) ==
                 (PyCFunction)(void(*)(void))tp_new_wrapper &&
                 ptr == (void**)&type->tp_new)
        {
            /* The __new__ wrapper is not a wrapper descriptor,
               so must be special-cased differently.
               If we don't do this, creating an instance will
               always use slot_tp_new which will look up
               __new__ in the MRO which will call tp_new_wrapper
               which will look through the base classes looking
               for a static base and call its tp_new (usually
               PyType_GenericNew), after performing various
               sanity checks and constructing a new argument
               list.  Cut all that nonsense short -- this speeds
               up instance creation tremendously. */
            specific = (void *)type->tp_new;
            /* XXX I'm not 100% sure that there isn't a hole
               in this reasoning that requires additional
               sanity checks.  I'll buy the first person to
               point out a bug in this reasoning a beer. */
        }

@erlend-aasland

@vstinner

Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow creating type instances: set tp_new to NULL and don't create the "new" key in the type dictionary.

The flag is set automatically on static types if tp_base is NULL or &PyBaseObject_Type and tp_new is NULL.

Use the flag on the following types:

Update MyStr example in the C API documentation to use Py_TPFLAGS_DISALLOW_INSTANTIATION.

Add _PyStructSequence_InitType() function to create a structseq type with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.

type_new() calls _PyType_CheckConsistency() at exit.

@vstinner

PR rebased to fix a conflict.

@vstinner

I took @pablogsal and @serhiy-storchaka concerns in account. Pablo would prefer to avoid adding a public type, Serhiy would prefer to modify PyType_FromSpec() to accept tp_new=NULL. I implemented Serhiy's idea to compare the API with this Py_TPFLAGS_DISALLOW_INSTANTIATION PR. Most people prefer the explicit Py_TPFLAGS_DISALLOW_INSTANTIATION flag which is less surprising than tp_new=NULL. I also listed other advantages of the Py_TPFLAGS_DISALLOW_INSTANTIATION flag in my previous comments.

Usually, I would prefer to get a clear consensus, but we are out of time: Python 3.10 beta1 (feature freeze) is next Wednesday, and I will be in holiday for 1 week starting tonight. If there is still a disagreement, we can still revisit the API or the implementation before Python 3.10 final release ("3.10.0 final: Monday, 2021-10-04" says PEP 619).

I merged my PR to unblock https://bugs.python.org/issue43916 release blocker issue (and there are other open release blocker issues to be solved before Wednesday).

Anyway, thanks everyone for this constructive and interesting discussion ;-)

The final merged change also takes in account your remarks. For example, the flag now sets tp_new to NULL, to take Serhiy's comment in account.

This was referenced

Apr 30, 2021