[Python-Dev] Documenting the ssize_t Python C API changes
M.-A. Lemburg mal at egenix.com
Tue Mar 21 12:26:41 CET 2006
Martin v. Löwis wrote:
M.-A. Lemburg wrote:
It's not a waste of time at all: you'd be helping lots and lots of developers out there who want to fix their extensions. This is free software, anybody is free to decide what they do.
With due respect for other developers, yes.
I don't believe that developers would be helped a lot - they can easily search for Py_ssize_t in the header files, and find all the APIs that have changed.
Of course they can. We could also stop writing documentation and tell users to read the code instead - after all, it's all there, ready to be consumed by interested parties. Oh, and for changes: we'll just point them to Subversion and tell them to run a 'svn diff'.
However, they should not have to do that. Instead, they should look at it from a conceptual point of view: Does that variable "count" something (memory, number of elements, size of some structure)? If it does, and it currently counts that using an int, it should be changed to use a Py_ssize_t instead.
So just review all occurrences of int in your code, and you are done. No need to look at API lists.
Just did a grep on the mx Extensions: 17000 cases of 'int' being used. Sounds like a nice weekend activity...
Seriously, your suggestion on how to port the extensions to Py_ssize_t is certainly true, but this may not be what all extension authors would want to do (or at least not right away). Instead, they'll want to know what changed and then check their code for uses of the changed APIs, in particular those APIs where output parameters are used.
I think that documenting these changes is part of doing responsible development. You seem to disagree.
The ssize_t patch is the single most disruptive patch in Python 2.5, so it deserves special attention. I can believe that you would have preferred not to see that patch at all, not at this time, and preferably never. I have a different view. I don't see it as a problem, but as a solution.
You are right in that I would have rather seen this change go into Py3k than into the 2.x series. You're wrong in saying that I would have preferred not to get such a change into Python at all.
I've given up believing that there would be a possibility of having code that works in both Py3k and Py2.x. I've also given up believing that code written for Py2.x will continue to work in Py3k.
I'm still holding on to the belief that the 2.x series will not introduce major breakage between versions and that there'll always be some way to write software that works in 2.n and 2.n+1 for any n.
However, I feel that at least some Python developers seem to be OK with breaking this possibility, ignoring all the existing working code that's out there.
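For reference, one way to keep a single source tree building on both 2.4 and 2.5 is the compatibility typedef given in PEP 353. The sketch below shows that guard together with a small counting helper. The helper `count_byte` is a hypothetical example, not part of any API. `Python.h` is deliberately left out so the fragment compiles standalone; in that case `PY_VERSION_HEX` is undefined, the preprocessor treats it as 0, and the pre-2.5 fallback branch is taken.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* In a real extension, #include <Python.h> would come first and define
   PY_VERSION_HEX. It is omitted here (an assumption of this sketch) so
   the fragment compiles on its own and takes the fallback branch. */

/* Compatibility guard as given in PEP 353: on Python < 2.5 there is no
   Py_ssize_t, so fall back to int, which is what the old APIs used. */
#if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
typedef int Py_ssize_t;
#define PY_SSIZE_T_MAX INT_MAX
#define PY_SSIZE_T_MIN INT_MIN
#endif

/* Hypothetical helper: counts how often `needle` occurs in a buffer.
   Both the length and the running count "count something", so both
   use Py_ssize_t rather than int. */
static Py_ssize_t count_byte(const char *buf, Py_ssize_t len, char needle)
{
    Py_ssize_t i;
    Py_ssize_t hits = 0;

    assert(buf != NULL && len >= 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == needle)
            hits++;
    }
    return hits;
}
```

With the guard in place, code such as `count_byte(data, size, 'a')` is written once against `Py_ssize_t` and compiles unchanged against either header set.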
Again, if you think the documentation should be improved, go ahead and improve it.
Here's a grep of all the changed/new APIs, please include it in the PEP.
./dictobject.h:
-- PyAPI_FUNC(int) PyDict_Next(
--     PyObject *mp, Py_ssize_t *pos, PyObject **key, PyObject **value);
-- PyAPI_FUNC(Py_ssize_t) PyDict_Size(PyObject *mp);

./pyerrors.h:
-- PyAPI_FUNC(PyObject *) PyUnicodeDecodeError_Create(
--     const char *, const char *, Py_ssize_t, Py_ssize_t, Py_ssize_t, const char *);
-- PyAPI_FUNC(PyObject *) PyUnicodeEncodeError_Create(
--     const char *, const Py_UNICODE *, Py_ssize_t, Py_ssize_t, Py_ssize_t, const char *);
-- PyAPI_FUNC(PyObject *) PyUnicodeTranslateError_Create(
--     const Py_UNICODE *, Py_ssize_t, Py_ssize_t, Py_ssize_t, const char *);
-- PyAPI_FUNC(int) PyUnicodeEncodeError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeEncodeError_SetStart(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_SetStart(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_SetStart(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyUnicodeEncodeError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeEncodeError_SetEnd(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_SetEnd(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_SetEnd(PyObject *, Py_ssize_t);

./tupleobject.h:
-- PyAPI_FUNC(PyObject *) PyTuple_New(Py_ssize_t size);
-- PyAPI_FUNC(Py_ssize_t) PyTuple_Size(PyObject *);
-- PyAPI_FUNC(PyObject *) PyTuple_GetItem(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyTuple_SetItem(PyObject *, Py_ssize_t, PyObject *);
-- PyAPI_FUNC(PyObject *) PyTuple_GetSlice(PyObject *, Py_ssize_t, Py_ssize_t);
-- PyAPI_FUNC(int) _PyTuple_Resize(PyObject **, Py_ssize_t);
-- PyAPI_FUNC(PyObject *) PyTuple_Pack(Py_ssize_t, ...);

./sliceobject.h:
-- PyAPI_FUNC(int) PySlice_GetIndices(PySliceObject *r, Py_ssize_t length,
--     Py_ssize_t *start, Py_ssize_t *stop, Py_ssize_t *step);
-- PyAPI_FUNC(int) PySlice_GetIndicesEx(PySliceObject *r, Py_ssize_t length,
--     Py_ssize_t *start, Py_ssize_t *stop,
--     Py_ssize_t *step, Py_ssize_t *slicelength);

./bufferobject.h:
-- PyAPI_FUNC(PyObject *) PyBuffer_FromObject(PyObject *base,
--     Py_ssize_t offset, Py_ssize_t size);
-- PyAPI_FUNC(PyObject *) PyBuffer_FromReadWriteObject(PyObject *base,
--     Py_ssize_t offset,
--     Py_ssize_t size);
-- PyAPI_FUNC(PyObject *) PyBuffer_FromMemory(void *ptr, Py_ssize_t size);
-- PyAPI_FUNC(PyObject *) PyBuffer_FromReadWriteMemory(void *ptr, Py_ssize_t size);
-- PyAPI_FUNC(PyObject *) PyBuffer_New(Py_ssize_t size);

./marshal.h:
-- PyAPI_FUNC(PyObject *) PyMarshal_ReadObjectFromString(char *, Py_ssize_t);

./stringobject.h:
-- PyAPI_FUNC(PyObject *) PyString_FromStringAndSize(const char *, Py_ssize_t);
-- PyAPI_FUNC(Py_ssize_t) PyString_Size(PyObject *);
-- PyAPI_FUNC(int) _PyString_Resize(PyObject **, Py_ssize_t);
-- PyAPI_FUNC(PyObject *) PyString_DecodeEscape(const char *, Py_ssize_t,
--     const char *, Py_ssize_t,
--     const char *);
-- PyAPI_FUNC(PyObject*) PyString_Decode(
--     const char *s, /* encoded string */
--     Py_ssize_t size, /* size of buffer */
--     const char *encoding, /* encoding */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyString_Encode(
--     const char *s, /* string char buffer */
--     Py_ssize_t size, /* number of chars to encode */
--     const char *encoding, /* encoding */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(int) PyString_AsStringAndSize(
--     register PyObject *obj, /* string or Unicode object */
--     register char **s, /* pointer to buffer variable */
--     register Py_ssize_t *len /* pointer to length variable or NULL
--                                 (only possible for 0-terminated
--                                 strings) */
--     );

./longobject.h:
-- PyAPI_FUNC(Py_ssize_t) _PyLong_AsSsize_t(PyObject *);
-- PyAPI_FUNC(PyObject *) _PyLong_FromSsize_t(Py_ssize_t);
-- PyAPI_FUNC(PyObject *) PyLong_FromUnicode(Py_UNICODE*, Py_ssize_t, int);

./object.h:
-- PyAPI_FUNC(PyObject *) PyType_GenericAlloc(PyTypeObject *, Py_ssize_t);

./intobject.h:
-- PyAPI_FUNC(PyObject *) PyInt_FromUnicode(Py_UNICODE*, Py_ssize_t, int);
-- PyAPI_FUNC(PyObject *) PyInt_FromSsize_t(Py_ssize_t);
-- PyAPI_FUNC(Py_ssize_t) PyInt_AsSsize_t(PyObject *);

./objimpl.h:
-- PyAPI_FUNC(PyVarObject *) PyObject_InitVar(PyVarObject *,
--     PyTypeObject *, Py_ssize_t);
-- PyAPI_FUNC(PyVarObject *) _PyObject_NewVar(PyTypeObject *, Py_ssize_t);
-- PyAPI_FUNC(PyVarObject *) _PyObject_GC_Resize(PyVarObject *, Py_ssize_t);
-- PyAPI_FUNC(PyVarObject *) _PyObject_GC_NewVar(PyTypeObject *, Py_ssize_t);

./abstract.h:
-- PyAPI_FUNC(Py_ssize_t) PyObject_Size(PyObject *o);
-- PyAPI_FUNC(Py_ssize_t) PyObject_Length(PyObject *o);
-- PyAPI_FUNC(Py_ssize_t) _PyObject_LengthHint(PyObject *o);
-- PyAPI_FUNC(int) PyObject_AsCharBuffer(PyObject *obj,
--     const char **buffer,
--     Py_ssize_t *buffer_len);
-- PyAPI_FUNC(int) PyObject_AsReadBuffer(PyObject *obj,
--     const void **buffer,
--     Py_ssize_t *buffer_len);
-- PyAPI_FUNC(int) PyObject_AsWriteBuffer(PyObject *obj,
--     void **buffer,
--     Py_ssize_t *buffer_len);
-- PyAPI_FUNC(Py_ssize_t) PySequence_Size(PyObject *o);
-- PyAPI_FUNC(Py_ssize_t) PySequence_Length(PyObject *o);
-- PyAPI_FUNC(PyObject *) PySequence_Repeat(PyObject *o, Py_ssize_t count);
-- PyAPI_FUNC(PyObject *) PySequence_GetItem(PyObject *o, Py_ssize_t i);
-- PyAPI_FUNC(PyObject *) PySequence_GetSlice(PyObject *o, Py_ssize_t i1, Py_ssize_t i2);
-- PyAPI_FUNC(int) PySequence_SetItem(PyObject *o, Py_ssize_t i, PyObject *v);
-- PyAPI_FUNC(int) PySequence_DelItem(PyObject *o, Py_ssize_t i);
-- PyAPI_FUNC(int) PySequence_SetSlice(PyObject *o, Py_ssize_t i1, Py_ssize_t i2,
--     PyObject *v);
-- PyAPI_FUNC(int) PySequence_DelSlice(PyObject *o, Py_ssize_t i1, Py_ssize_t i2);
-- PyAPI_FUNC(PyObject *) PySequence_InPlaceRepeat(PyObject *o, Py_ssize_t count);
-- PyAPI_FUNC(Py_ssize_t) PyMapping_Size(PyObject *o);
-- PyAPI_FUNC(Py_ssize_t) PyMapping_Length(PyObject *o);

./unicodeobject.h:
-- PyAPI_FUNC(PyObject*) PyUnicode_FromUnicode(
--     const Py_UNICODE *u, /* Unicode buffer */
--     Py_ssize_t size /* size of buffer */
--     );
-- PyAPI_FUNC(Py_ssize_t) PyUnicode_GetSize(
--     PyObject *unicode /* Unicode object */
--     );
-- PyAPI_FUNC(int) PyUnicode_Resize(
--     PyObject **unicode, /* Pointer to the Unicode object */
--     Py_ssize_t length /* New length */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_FromWideChar(
--     register const wchar_t *w, /* wchar_t buffer */
--     Py_ssize_t size /* size of buffer */
--     );
-- PyAPI_FUNC(Py_ssize_t) PyUnicode_AsWideChar(
--     PyUnicodeObject *unicode, /* Unicode object */
--     register wchar_t *w, /* wchar_t buffer */
--     Py_ssize_t size /* size of buffer */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_Decode(
--     const char *s, /* encoded string */
--     Py_ssize_t size, /* size of buffer */
--     const char *encoding, /* encoding */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_Encode(
--     const Py_UNICODE *s, /* Unicode char buffer */
--     Py_ssize_t size, /* number of Py_UNICODE chars to encode */
--     const char *encoding, /* encoding */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF7(
--     const char *string, /* UTF-7 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF7(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* number of Py_UNICODE chars to encode */
--     int encodeSetO, /* force the encoder to encode characters in
--                        Set O, as described in RFC2152 */
--     int encodeWhiteSpace, /* force the encoder to encode space, tab,
--                              carriage return and linefeed characters */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF8(
--     const char *string, /* UTF-8 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF8Stateful(
--     const char *string, /* UTF-8 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors, /* error handling */
--     Py_ssize_t *consumed /* bytes consumed */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF8(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* number of Py_UNICODE chars to encode */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF16(
--     const char *string, /* UTF-16 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors, /* error handling */
--     int *byteorder /* pointer to byteorder to use
--                       0=native;-1=LE,1=BE; updated on
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF16Stateful(
--     const char *string, /* UTF-16 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors, /* error handling */
--     int *byteorder, /* pointer to byteorder to use
--                        0=native;-1=LE,1=BE; updated on
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF16(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* number of Py_UNICODE chars to encode */
--     const char *errors, /* error handling */
--     int byteorder /* byteorder to use 0=BOM+native;-1=LE,1=BE */
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUnicodeEscape(
--     const char *string, /* Unicode-Escape encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeUnicodeEscape(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length /* Number of Py_UNICODE chars to encode */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeRawUnicodeEscape(
--     const char *string, /* Raw-Unicode-Escape encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeRawUnicodeEscape(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length /* Number of Py_UNICODE chars to encode */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeLatin1(
--     const char *string, /* Latin-1 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeLatin1(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeASCII(
--     const char *string, /* ASCII encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeASCII(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeCharmap(
--     const char *string, /* Encoded string */
--     Py_ssize_t length, /* size of string */
--     PyObject *mapping, /* character mapping
--                           (char ordinal -> unicode ordinal) */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeCharmap(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
--     PyObject *mapping, /* character mapping
--                           (unicode ordinal -> char ordinal) */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject *) PyUnicode_TranslateCharmap(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
--     PyObject *table, /* Translate table */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeMBCS(
--     const char *string, /* MBCS encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_EncodeMBCS(
--     const Py_UNICODE *data, /* Unicode char buffer */
--     Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
--     const char *errors /* error handling */
--     );
-- PyAPI_FUNC(int) PyUnicode_EncodeDecimal(
--     Py_UNICODE *s, /* Unicode buffer */
--     Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
--     char *output, /* Output buffer; must have size >= length */
-- PyAPI_FUNC(PyObject*) PyUnicode_Split(
--     PyObject *s, /* String to split */
--     PyObject *sep, /* String separator */
--     Py_ssize_t maxsplit /* Maxsplit count */
--     );
-- PyAPI_FUNC(PyObject*) PyUnicode_RSplit(
--     PyObject *s, /* String to split */
--     PyObject *sep, /* String separator */
--     Py_ssize_t maxsplit /* Maxsplit count */
--     );
-- PyAPI_FUNC(Py_ssize_t) PyUnicode_Tailmatch(
--     PyObject *str, /* String */
--     PyObject *substr, /* Prefix or Suffix string */
--     Py_ssize_t start, /* Start index */
--     Py_ssize_t end, /* Stop index */
--     int direction /* Tail end: -1 prefix, +1 suffix */
--     );
-- PyAPI_FUNC(Py_ssize_t) PyUnicode_Find(
--     PyObject *str, /* String */
--     PyObject *substr, /* Substring to find */
--     Py_ssize_t start, /* Start index */
--     Py_ssize_t end, /* Stop index */
--     int direction /* Find direction: +1 forward, -1 backward */
--     );
-- PyAPI_FUNC(Py_ssize_t) PyUnicode_Count(
--     PyObject *str, /* String */
--     PyObject *substr, /* Substring to count */
--     Py_ssize_t start, /* Start index */
--     Py_ssize_t end /* Stop index */
--     );
-- PyAPI_FUNC(PyObject *) PyUnicode_Replace(
--     PyObject *str, /* String */
--     PyObject *substr, /* Substring to find */
--     PyObject *replstr, /* Substring to replace */
--     Py_ssize_t maxcount /* Max. number of replacements to apply;

./ceval.h:
-- PyAPI_FUNC(int) _PyEval_SliceIndex(PyObject *, Py_ssize_t *);

./listobject.h:
-- PyAPI_FUNC(PyObject *) PyList_New(Py_ssize_t size);
-- PyAPI_FUNC(Py_ssize_t) PyList_Size(PyObject *);
-- PyAPI_FUNC(PyObject *) PyList_GetItem(PyObject *, Py_ssize_t);
-- PyAPI_FUNC(int) PyList_SetItem(PyObject *, Py_ssize_t, PyObject *);
-- PyAPI_FUNC(int) PyList_Insert(PyObject *, Py_ssize_t, PyObject *);
-- PyAPI_FUNC(PyObject *) PyList_GetSlice(PyObject *, Py_ssize_t, Py_ssize_t);
-- PyAPI_FUNC(int) PyList_SetSlice(PyObject *, Py_ssize_t, Py_ssize_t, PyObject *);

./longintrepr.h:
-- PyAPI_FUNC(PyLongObject *) _PyLong_New(Py_ssize_t);
This is an excerpt listing just the APIs whose output parameters changed. These will cause buffer overflows unless every extension that uses them is corrected:
./dictobject.h:
-- PyAPI_FUNC(int) PyDict_Next(
--     PyObject *mp, Py_ssize_t *pos, PyObject **key, PyObject **value);

./pyerrors.h:
-- PyAPI_FUNC(int) PyUnicodeEncodeError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeEncodeError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_GetEnd(PyObject *, Py_ssize_t *);

./sliceobject.h:
-- PyAPI_FUNC(int) PySlice_GetIndices(PySliceObject *r, Py_ssize_t length,
--     Py_ssize_t *start, Py_ssize_t *stop, Py_ssize_t *step);
-- PyAPI_FUNC(int) PySlice_GetIndicesEx(PySliceObject *r, Py_ssize_t length,
--     Py_ssize_t *start, Py_ssize_t *stop,
--     Py_ssize_t *step, Py_ssize_t *slicelength);

./stringobject.h:
-- PyAPI_FUNC(int) PyString_AsStringAndSize(
--     register PyObject *obj, /* string or Unicode object */
--     register char **s, /* pointer to buffer variable */
--     register Py_ssize_t *len /* pointer to length variable or NULL
--                                 (only possible for 0-terminated
--                                 strings) */
--     );

./abstract.h:
-- PyAPI_FUNC(int) PyObject_AsCharBuffer(PyObject *obj,
--     const char **buffer,
--     Py_ssize_t *buffer_len);
-- PyAPI_FUNC(int) PyObject_AsReadBuffer(PyObject *obj,
--     const void **buffer,
--     Py_ssize_t *buffer_len);
-- PyAPI_FUNC(int) PyObject_AsWriteBuffer(PyObject *obj,
--     void **buffer,
--     Py_ssize_t *buffer_len);

./unicodeobject.h:
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF8Stateful(
--     const char *string, /* UTF-8 encoded string */
--     Py_ssize_t length, /* size of string */
--     const char *errors, /* error handling */
--     Py_ssize_t *consumed /* bytes consumed */
--     );

./ceval.h:
-- PyAPI_FUNC(int) _PyEval_SliceIndex(PyObject *, Py_ssize_t *);
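To illustrate why the output parameters are the dangerous case: on a platform where the new type is wider than int, a callee that stores through a `Py_ssize_t *` writes more bytes than an `int` variable holds. The following is a minimal sketch of the safe calling pattern, not code from any extension. It uses hypothetical stand-ins (`demo_ssize_t`, `demo_get_length`, `demo_length_as_int`) so that it compiles without `Python.h`; `demo_get_length` only imitates the shape of an API such as `PyObject_AsReadBuffer`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for Py_ssize_t (assumed here to be a signed,
   pointer-sized integer) so the sketch builds without Python.h. */
typedef ptrdiff_t demo_ssize_t;

/* Hypothetical stand-in for an API with a length output parameter.
   It stores a full-width value through *len, so the caller must pass
   the address of a demo_ssize_t, never of an int. */
static int demo_get_length(const char *data, demo_ssize_t *len)
{
    demo_ssize_t n = 0;

    if (data == NULL || len == NULL)
        return -1;
    while (data[n] != '\0')
        n++;
    *len = n;
    return 0;
}

/* Caller-side pattern for code that still needs an int internally:
   receive the value in a wide local first, range-check it, and only
   then narrow. Passing (int *) directly would let the callee write
   past the end of the int on platforms where the types differ. */
static int demo_length_as_int(const char *data, int *out)
{
    demo_ssize_t wide = 0;

    if (out == NULL || demo_get_length(data, &wide) < 0)
        return -1;
    assert(wide >= 0);
    if (wide > INT_MAX)
        return -1;      /* would not fit: report an error, do not truncate */
    *out = (int)wide;
    return 0;
}
```

The same shape applies to each function in the excerpt: declare the receiving variable with the wide type, and convert explicitly afterwards if the rest of the code cannot be changed yet.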
Thanks,
Marc-Andre Lemburg eGenix.com