Issue 1261: PEP 3137: make bytesobject.c methods (original) (raw)

Created on 2007-10-11 08:45 by gregory.p.smith, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
bytes-pep3137-methods-04b.diff.txt gregory.p.smith,2007-10-16 01:05
Messages (12)
msg56341 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-11 08:45
This makes all existing bytesobject.c methods use the buffer API rather than explicitly requiring bytes objects as input. It also fixes input to append() and remove() that was not strict enough and improves a few unit tests in that area. NOTE: this patch likely depends on http://bugs.python.org/issue1260 removing the buffer API from the unicode type in order for all unit tests to pass (i only tested it with that applied since thats where we're headed).
msg56351 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-12 00:56
Patch updated. It now implements the is*() methods for PyBytes. It moves common code into a shared bytes_ctype.c and .h file so that stringobject.c and bytesobject.c can share as much as possible. Unit tests are updated as needed for new method coverage and any behavior changes. still TODO: adding the missing methods (listed in a comment in the patch). I'm working on it.
msg56352 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-10-12 02:42
> Patch updated. It now implements the is*() methods for PyBytes. It > moves common code into a shared bytes_ctype.c and .h file so that > stringobject.c and bytesobject.c can share as much as possible. Did you move this into the stringlib subdirectory? That's more for sharing between PyString and PyUnicode, but I think there are more opportunities for sharing still, and PyString/PyBytes sharing makes sense separately.
msg56354 - (view) Author: Gregory P. Smith (gps) Date: 2007-10-12 03:20
> > Patch updated. It now implements the is*() methods for PyBytes. It > > moves common code into a shared bytes_ctype.c and .h file so that > > stringobject.c and bytesobject.c can share as much as possible. > > Did you move this into the stringlib subdirectory? That's more for > sharing between PyString and PyUnicode, but I think there are more > opportunities for sharing still, and PyString/PyBytes sharing makes > sense separately. Good idea, I haven't done that yet. At the moment it lives in Include/bytes_ctype.h and Object/bytes_ctype.c directly. stringlib is a good place for it and is something I pondered but hadn't gotten to. I'll do that as I implement the remaining missing PyBytes_ methods to be in the next update to this patch. -gps
msg56466 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-10-15 22:02
Is it worth my time to review this yet?
msg56468 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-15 22:36
here's the latest update. this takes care of all methods listed in the PEP i believe. please review. i'm currently working porting some test case code for more methods from string_tests.py to buffer_tests.py to be shared between PyString and PyBytes similar to the ones already in the patch that way.
msg56474 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-15 23:48
Same diff 03 here but it also adds unit test coverage for the new PyBytes buffer methods.
msg56477 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-10-16 00:19
Very impressive! (Apologies if these lines are occasionally out of order.) +extern PyObject* _bytes_isspace(const char *cptr, const Py_ssize_t len); IMO all these functions should have names starting with _Py or _Py_, since they are visible to the linker. Also, why 'const Py_ssize_t'? It's a scalar! +/* these store their len sized answer in the given preallocated *result arg */ Mind using a Capital first letter? +extern const char isspace__doc__[]; This needs a _Py or _Py_ prefix too. +extern const unsigned int ctype_table[256]; Now this is no longer static, it also needs a _Py or _Py_ prefix. +extern const unsigned char ctype_tolower[256]; +extern const unsigned char ctype_toupper[256]; Ditto (the macros are fine though, since they are only visible to code that #includes this file, which is only a few files). +Return True if all characters in S are whitespace\n\ Shouldn't that be bytes instead of characters (and consistently throughout)? And probably use B for the variable name instead of S. +/* ------- GPS --------- */ What's this? Your initials? :-) I don't think it's needed. I'm guessing you /* The nullbytes are used by the stringlib during partition. * If partition is removed from bytes, nullbytes and its helper This comment block refers to something that ain't gonna happen (partition being removed). IMO the whole comment block can go, it's a red herring. + /* XXX: this depends on a signed integer overflow to < 0 */ + /* C compilers, including gcc, do -NOT- guarantee this. */ (And repeated twice more.) Wouldn't it be better to write this in a way that doesn't require this XXX comment? ISTR that we already had one area where we had to fight with gcc because it had proved to itself that something could never be negative, even though it could be due to overflow. The GCC authors won. :-( +/* TODO(gps): Don't you mean XXX(gps)? :-) + * These methods still need implementing (porting over from stringobject.c): + * + * IN PROGRESS: + * .capitalize(), .title(), .upper(), .lower(), .center(), .zfill(), + * .expandtabs(), .ljust(), .rjust(), .swapcase(), .splitlines(), + */ + Hmmm... Aren't these done now? + /* XXX(gps): the docstring above says any iterable int will do but the + * bytes_setslice code only accepts something supporting PEP 3118. + * Is a list or tuple of 0 <= ints <= 255 also supposed to work? */ Yes, it is, and it's a bug that it currently doesn't.
msg56479 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-16 00:53
Ok, I believe I've fixed everything you commented on, accidentally included in initials used as a search placeholder included. :) > + /* XXX: this depends on a signed integer overflow to < >0 */ > + /* C compilers, including gcc, do -NOT- guarantee >this. */ > (And repeated twice more.) Wouldn't it be better to write this in a way > that doesn't require this XXX comment? ISTR that we already had one > area where we had to fight with gcc because it had proved to itself that > something could never be negative, even though it could be due to > overflow. The GCC authors won. :-( This code was copied as is from stringobject.c, I just added the XXX notes upon reading it. There is an existing unittest to try and make sure the code has compiled properly with these tests (its not comprehensive, it'll only test one of these three test cases). Regardless, yes, it needs fixing. I didn't want to take that on that as part of this patch.
msg56480 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-16 01:05
04b is the same as 04, i just fixed the docstrings that i had missed in stringlib/transmogrify.h to use 'B' instead of 'S' and say they return a "modified copy of B" instead of a "string" wording could probably be improved further if anyone has any ideas.
msg56487 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-10-16 03:27
Good! Check it in before I change my mind. :-) The words can be tweaked later. > 04b is the same as 04, i just fixed the docstrings that i had missed in > stringlib/transmogrify.h to use 'B' instead of 'S' and say they return a > "modified copy of B" instead of a "string" > > wording could probably be improved further if anyone has any ideas.
msg56495 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-10-16 06:33
Committed revision 58493
History
Date User Action Args
2022-04-11 14:56:27 admin set github: 45602
2008-01-06 22:29:45 admin set keywords: - py3kversions: Python 3.0
2007-10-16 06:33:39 gregory.p.smith set files: - bytes-pep3137-methods-04.diff.txt
2007-10-16 06:33:22 gregory.p.smith set status: open -> closedresolution: acceptedmessages: +
2007-10-16 03:27:03 gvanrossum set messages: +
2007-10-16 01:05:07 gregory.p.smith set files: + bytes-pep3137-methods-04b.diff.txtmessages: +
2007-10-16 00:53:53 gregory.p.smith set files: - bytes-pep3137-methods-03-with-tests.diff.txt
2007-10-16 00:53:50 gregory.p.smith set files: - bytes-pep3137-methods-03.diff.txt
2007-10-16 00:53:39 gregory.p.smith set files: + bytes-pep3137-methods-04.diff.txtmessages: +
2007-10-16 00:19:34 gvanrossum set messages: +
2007-10-15 23:49:00 gregory.p.smith set files: + bytes-pep3137-methods-03-with-tests.diff.txtmessages: +
2007-10-15 22:36:36 gregory.p.smith set files: - bytes-pep3137-methods-02.diff.txt
2007-10-15 22:36:28 gregory.p.smith set files: + bytes-pep3137-methods-03.diff.txtmessages: +
2007-10-15 22:02:16 gvanrossum set messages: +
2007-10-12 03:35:50 gregory.p.smith set nosy: - gps
2007-10-12 03:21:41 gregory.p.smith set files: - unnamed
2007-10-12 03:20:59 gps set files: + unnamednosy: + gpsmessages: +
2007-10-12 02:42:33 gvanrossum set messages: +
2007-10-12 00:57:05 gregory.p.smith set files: - bytes-methods-use-buffer-api-01.diff.txt
2007-10-12 00:56:55 gregory.p.smith set files: + bytes-pep3137-methods-02.diff.txtmessages: + title: PEP 3137: make bytesobject.c methods use PEP 3118 buffer API -> PEP 3137: make bytesobject.c methods
2007-10-11 16:52:28 gvanrossum set assignee: gvanrossumnosy: + gvanrossum
2007-10-11 08:45:43 gregory.p.smith create