[Python-Dev] Mini-Pep: An Empty String ABC

Guido van Rossum guido at python.org
Mon Jun 2 01:06:10 CEST 2008


This PEP is incomplete without specifying exactly which built-in and stdlib types should be registered as String instances.

I'm also confused -- the motivation seems mostly "so that you can skip iterating over it when flattening a nested sequence" but earlier you rejected my "Atomic" proposal, saying "Earlier in the thread it was made clear that that atomicity is not an intrinsic property of a type; instead it varies across applications [...]". Isn't this String proposal just that by another name?

Finally, I fully expect lots of code writing isinstance(x, String) and then making many more assumptions than the String ABC promises. For example, that s[0] has the same type as s (not true for bytes). Or that it is hashable (the Sequence class doesn't define __hash__). Or that s1+s2 will work (not in the Sequence class either). And many more.
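
As a rough illustration of those gaps (a sketch assuming Python 3 semantics and today's collections.abc module, not something taken from the thread), the checks below use only built-in types that already count as Sequence instances:

```python
from collections.abc import Sequence

# Indexing: a str yields a str, but bytes yields an int in Python 3.
assert type("abc"[0]) is str
assert type(b"abc"[0]) is int

# Hashability: list is a Sequence, yet it cannot be hashed.
assert isinstance([], Sequence)
try:
    hash([])
    list_hashable = True
except TypeError:
    list_hashable = False

# Concatenation: range is a Sequence, yet it does not support +.
assert isinstance(range(3), Sequence)
try:
    range(3) + range(3)
    range_concatenates = True
except TypeError:
    range_concatenates = False
```

Code that only checks isinstance(x, Sequence), or a String ABC that adds nothing to it, gets none of these three guarantees.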

All this makes me lean towards a rejection of this proposal -- it seems worse than no proposal at all. It could perhaps be rescued by adding some small set of defined operations.

--Guido

On Sat, May 31, 2008 at 11:59 PM, Raymond Hettinger <python at rcn.com> wrote:

Mini-Pep: An Empty String ABC
Target: Py2.6 and Py3.0
Author: Raymond Hettinger

Proposal
--------
Add a new collections ABC specified as:

    class String(Sequence):
        pass

Motivation
----------
Having an ABC for strings allows string look-alike classes to declare themselves as sequences that contain text. Client code (such as a flatten operation or tree searching tool) may use that ABC to usefully differentiate strings from other sequences (i.e. containers vs containees). And in code that only relies on sequence behavior, isinstance(x,str) may be usefully replaced by isinstance(x,String) so that look-alikes can be substituted in calling code.

A natural temptation is to add other methods to the String ABC, but strings are a tough case. Beyond simple sequence manipulation, the string methods get very complex. An ABC that included those methods would make it tough to write a compliant class that could be registered as a String. The split(), rsplit(), partition(), and rpartition() methods are examples of methods that would be difficult to emulate correctly. Also, starting with Py3.0, strings are essentially abstract sequences of code points, meaning that an encode() method is essential to being able to usefully transform them back into concrete data. Unfortunately, the encode method is so complex that it cannot be readily emulated by an aspiring string look-alike.

Besides complexity, another problem with the concrete str API is the extensive number of methods. If string look-alikes were required to emulate the likes of zfill(), ljust(), title(), translate(), join(), etc., it would significantly add to the burden of writing a class complying with the String ABC.

The fundamental problem is that of balancing a client function's desire to rely on a broad number of behaviors against the difficulty of writing a compliant look-alike class. For other ABCs, the balance is more easily struck because the behaviors are fewer in number, because they are easier to implement correctly, and because some methods can be provided as mixins. For a String ABC, the balance should lean toward minimalism due to the large number of methods and how difficult it is to implement some of them correctly.

A last reason to avoid expanding the String API is that almost none of the candidate methods characterize the notion of "stringiness". With something calling itself an integer, an __add__() method would be expected as it is fundamental to the notion of "integeriness". In contrast, methods like startswith() and title() are non-essential extras -- we would not discount something as being not stringlike if those methods were not present.
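
The flatten use case from the motivation can be sketched as follows. This is only an illustration under stated assumptions: the String ABC is defined locally because it is not part of the collections module, the proposal does not say which types would be registered (str is registered here purely for the example), and the Name class and flatten function are hypothetical names invented for this sketch.

```python
from collections.abc import Sequence


class String(Sequence):
    """Empty marker ABC, as specified in the proposal."""


# Assumption: the proposal does not list the registered types;
# registering str here is for illustration only.
String.register(str)


class Name(String):
    """Hypothetical string look-alike that wraps a str."""

    def __init__(self, text):
        self._text = text

    def __getitem__(self, index):
        return self._text[index]

    def __len__(self):
        return len(self._text)


def flatten(items):
    """Yield the leaves of a nested sequence, treating Strings as atoms."""
    for item in items:
        if isinstance(item, Sequence) and not isinstance(item, String):
            # A container: descend into it.
            yield from flatten(item)
        else:
            # A containee (String instance or non-sequence): emit as is.
            yield item


name = Name("cd")
result = list(flatten([1, ["ab", [name, 2]], (3,)]))
```

Here both the built-in str and the look-alike Name pass through flatten whole, while lists and tuples are expanded. The isinstance test is the only thing the empty ABC provides; the sketch relies on no string method at all.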



--
--Guido van Rossum (home page: http://www.python.org/~guido/)


