[Python-Dev] Mini-Pep: An Empty String ABC
Raymond Hettinger python at rcn.com
Sun Jun 1 08:59:00 CEST 2008
Mini-Pep: An Empty String ABC
Target: Py2.6 and Py3.0
Author: Raymond Hettinger
Proposal
Add a new collections ABC specified as:
    class String(Sequence):
        pass
Motivation
Having an ABC for strings allows string look-alike classes to declare themselves as sequences that contain text. Client code (such as a flatten operation or tree searching tool) may use that ABC to usefully differentiate strings from other sequences (i.e. containers vs containees). And in code that only relies on sequence behavior, isinstance(x,str) may be usefully replaced by isinstance(x,String) so that look-alikes can be substituted in calling code.
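As a rough illustration of the flatten use case, here is a minimal sketch. It assumes the modern import location collections.abc (not the Py2.6 layout), and it assumes the built-in str would be registered with the proposed ABC; the flatten() helper is hypothetical and not part of the proposal.

```python
from collections.abc import Sequence


class String(Sequence):
    """Empty marker ABC, as specified in the proposal."""


# Assumption: the built-in str would be registered with the new ABC.
String.register(str)


def flatten(items):
    """Yield the leaves of a nested sequence, treating Strings as atoms.

    Hypothetical helper: without the String check, a str would be
    descended into character by character, recursing forever.
    """
    for item in items:
        if isinstance(item, Sequence) and not isinstance(item, String):
            yield from flatten(item)
        else:
            yield item


nested = ["ab", ["cd", [1, 2]], (3, "ef")]
flat = list(flatten(nested))
```

Any look-alike class that registers itself as a String would be treated as a containee by the same check, with no change to flatten().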
A natural temptation is to add other methods to the String ABC, but strings are a tough case. Beyond simple sequence manipulation, the string methods get very complex. An ABC that included those methods would make it tough to write a compliant class that could be registered as a String. The split(), rsplit(), partition(), and rpartition() methods are examples of methods that would be difficult to emulate correctly. Also, starting with Py3.0, strings are essentially abstract sequences of code points, meaning that an encode() method is essential to being able to usefully transform them back into concrete data. Unfortunately, the encode method is so complex that it cannot be readily emulated by an aspiring string look-alike.
Besides complexity, another problem with the concrete str API is the extensive number of methods. If string look-alikes were required to emulate the likes of zfill(), ljust(), title(), translate(), join(), etc., it would significantly add to the burden of writing a class complying with the String ABC.
The fundamental problem is that of balancing a client function's desire to rely on a large number of behaviors against the difficulty of writing a compliant look-alike class. For other ABCs, the balance is more easily struck because the behaviors are fewer in number, because they are easier to implement correctly, and because some methods can be provided as mixins. For a String ABC, the balance should lean toward minimalism due to the large number of methods and how difficult it is to implement some of them correctly.
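To show how light the burden is under the minimal design, here is a sketch of a look-alike class. The Rope class and its piece-list storage are hypothetical, and the import again assumes the modern collections.abc location. Only __len__() and __getitem__() are written by hand; __iter__(), __contains__(), __reversed__(), index(), and count() come from the Sequence mixins.

```python
from collections.abc import Sequence


class String(Sequence):
    """Empty marker ABC, as specified in the proposal."""


class Rope(String):
    """Hypothetical string look-alike that stores text as a list of pieces."""

    def __init__(self, *pieces):
        self._pieces = [str(piece) for piece in pieces]

    def __len__(self):
        return sum(len(piece) for piece in self._pieces)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Kept out of scope to keep the sketch short.
            raise TypeError("slicing is not supported in this sketch")
        if index < 0:
            index += len(self)
        if index < 0:
            raise IndexError("Rope index out of range")
        # Walk the pieces until the index falls inside one of them.
        for piece in self._pieces:
            if index < len(piece):
                return piece[index]
            index -= len(piece)
        raise IndexError("Rope index out of range")


rope = Rope("ab", "cd")
```

Client code that only needs sequence behavior can then accept rope wherever it tests isinstance(x, String), without Rope having to emulate split(), encode(), or the rest of the concrete str API.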
A last reason to avoid expanding the String API is that almost none of the candidate methods characterize the notion of "stringiness". With something calling itself an integer, an __add__() method would be expected as it is fundamental to the notion of "integeriness". In contrast, methods like startswith() and title() are non-essential extras -- we would not discount something as being not stringlike if those methods were not present.