[Python-Dev] Iterable String Redux (aka String ABC)
Mike Klaas mike.klaas at gmail.com
Thu May 29 04:59:02 CEST 2008
On 28-May-08, at 5:44 PM, Greg Ewing wrote:
> Mike Klaas wrote:
>> In my perfect world, strings would be indexable and sliceable, but
>> not iterable.
>
> An object that was indexable but not iterable would be a very
> strange thing. If it has __len__ and __getitem__, there's nothing
> to stop you iterating over it by hand anyway, so disallowing
> __iter__ would just seem perverse.
Python has a beautiful abstraction in iteration: iter() is a generic
function that allows you to lazily consume a sequence of objects, whether
it be lists, tuples, custom iterators, generators, or what have you.
It is trivial to write your code to be agnostic to the type of
iterable passed in. Almost anything else a consumer of your code
passes in will result in an immediate exception.
Unfortunately, Python has two extremely common data types (str and
unicode) which do not fail when this generic function is applied to
them, and which instead almost always return a result that is not
desired: iteration proceeds over the characters of the string, a
behaviour which is rarely needed in practice due to the wealth of
string methods available.
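To illustrate the point, here is a minimal sketch of a type-agnostic consumer. The helper name normalize is hypothetical and not taken from any real code base; the sketch is written for Python 3. It shows a non-iterable argument failing right away while a bare string is silently accepted.

```python
def normalize(names):
    """Upper-case every name in an iterable of names (hypothetical helper)."""
    return [name.upper() for name in names]

# Lists and generators are handled identically.
from_list = normalize(["spam", "eggs"])
from_generator = normalize(n for n in ("spam", "eggs"))

# A non-iterable argument raises TypeError as soon as iteration starts.
try:
    normalize(42)
    failed_fast = False
except TypeError:
    failed_fast = True

# A bare string raises nothing: it is iterated character by character.
from_string = normalize("spam")
```

The last call most likely does not do what the caller intended, yet nothing signals the mistake.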
I agree that it would be perverse to disallow iterating over a
string. I just wish that the way to do that wasn't glommed on to the
object-iteration abstraction.
As it stands, any consumer of iterables has to keep strings in mind.
It is particularly irksome when the target input is an iterable of
strings. I recall a function that accepts a list/iterable of item
keys, hashes them, and then retrieves values based on the item hashes
(usually over the network, so it is necessary to batch requests).
This function is often used in the interactive interpreter, and it is
thus very prone to being passed a string rather than a list. There
was no good way to prevent the (frequent) mysterious "not found"
errors save adding an explicit type check for basestring.
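A rough sketch of the kind of guard described above follows. The function name key_hashes, the choice of SHA-1, and the error message are all assumptions made for illustration, not the actual code in question. It is written for Python 3, where str (and bytes) stand in for Python 2's basestring.

```python
import hashlib

def key_hashes(keys):
    """Hash each key in an iterable of string keys (hypothetical helper)."""
    # Explicit type check: a bare string here is almost certainly a
    # mistake, so reject it instead of hashing its individual characters.
    if isinstance(keys, (str, bytes)):
        raise TypeError("expected an iterable of keys, got a single string")
    return [hashlib.sha1(key.encode("utf-8")).hexdigest() for key in keys]

# Intended use: an iterable of keys.
hashes = key_hashes(["user:1", "user:2"])

# The easy mistake at the interactive prompt: passing one key directly.
try:
    key_hashes("user:1")
    rejected = False
except TypeError:
    rejected = True
```

Without the isinstance check, the second call would return six hashes, one per character, and every lookup built on them would come back "not found".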
String already behaves slightly differently from the way other
sequences act: It is the only sequence for which 'seq in seq' is
true, and the only sequence for which 'x in seq' can be true but
'any(x==item for item in seq)' is false. Abstractions are sometimes
imperfect: this is why there is an explicit typecheck for strings in
the sum() builtin.
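The differences mentioned above can be checked directly. This is a small sketch for Python 3; the variable names are arbitrary.

```python
text = "abc"
numbers = [1, 2, 3]

# A string contains itself; a list does not contain itself.
string_in_itself = text in text
list_in_itself = numbers in numbers

# Substring membership succeeds even though no single item equals "ab".
substring_found = "ab" in text
any_item_equal = any("ab" == item for item in text)

# sum() refuses to concatenate strings, even with a string start value.
try:
    sum(["a", "b"], "")
    sum_rejected = False
except TypeError:
    sum_rejected = True
```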
I'll stop here as I realize that the likelihood that this will be
accepted is terribly small, especially considering the late stage of
the process. But I would be willing to develop a patch that
implements this behaviour on the off chance it is.
-Mike