[Python-Dev] Toowtdi: Datatype conversions
Raymond Hettinger python at rcn.com
Sat Jan 3 20:50:13 EST 2004
- Previous message: [Python-Dev] Toowtdi: Datatype conversions
- Next message: [Python-Dev] Toowtdi: Datatype conversions
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
[Raymond Hettinger]
> Which is the one obvious way of turning a dictionary into a list?
[Martin v. Loewis]
There is no obvious way to turn a dictionary into a list; lists and dictionaries are completely different things.
[Guido]
Ugh. I really hope we aren't going to teach people to write list(d) instead of d.keys().
Put another way, which is preferable:
    list(d.iterkeys())    vs.  d.keys()
    list(d.itervalues())  vs.  d.values()
    list(d.iteritems())   vs.  d.items()
[Martin]
Dictionaries are similar to sets; in fact, Smalltalk has an Association
class (Association key: value:), and Dictionary is a set of associations.
Your perceptiveness is uncanny! This issue arose for me while writing an extension module that implements Smalltalk bags which do have meaningful conversions to sets, lists, and dicts:
    >>> list(b)
    ['dog', 'dog', 'cat']
    >>> dict(b.iterwithcounts)
    {'dog':2, 'cat':1}
    >>> set(b)
    set(['dog', 'cat'])
    >>> list(b.iterunique)
    ['dog', 'cat']
The alternative API is:
    bag.asList()
    bag.asDict()
    bag.asSet()
    bag.unique()
I'm trying to decide which API is the cleanest and has reasonable performance.
The former has two fewer methods. The latter has much better performance but won't support casts to subclasses of list/set/dict.
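To make the trade-off concrete, here is a rough pure-Python sketch of the first (iterator-based) API. The `Bag` class below is hypothetical and backed by `collections.Counter`; it is not the extension module described above, and it is written for current Python rather than the 2004 interpreter. It only shows how plain constructors, including subclasses of `list`, can consume the iterators:

```python
from collections import Counter


class Bag:
    """Hypothetical multiset illustrating the iterator-based API."""

    def __init__(self, items=()):
        self._counts = Counter(items)

    def __iter__(self):
        # Yield each element once per occurrence, so list(b) keeps duplicates.
        return self._counts.elements()

    def iterwithcounts(self):
        # (element, count) pairs, in the shape dict() expects.
        return iter(self._counts.items())

    def iterunique(self):
        # Each distinct element exactly once.
        return iter(self._counts)


class MyList(list):
    """Stand-in for a user-defined list subclass."""


b = Bag(['dog', 'cat', 'dog'])
as_list = sorted(b)                    # duplicates preserved
as_dict = dict(b.iterwithcounts())     # element -> count
as_set = set(b)                        # duplicates collapsed
unique = sorted(b.iterunique())
as_subclass = MyList(b)                # conversion to a subclass works too
```

Dedicated methods such as `asList()` could skip the iterator machinery, but they would always return the base types, which is the limitation noted above.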
[Raymond]
> So, one question is whether set() and frozenset() should grow an
> analogue to the keys() method:
[Robert Brewer]
I don't think so, for the reason that .keys() is effectively a disambiguator as I just described. With sets, there is no mapping, and therefore no ambiguity.
[Martin]
But this is a completely different issue! For sets, there is an obvious way: if you accept that the list will have the same elements in arbitrary order, then list(aset) is the most obvious way, and it should work fastest.
Right! Unfortunately, it can never be as fast as a set.elements() method -- the underlying d.keys() method has too many advantages (looping with an in-lined version of PyDict_Next(), knowing the size of the dictionary, writing with PyList_SET_ITEM, and having all steps in-lined with no intervening function calls).
> Another bright idea is to support faster datatype conversion by
> adding an optional __len__() method to the iteration protocol so
> that list(), tuple(), dict(), and set() could allocate sufficient
> space for loading any iterable that knows its own length.
[Martin]
That is useful, also for list comprehension.
> The advantages are faster type conversion (by avoiding resizing),
> keeping the APIs decoupled, and keeping the visible API thin. The
> disadvantage is that it clutters the C code with special case
> handling and that it doesn't work with generators or custom
> iterators (unless they add support for __len__).
[Martin]
I see no reason why it should not work for custom iterators. For generators, you typically don't know how many results you will get in the end, so it is no loss that you cannot specify that.
That makes sense.
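As an illustration of the idea, later Python releases provide a closely related hook: `operator.length_hint()` and the optional `__length_hint__()` method. The sketch below uses that mechanism rather than the `__len__()` proposal itself, and the `Countdown` iterator is a hypothetical example. It shows a custom iterator advertising its remaining length while a generator cannot:

```python
from operator import length_hint


class Countdown:
    """Hypothetical custom iterator that knows how many items remain."""

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining

    def __length_hint__(self):
        # An estimate a consumer may use to presize its storage.
        return self.remaining


def gen(n):
    # Generators carry no length information.
    for i in range(n):
        yield i


hint_custom = length_hint(Countdown(5))   # the iterator reports its size
hint_gen = length_hint(gen(5), -1)        # falls back to the default, -1
items = list(Countdown(3))
```

The hint is advisory only: the consumer still has to iterate to exhaustion, so a wrong or missing hint costs speed, not correctness.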
Looking at the code for list_fill, I see that some length checking is already done but only if the underlying object fills in sq_length. I think that check should be replaced by a call to PyObject_Size().
That leaves a question as to how to best empower the dictionary constructor. If the source has an underlying dictionary (a Bag is a good example), then nothing beats PyDict_Copy(). For sets, that only works if you accept the default value of True.
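At the Python level, the two dictionary cases look roughly like this. This is only an analogy for the C-level PyDict_Copy() shortcut, written for current Python; the variable names are made up for the example:

```python
# A bag-like mapping of element -> count can be copied wholesale.
counts = {'dog': 2, 'cat': 1}
copied = dict(counts)

# A set has no values to copy, so every key must take one fixed
# default; True is used here, as in the message.
pets = {'dog', 'cat'}
from_set = dict.fromkeys(pets, True)
```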
The set constructor has the same issue when the length of the iterable is knowable. The problem is that there is no analogue to PyList_New(n) which returns a presized collection.
On a separate issue, does anyone care that dict.__init__() has update behavior instead of replace behavior like list.__init__()?
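The difference is easy to see by calling `__init__()` a second time on existing objects. A small sketch (behavior as observed in current CPython):

```python
d = {'a': 1}
d.__init__({'b': 2})     # merges into the existing contents, like update()

lst = [1, 2]
lst.__init__([3])        # discards the old contents first
```

After these calls, `d` holds both keys while `lst` holds only the new element.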
Raymond