BUG: Make dict iterators real iterators, provide "next()" in Python 2 by toobaz · Pull Request #12299 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation12 Commits2 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

toobaz

Currently, compat.itervalues and friends are not real iterators (under Python 3): compare

In [3]: next(six.itervalues({1:2}))
Out[3]: 2

with

In [4]: next(pd.compat.itervalues({1:2}))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-177f5d3e9571> in <module>()
----> 1 next(pd.compat.itervalues({1:2}))

TypeError: 'dict_values' object is not an iterator

This PR fixes this (it drops support for **kwargs: although it is supported by six, I fail to see what's the utility, and it's anyway apparently not used in the pandas codebase - but if I'm missing something, it is trivial to reintroduce it) and provides next() in Python 2.

@jreback

Why don't we just use the six code almost directly?

if PY3:
    def iterkeys(d, **kw):
        return iter(d.keys(**kw))

    def itervalues(d, **kw):
        return iter(d.values(**kw))

    def iteritems(d, **kw):
        return iter(d.items(**kw))

    def iterlists(d, **kw):
        return iter(d.lists(**kw))

    viewkeys = operator.methodcaller("keys")

    viewvalues = operator.methodcaller("values")

    viewitems = operator.methodcaller("items")
else:
    def iterkeys(d, **kw):
        return d.iterkeys(**kw)

    def itervalues(d, **kw):
        return d.itervalues(**kw)

    def iteritems(d, **kw):
        return d.iteritems(**kw)

    def iterlists(d, **kw):
        return d.iterlists(**kw)

    viewkeys = operator.methodcaller("viewkeys")

    viewvalues = operator.methodcaller("viewvalues")

    viewitems = operator.methodcaller("viewitems")

@toobaz

Makes sense, except that you could not use anymore compat.iteritems(NDFrame), given that NDFrame.iter is not callable (while the six Python 3 version expects it to be so, since dict.iter is).

But still the right thing to do is probably to (also) change each compat.iteritems(NDFrame) to iter(NDFrame.iteritems()). There aren't tons of them.

(then, I still fail to understand the **kw thing, but never mind)

@jreback

yeh the compat was done a long time ago, its possible the actual six code was changed since then (to be better) and we have the original. ok with these kinds of changes (e.g. they do a slightly better check for next compat)

@kawochen

the keyword argument passing is needed to generate the correct error messages.

@toobaz

... I still don't understand: which error messages?

@kawochen

when you pass a keyword argument, the real function gets to tell you that it doesn't need any. the other error messages still are wrong tho. but these things are not super important

@toobaz

@toobaz

Above, I clearly made a mistake writing NDFrame.iter rather than NDFrame.items, and anyway it is not NDFrame, but only Series and DataFrame.

That said, I ended up

What I didn't do but may be worth doing in the medium/long run:

It is true that one driver of the decision would be to avoid problems with features (Panel, in which items is an axis) which will be probably deprecated... but I don't see any downside (apart from a minor one: we are sticking to Python 2-sounding method names).

@toobaz

Is anything expected from me here?

@jreback

write out all of the iter methods there strongly a couple instead of evaluating them at run time

@toobaz

@toobaz

@toobaz

@jreback