ENH/API: Change query/eval local variable API · Issue #5987 · pandas-dev/pandas
Currently, with query and eval you can use local variables a la the @ symbol. It's a bit confusing: you're not allowed to have a local variable and a column with the same name, yet a bare name that matches no column will be pulled from the local scope if possible.
Current API:
Fails with a NameError:
from numpy.random import randn
from pandas import DataFrame

a = 1
df = DataFrame({'a': randn(10), 'b': randn(10)})
df.query('a > b')
But this works:
And so does this, which is confusing:
a = 1
df = DataFrame({'b': randn(10), 'c': randn(10)})
df.query('a < b < c')
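To make the confusing part concrete, here is a rough model of how a bare name appears to be resolved today, inferred only from the two examples above. The function `resolve_current` and its return format are hypothetical and are not pandas internals.

```python
def resolve_current(name, columns, local_vars):
    """Hypothetical model of the current lookup for a bare name in a query string."""
    in_columns = name in columns
    in_locals = name in local_vars
    if in_columns and in_locals:
        # A column and a local sharing a name is rejected outright.
        raise NameError(f"name {name!r} is both a column and a local variable")
    if in_columns:
        return ("column", name)
    if in_locals:
        # Silent fallback to the enclosing scope: this is the surprising case.
        return ("local", local_vars[name])
    raise NameError(f"name {name!r} is not defined")
```

With columns 'b' and 'c' and a local a = 1, the bare name 'a' quietly resolves to the local, which is why 'a < b < c' works even though the frame has no column 'a'.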
As suggested by @y-p and @jreback, the following API is less confusing IMO.
From now on, all local variables will need an explicit reference with @, and if there is a column and a local with the same name then a bare name will refer to the column. Thus a bare name always refers to a column, or you get an error if that column doesn't exist. And if you use @ then you can be sure that you're referring to a local, and likewise get an error if it doesn't exist. As a bonus (🐺 in 🐑's clothing), this allows you to use both a local and a column with the same name in one expression.
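The proposed rule boils down to a two-way lookup with no fallback between scopes. A minimal sketch, assuming a hypothetical helper `resolve_proposed` that takes one token from the query string (this is not the pandas implementation, and the choice of NameError for both failure cases is an assumption):

```python
def resolve_proposed(token, columns, local_vars):
    """Hypothetical model of the proposed lookup for one name in a query string."""
    if token.startswith("@"):
        # '@name' may only ever refer to a local variable.
        name = token[1:]
        if name not in local_vars:
            raise NameError(f"local variable {name!r} is not defined")
        return ("local", local_vars[name])
    # A bare name may only ever refer to a column; locals are never consulted.
    if token not in columns:
        raise NameError(f"column {token!r} does not exist")
    return ("column", token)


columns = {"a", "b"}
local_vars = {"a": 1, "c": 1}

print(resolve_proposed("a", columns, local_vars))   # bare name: the column
print(resolve_proposed("@a", columns, local_vars))  # '@' prefix: the local
```

Because the two namespaces never overlap, a column 'a' and a local a can coexist, and a bare 'c' fails even though a local c is defined.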
Examples:
a = 1
df = DataFrame({'a': randn(10), 'b': randn(10)})

# uses the column 'a'
df.query('a > b')

# uses the local
df.query('@a > b')

# fails because I didn't reference the local and there's no 'c' column
c = 1
df.query('a > c')

# local and a column name
df.query('b < @a < a')