ENH: fix eval scoping issues by cpcloud · Pull Request #6366 · pandas-dev/pandas

closes #5987
closes #5087

Relevant user-facing changes:

In [9]: df = DataFrame(randn(5, 2), columns=list('ab'))

In [10]: a, b = 1, 2

In [11]: # column names take precedence

In [12]: df.query('a < b')
Out[12]:
          a         b
0 -0.805549 -0.090572
1 -1.782325 -1.594079
2 -0.984364  0.934457
3 -1.963798  1.122112

[4 rows x 2 columns]

In [13]: # we must use @ whenever we want a local variable

In [14]: df.query('@a < b')
Out[14]:
          a         b
3 -1.963798  1.122112

[1 rows x 2 columns]

In [15]: # we cannot use @ in eval calls

In [16]: pd.eval('@a + b')
  File "<string>", line unknown
SyntaxError: The '@' prefix is not allowed in top-level eval calls, please refer to your variables by name without the '@' prefix

In [17]: pd.eval('@a + b', parser='python')
  File "<string>", line unknown
SyntaxError: The '@' prefix is only supported by the pandas parser
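As the first error message says, a top-level `pd.eval` call refers to local variables by their plain names. A minimal sketch of both cases, assuming a current pandas install; the names `total` and `top_level_error` are illustrative and not part of this PR:

```python
import pandas as pd

a, b = 1, 2

# In a top-level eval call, locals are referenced by bare name.
total = pd.eval("a + b")

# The '@' prefix is rejected here, as in the session above.
top_level_error = None
try:
    pd.eval("@a + b")
except SyntaxError as exc:
    top_level_error = str(exc)
```

The `@` prefix is only needed where a name could also be a column, which is why it is limited to `query` and `DataFrame.eval` style calls.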

import numexpr as ne
sin = randn(10)
d = {'sin': sin}
result = ne.evaluate('sin > 1', local_dict=d, global_dict=d)
result == array(True)

For reference, after this PR local variables have lower precedence than column names. For example,

a, b = 1, 2
df = DataFrame(randn(10, 2), columns=list('ab'))
res = df.query('a > b')

will no longer raise an exception about overlapping variable names. If you want the local a (as opposed to the column a) you must do
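Judging from the `@` behaviour shown in the session above, that presumably means prefixing the name with `@`. A minimal sketch, assuming current pandas and NumPy; `res_cols` and `res_local` are illustrative names only:

```python
import numpy as np
import pandas as pd

a, b = 1, 2
df = pd.DataFrame(np.random.randn(10, 2), columns=list("ab"))

# Bare names resolve to columns first: column a compared with column b.
res_cols = df.query("a > b")

# '@a' refers to the local variable a (here 1), compared with column b.
res_local = df.query("@a > b")
```

Both calls return a filtered DataFrame; only the meaning of `a` differs between them.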