ENH: fix eval scoping issues by cpcloud · Pull Request #6366 · pandas-dev/pandas
Relevant user-facing changes:
In [9]: df = DataFrame(randn(5, 2), columns=list('ab'))
In [10]: a, b = 1, 2
In [11]: # column names take precedence
In [12]: df.query('a < b')
Out[12]:
          a         b
0 -0.805549 -0.090572
1 -1.782325 -1.594079
2 -0.984364  0.934457
3 -1.963798  1.122112
[4 rows x 2 columns]
In [13]: # we must use @ whenever we want a local variable
In [14]: df.query('@a < b')
Out[14]:
          a         b
3 -1.963798  1.122112
[1 rows x 2 columns]
In [15]: # we cannot use @ in eval calls
In [16]: pd.eval('@a + b')
File "<string>", line unknown
SyntaxError: The '@' prefix is not allowed in top-level eval calls, please refer to your variables by name without the '@' prefix
In [17]: pd.eval('@a + b', parser='python')
File "<string>", line unknown
SyntaxError: The '@' prefix is only supported by the pandas parser
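
The rules shown in the session above can be modelled with a small standard-library sketch. This is not the code from this PR: `find_local_refs`, `resolve`, and `check_top_level` are hypothetical helpers, and plain dictionaries stand in for the frame's columns and the caller's local variables. It only illustrates the lookup order (a bare name prefers a column, `@name` reads a local) and the rejection of `@` where there is no frame to disambiguate against.

```python
import io
import tokenize


def find_local_refs(expr):
    """Return the names that directly follow an '@' token in expr."""
    tokens = list(tokenize.generate_tokens(io.StringIO(expr).readline))
    names = []
    for prev, tok in zip(tokens, tokens[1:]):
        is_at = prev.type == tokenize.OP and prev.string == "@"
        if is_at and tok.type == tokenize.NAME:
            names.append(tok.string)
    return names


def resolve(name, columns, local_vars, is_local=False):
    """Toy lookup: '@name' reads only locals; a bare name prefers columns."""
    if is_local:
        return local_vars[name]
    if name in columns:
        return columns[name]
    # Fall back to locals only when no column has this name.
    return local_vars[name]


def check_top_level(expr):
    """Reject '@' in an expression that is not evaluated against a frame."""
    if find_local_refs(expr):
        raise SyntaxError(
            "The '@' prefix is not allowed in top-level eval calls"
        )


# Stand-ins (assumptions): columns and locals that share the names a and b.
columns = {"a": "column a", "b": "column b"}
local_vars = {"a": 1, "b": 2}

print(resolve("a", columns, local_vars))                 # the column wins
print(resolve("a", columns, local_vars, is_local=True))  # the local is read
print(find_local_refs("@a < b"))
```

Scanning tokens rather than searching the string for `@` keeps an `@` inside a quoted string literal from being treated as a local reference.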
- update query/eval docstrings/indexing/perfenhancing
- make sure docs build
- release notes
- pytables
- make the `repr` of `Scope` objects work or revert to previous version
- more tests for new local variable scoping API
- disallow (and provide a useful error message for) locals in expressions like `pd.eval('@a + b')`
- Raise when your variables have the same name as the builtin math functions that numexpr supports, since you cannot override them in `numexpr.evaluate`, even when explicitly passing them. For example

      import numexpr as ne
      from numpy import array
      from numpy.random import randn

      sin = randn(10)
      d = {'sin': sin}
      # the array bound to 'sin' does not override numexpr's builtin sin
      result = ne.evaluate('sin > 1', local_dict=d, global_dict=d)
      result == array(True)
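
A check of the kind described in the last item could look roughly like the sketch below. It is a guess at the shape, not the implementation in this PR: `NUMEXPR_MATH_NAMES` is an illustrative subset of the function names numexpr recognizes, and `NameClashError` and `check_builtin_clash` are hypothetical names.

```python
# Illustrative subset only (assumption); numexpr supports more functions.
NUMEXPR_MATH_NAMES = frozenset(
    {"sin", "cos", "tan", "exp", "log", "sqrt", "abs"}
)


class NameClashError(NameError):
    """Hypothetical error for variables that shadow a numexpr function."""


def check_builtin_clash(variable_names):
    """Raise if any variable name matches a numexpr math function name."""
    clashes = NUMEXPR_MATH_NAMES.intersection(variable_names)
    if clashes:
        listed = ", ".join(sorted(clashes))
        raise NameClashError(
            f"Variables in expression overlap with numexpr builtins: {listed}"
        )


check_builtin_clash(["a", "b"])  # passes silently

try:
    check_builtin_clash(["sin", "a"])
except NameClashError as err:
    print(err)
```

Failing before the expression reaches the engine gives the user a clear message instead of a silently wrong result.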
For reference, after this PR local variables are given lower precedence than column names. For example

    a, b = 1, 2
    df = DataFrame(randn(10, 2), columns=list('ab'))
    res = df.query('a > b')

will no longer raise an exception about overlapping variable names. If you want the local a (as opposed to the column a) you must do