BUG: Expression xxxx has forbidden control characters - caused by new release of numexpr · Issue #54542 · pandas-dev/pandas

Pandas version checks

Reproducible Example

$ pip install numexpr==2.8.5

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
a = 8
df.query("A == @a", engine="numexpr")

Issue Description

Traceback (most recent call last):
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    df.query("A == @a", engine="numexpr")
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\frame.py", line 4060, in query
    res = self.eval(expr, **kwargs)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\frame.py", line 4191, in eval
    return _eval(expr, inplace=inplace, **kwargs)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\computation\eval.py", line 353, in eval
    ret = eng_inst.evaluate()
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\computation\engines.py", line 80, in evaluate
    res = self._evaluate()
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\computation\engines.py", line 121, in _evaluate
    return ne.evaluate(s, local_dict=scope)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 943, in evaluate
    raise e
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 851, in validate
    _names_cache[expr_key] = getExprNames(ex, context)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 714, in getExprNames
    ex = stringToExpression(text, {}, context)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 274, in stringToExpression
    raise ValueError(f'Expression {s} has forbidden control characters.')
ValueError: Expression (A) == (__pd_eval_local_a) has forbidden control characters.

So this is actually an issue with numexpr release 2.8.5, which went live on Sunday 6th August 2023.

Not sure if this qualifies as a bug over there, but it breaks pandas if you have numexpr==2.8.5 installed.
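As a possible stopgap while the affected release is installed, the numexpr engine can be avoided. The sketch below is only an illustration: `pick_engine` and `installed_numexpr_version` are hypothetical helper names, not pandas or numexpr APIs, and it assumes that only the exact release reported here (2.8.5) is affected.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Assumption: only the release reported in this issue is treated as affected.
AFFECTED_NUMEXPR_VERSIONS = {"2.8.5"}


def installed_numexpr_version() -> Optional[str]:
    """Return the installed numexpr version string, or None if it is not installed."""
    try:
        return version("numexpr")
    except PackageNotFoundError:
        return None


def pick_engine(numexpr_version: Optional[str]) -> str:
    """Choose a value for the `engine` argument of DataFrame.query / DataFrame.eval."""
    if numexpr_version is None or numexpr_version.strip() in AFFECTED_NUMEXPR_VERSIONS:
        # Fall back to the pure-Python engine instead of engine="numexpr".
        return "python"
    return "numexpr"
```

With the DataFrame from the reproducible example, this would be used as `df.query("A == @a", engine=pick_engine(installed_numexpr_version()))`. Pinning numexpr to an earlier release is the other obvious option.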

Expected Behavior

df.query("A == 8", engine="numexpr")

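correctly queries the df and produces a valid response. So the issue is with using @ variables in the query: pandas rewrites them to dunder-prefixed names such as __pd_eval_local_a (visible in the traceback), and those seem to be what numexpr 2.8.5 rejects. I guess it may manifest elsewhere too.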
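To make the suspected cause concrete, here is a minimal sketch of a validation step that rejects any expression containing a double underscore. The pattern and the name `looks_forbidden` are hypothetical and chosen for illustration; numexpr's actual check may differ. The first expression string is taken from the traceback, and the second is an assumed form for the literal query.

```python
import re

# Hypothetical pattern for illustration only; numexpr's real check may differ.
DUNDER = re.compile(r"__")


def looks_forbidden(expression: str) -> bool:
    """Return True if the expression contains a double underscore."""
    return DUNDER.search(expression) is not None


# Expression reported in the traceback for df.query("A == @a").
with_local = "(A) == (__pd_eval_local_a)"
# Assumed form for df.query("A == 8"), which has no @ variable.
with_literal = "(A) == (8)"

print(looks_forbidden(with_local))    # True
print(looks_forbidden(with_literal))  # False
```

Under this reading, any query that references a local variable with `@` would be rejected, while queries built only from column names and literals would pass.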

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 66e3805
python           : 3.9.12.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.19044
machine          : AMD64
processor        : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : English_United Kingdom.1252
pandas           : 1.3.5
numpy            : 1.24.3
pytz             : 2021.3
dateutil         : 2.8.2
pip              : 21.2.4
setuptools       : 61.2.0
Cython           : 3.0.0
pytest           : None
hypothesis       : None
sphinx           : 6.1.3
blosc            : None
feather          : None
xlsxwriter       : 3.0.3
lxml.etree       : 4.9.1
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 3.1.2
IPython          : 8.7.0
pandas_datareader: 0.10.0
bs4              : 4.11.1
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.6.2
numexpr          : 2.8.5
odfpy            : None
openpyxl         : 3.0.10
pandas_gbq       : None
pyarrow          : 11.0.0
pyxlsb           : None
s3fs             : None
scipy            : 1.9.3
sqlalchemy       : None
tables           : None
tabulate         : 0.9.0
xarray           : None
xlrd             : 2.0.1
xlwt             : None
numba            : None