Poor performance when running a large statement. sqlalchemy + oracledb · Issue #172 · oracle/python-oracledb (original) (raw)
Hi,
I have the following code:
import pandas as pd from sqlalchemy.orm.session import Session
def compute(engine: Session) -> pd.DataFrame: # some steps # .... df = engine.query(...)
with engine.get_bind().connect() as conn:
df = pd.read_sql(sql=df.statement, con=conn)
return df
- If I compile SQL code from the
df
variable and then run it directly in DBeaver - this statement will complete in 17-20 seconds. And it's okay. Compiled SQL contains 347k characters. - If I run the calculation through
pd.read_sql
, then this code will run for about 16 minutes. I noticed that the delay happens in the_prepare
method (see trace below).
File "/opt/homebrew/Caskroom/miniforge/base/envs/venv/lib/python3.10/site-packages/oracledb/cursor.py", line 137, in _prepare self._impl.prepare(statement, tag, cache_statement) File "src/oracledb/impl/thin/cursor.pyx", line 213, in oracledb.thin_impl.ThinCursorImpl.prepare File "src/oracledb/impl/thin/connection.pyx", line 235, in oracledb.thin_impl.ThinConnImpl._get_statement File "src/oracledb/impl/thin/connection.pyx", line 239, in oracledb.thin_impl.ThinConnImpl._get_statement File "src/oracledb/impl/thin/statement.pyx", line 184, in oracledb.thin_impl.Statement._prepare File "/opt/homebrew/Caskroom/miniforge/base/envs/venv/lib/python3.10/re.py", line 209, in sub return _compile(pattern, flags).sub(repl, string, count)
- At the same time, I can run simpler scripts via
pd.read_sql
and they execute normally. - Also, I build the same script like in
df
variable, but for PostgreSQL. And it run for about 35 seconds.
Issue
- Is this expected behavior when working with large scripts?
- Is there any way I can bypass this
prepare
step?
requirements.txt
Apple M1 macOS Ventura 13.0
python 3.10
SQLAlchemy==2.0.9 numpy==1.23.5 pandas==1.5.2 oracledb==1.3.0
Thank you!