DEPR: pandas.io.sql.execute · Issue #50185 · pandas-dev/pandas (original) (raw)

There is currently a bug in pandas.io.sql.execute on the main branch (not in any released versions). I created the bug in #49531 while working on #48576. In #49531, I changed SQLDatabase to only accept a Connection and not an Engine, because sqlalchemy 2.0 restricts the methods that are available to Engine. #49531 created bugs in both read_sql and pandas.io.sql.execute, and I have an open PR to fix read_sql in #49967.

This reproduces the bug in pandas.io.sql.execute:

from pathlib import Path
from sqlalchemy import create_engine
from pandas import DataFrame
from pandas.io import sql

db_file = Path(r"C:\Temp\test.db")
db_file.unlink(missing_ok=True)
db_uri = "sqlite:///" + str(db_file)
engine = create_engine(db_uri)
DataFrame({'a': [2, 4], 'b': [3, 6]}).to_sql('test_table', engine)
sql.execute('select * from test_table', engine).fetchall()

Traceback when running on the main branch in pandas:

Error closing cursor
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\cursor.py", line 991, in fetchall
    rows = dbapi_cursor.fetchall()
sqlite3.ProgrammingError: Cannot operate on a closed database.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\base.py", line 1995, in _safe_close_cursor
    cursor.close()
sqlite3.ProgrammingError: Cannot operate on a closed database.
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\cursor.py", line 991, in fetchall
    rows = dbapi_cursor.fetchall()
sqlite3.ProgrammingError: Cannot operate on a closed database.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Github\pandas\execute_error.py", line 11, in <module>
    sql.execute('select * from test_table', engine).fetchall()
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\result.py", line 1072, in fetchall
    return self._allrows()
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\result.py", line 401, in _allrows
    rows = self._fetchall_impl()
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\cursor.py", line 1819, in _fetchall_impl
    return self.cursor_strategy.fetchall(self, self.cursor)
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\cursor.py", line 995, in fetchall
    self.handle_exception(result, dbapi_cursor, e)
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\cursor.py", line 955, in handle_exception
    result.connection._handle_dbapi_exception(
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\base.py", line 2124, in _handle_dbapi_exception
    util.raise_(
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\util\compat.py", line 211, in raise_
    raise exception
  File "C:\Program Files\Python310\lib\site-packages\sqlalchemy\engine\cursor.py", line 991, in fetchall
    rows = dbapi_cursor.fetchall()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.
(Background on this error at: https://sqlalche.me/e/14/f405)

When running on pandas 1.5.2, there is no error.

After #49967, SQLDatabase accepts str, Engine, and Connection, but in SQLDatabase.__init__, a Connection is created if it was not passed in, and an ExitStack is created to clean up the Connection and Engine if necessary. Cleanup happens either in SQLDatabase.__exit__ or at the end of the generator which is returned by read_sql if chunksize is not None. What I'm not able to see is how to do the cleanup after the caller fetches the results of pandas.io.sql.execute. I have come up with two options so far: