Refactor sqlalchemy code in pandas.io.sql to help prepare for sqlalchemy 2.0. by cdcadman · Pull Request #49531 · pandas-dev/pandas

I am splitting this out of #48576 because it is a major refactor of the code, with the goal of making SQLDatabase accept only a Connection and not an Engine. sqlalchemy 2.0 restricts the methods available on Engine and Connection, which makes it harder to write code that works with both. For example, Connection.connect() creates a branched connection in sqlalchemy 1.x and is removed in 2.0, yet SQLDatabase.check_case_sensitive() still calls it.
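To illustrate the idea of working only with a Connection internally, here is a minimal sketch of one way to normalize either input. The helper name `as_connection` is hypothetical and is not taken from this PR; the sketch assumes only that an Engine should be turned into a Connection that is closed on exit, while a Connection passed in by the caller is left open.

```python
from contextlib import ExitStack, contextmanager

from sqlalchemy import create_engine
from sqlalchemy.engine import Connection, Engine


@contextmanager
def as_connection(con):
    """Yield a Connection for either an Engine or a Connection (hypothetical helper)."""
    with ExitStack() as stack:
        if isinstance(con, Engine):
            # A connection opened here is registered on the stack,
            # so it is closed when the block exits.
            con = stack.enter_context(con.connect())
        # A Connection supplied by the caller is yielded unchanged
        # and is never closed by this helper.
        yield con


engine = create_engine("sqlite:///:memory:")

# Engine in, Connection out.
with as_connection(engine) as con:
    opened_from_engine = isinstance(con, Connection)

# Connection in, same Connection out, still open afterwards.
with engine.connect() as outer:
    with as_connection(outer) as con:
        same_object = con is outer
    still_open = not outer.closed
```

Because the rest of the code then only ever sees a Connection, it does not need Connection.connect() or any other method that differs between Engine and Connection.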

I also added some clarification on how transactions work in DataFrame.to_sql, based on this example, run against pandas 1.5.1:

import sqlite3
from pandas import DataFrame
from sqlalchemy import create_engine

with sqlite3.connect(":memory:") as con:
    con.execute("create table test (A integer, B integer)")
    row_count = con.execute("insert into test values (2, 4), (5, 10)").rowcount
    if row_count > 1:
        con.rollback()
    print(con.execute("select count(*) from test").fetchall()[0][0]) # prints 0

with sqlite3.connect(":memory:") as con:
    con.execute("create table test (A integer, B integer)")
    row_count = DataFrame({'A': [2, 5], 'B': [4, 10]}).to_sql('test', con, if_exists='append', index=False)
    if row_count > 1:
        con.rollback() # does nothing, because pandas already committed the transaction.
    print(con.execute("select count(*) from test").fetchall()[0][0]) # prints 2
    
with create_engine("sqlite:///:memory:").connect() as con:
    with con.begin():
        con.exec_driver_sql("create table test (A integer, B integer)")
    try:
        with con.begin(): # the caller opens the transaction around to_sql
            row_count = DataFrame({'A': [2, 5], 'B': [4, 10]}).to_sql('test', con, if_exists='append', index=False)
            assert row_count < 2 # fails, so con.begin() rolls back the insert
    except AssertionError:
        pass
    # exec_driver_sql accepts a raw SQL string in both sqlalchemy 1.4 and 2.0
    print(con.exec_driver_sql("select count(*) from test").fetchall()[0][0]) # prints 0