Bug in generate CREATE TABLE SQL for DataFrame with MultiIndex with different types · Issue #8021 · pandas-dev/pandas (original) (raw)

When using SQLAlchemy to create a table from a dataframe, Pandas messes up the types of the index columns. For example:

import sqlalchemy engine = sqlalchemy.create_engine('sqlite:///:memory:')

import pandas as pd from pandas.io.sql import PandasSQLTable, PandasSQLAlchemy df = pd.DataFrame.from_records([{'a':1,'b':2.1,'c':'txt1','a':2,'b':1.5,'c':'txt2'}], index=['a','b']) o = PandasSQLTable('test_df',PandasSQLAlchemy(engine),frame=df) print o.sql_schema()

outputs

CREATE TABLE test_df (
    a FLOAT, 
    b BIGINT, 
    c TEXT
)

when a should be BIGINT and b should be FLOAT.

This occurs because the index columns are enumerated backwards in PandasSQLTable._create_table_statement in pandas.io.sql but the types don't take into account this backward iteration:

    if self.index is not None:
        for i, idx_label in enumerate(self.index[::-1]):
            idx_type = self._sqlalchemy_type(
                self.frame.index.get_level_values(i))
            columns.insert(0, Column(idx_label, idx_type, index=True))

I will make a pull request with a fix for this issue. Versions appended below:

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-358.6.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.14.1-200-g534784b
nose: 1.3.0
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.4
bottleneck: None
tables: None
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: None
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: 2.4.2
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None