BUG: DataFrame constructor ignores copy=True argument if dtype is set · Issue #9099 · pandas-dev/pandas

I just noticed that the DataFrame constructor ignores the copy=True argument if dtype is set. In the code snippet below, orig should stay unmodified after any modification of new1 or new2. Instead, the columns of new2 (or at least the first one, as shown in the snippet) reference the same data as orig: the assignment to new2 in statement 13 also changes orig, as the output of statement 15 shows.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-25-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.2
nose: 1.3.4
Cython: 0.21.1
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.4.3
IPython: 2.3.1
sphinx: 1.1.2
patsy: 0.3.0
dateutil: 2.3
pytz: 2014.10
bottleneck: 0.6.0
tables: None
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: 0.7.4
apiclient: None
rpy2: None
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None

In [4]: orig_data = {
   ...:     'col1': [1.],
   ...:     'col2': [2.],
   ...:     'col3': [3.],}

In [5]: orig = pd.DataFrame(orig_data)

In [6]: new1 = pd.DataFrame(orig, copy=True)

In [7]: new2 = pd.DataFrame(orig, dtype=float, copy=True)

In [8]: new1
Out[8]:
   col1  col2  col3
0     1     2     3

In [9]: new2
Out[9]:
   col1  col2  col3
0     1     2     3

In [10]: new1['col1'] = 100.

In [11]: new1
Out[11]:
   col1  col2  col3
0   100     2     3

In [12]: orig
Out[12]:
   col1  col2  col3
0     1     2     3

In [13]: new2['col1'] = 200.

In [14]: new2
Out[14]:
   col1  col2  col3
0   200     2     3

In [15]: orig
Out[15]:
   col1  col2  col3
0   200     2     3

In [16]:
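
For anyone who wants to check this without mutating the frames, here is a minimal sketch that tests directly whether a column of the new frame is backed by the same memory as orig. The helper name shares_column_memory is made up for this example and is not a pandas API. It assumes a NumPy release that provides np.shares_memory, which is newer than the 1.9.1 listed above. On a pandas version where copy=True is honored I would expect both lists to print empty. On the setup reported here, new2 would presumably list at least col1.

```python
import numpy as np
import pandas as pd


def shares_column_memory(a, b, column):
    """Return True if `column` in both frames is backed by overlapping memory.

    Hypothetical helper for this report, not part of pandas.
    """
    return np.shares_memory(a[column].values, b[column].values)


orig = pd.DataFrame({'col1': [1.], 'col2': [2.], 'col3': [3.]})

# Same two constructor calls as in statements 6 and 7 of the session.
new1 = pd.DataFrame(orig, copy=True)
new2 = pd.DataFrame(orig, dtype=float, copy=True)

for name, frame in [('new1', new1), ('new2', new2)]:
    shared = [c for c in orig.columns if shares_column_memory(orig, frame, c)]
    print(name, 'shares these columns with orig:', shared)
```

Checking the buffers rather than assigning a value and inspecting orig avoids depending on whether a column assignment writes in place, which may differ between pandas versions.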