BUG: DataFrame constructor ignores copy=True argument if dtype is set · Issue #9099 · pandas-dev/pandas
I just noticed that the DataFrame constructor ignores the copy=True argument if dtype is set. In the code snippet below, the orig DataFrame should stay unmodified after any modification of new1 and new2. Instead, the columns of new2 (or at least the first one, as shown in the snippet) reference the same data as orig, as shown by the modification in statement 13 and the output that follows it.
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-25-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.2
nose: 1.3.4
Cython: 0.21.1
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.4.3
IPython: 2.3.1
sphinx: 1.1.2
patsy: 0.3.0
dateutil: 2.3
pytz: 2014.10
bottleneck: 0.6.0
tables: None
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: 0.7.4
apiclient: None
rpy2: None
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None
In [4]: orig_data = {
   ...:     'col1': [1.],
   ...:     'col2': [2.],
   ...:     'col3': [3.],}
In [5]: orig = pd.DataFrame(orig_data)
In [6]: new1 = pd.DataFrame(orig, copy=True)
In [7]: new2 = pd.DataFrame(orig, dtype=float, copy=True)
In [8]: new1
Out[8]:
   col1  col2  col3
0     1     2     3
In [9]: new2
Out[9]:
   col1  col2  col3
0     1     2     3
In [10]: new1['col1'] = 100.
In [11]: new1
Out[11]:
   col1  col2  col3
0   100     2     3
In [12]: orig
Out[12]:
   col1  col2  col3
0     1     2     3
In [13]: new2['col1'] = 200.
In [14]: new2
Out[14]:
   col1  col2  col3
0   200     2     3
In [15]: orig
Out[15]:
   col1  col2  col3
0   200     2     3
In [16]:
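The aliasing can also be checked directly instead of mutating a frame and printing the original. The following is only a minimal sketch: shares_data is a hypothetical helper name (not part of pandas), and np.may_share_memory is a bounds-based check, so it is a reasonable indicator here rather than a proof. The last frame, new3, shows one possible workaround, assuming an explicit .copy() after the dtype conversion is acceptable. What the script prints for new2 depends on the installed pandas version.

```python
import numpy as np
import pandas as pd


def shares_data(a, b):
    """Hypothetical helper: True if any same-named column of a and b
    appears to be backed by overlapping memory."""
    return any(
        np.may_share_memory(a[col].values, b[col].values)
        for col in a.columns
    )


orig = pd.DataFrame({'col1': [1.], 'col2': [2.], 'col3': [3.]})

# Same constructions as in the session above.
new1 = pd.DataFrame(orig, copy=True)
new2 = pd.DataFrame(orig, dtype=float, copy=True)

# Possible workaround: convert the dtype first, then copy explicitly.
new3 = orig.astype(float).copy()

print('new1 shares data with orig:', shares_data(orig, new1))
print('new2 shares data with orig:', shares_data(orig, new2))
print('new3 shares data with orig:', shares_data(orig, new3))
```

If new2 is reported as sharing data with orig, the constructor did not honor copy=True for that call; new3 should be independent either way, because .copy() makes a deep copy by default.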