DataFrame.fillna() working on row vector instead of column vector? · Issue #15522 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

df.head(5)
                       time   id    bid  bid_depth  bid_depth_total
0 2017-02-27 11:34:31+00:00  105  148.0      497.0         216589.0
1 2017-02-27 11:34:35+00:00  105    NaN        NaN              NaN
2 2017-02-27 11:34:38+00:00  105    NaN        NaN              NaN
3 2017-02-27 11:34:40+00:00  105    NaN        NaN              NaN
4 2017-02-27 11:34:41+00:00  105    NaN        NaN              NaN

   bid_number  offer  offer_depth  offer_depth_total  offer_number   open
0       243.0  148.1      14192.0           530373.0         503.0  147.5
1         NaN    NaN      14272.0           530453.0         504.0    NaN
2         NaN    NaN      14192.0           530373.0         503.0    NaN
3         NaN    NaN      14272.0           530453.0         504.0    NaN
4         NaN    NaN      14492.0           530673.0         505.0    NaN

    high    low   last  change  change_percent     volume        value  trades
0  148.2  147.3  148.0     0.9            0.61  1286830.0  190224000.0  2112.0
1    NaN    NaN    NaN     NaN             NaN        NaN          NaN     NaN
2    NaN    NaN    NaN     NaN             NaN        NaN          NaN     NaN
3    NaN    NaN    NaN     NaN             NaN        NaN          NaN     NaN
4    NaN    NaN    NaN     NaN             NaN        NaN          NaN     NaN

df.fillna(method='pad')
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python3.6/site-packages/pandas/core/frame.py", line 2842, in fillna
    downcast=downcast, **kwargs)
  File "/usr/lib/python3.6/site-packages/pandas/core/generic.py", line 3250, in fillna
    downcast=downcast)
  File "/usr/lib/python3.6/site-packages/pandas/core/internals.py", line 3177, in interpolate
    return self.apply('interpolate', **kwargs)
  File "/usr/lib/python3.6/site-packages/pandas/core/internals.py", line 3056, in apply
    applied = getattr(b, f)(**kwargs)
  File "/usr/lib/python3.6/site-packages/pandas/core/internals.py", line 917, in interpolate
    downcast=downcast, mgr=mgr)
  File "/usr/lib/python3.6/site-packages/pandas/core/internals.py", line 956, in _interpolate_with_fill
    values = self._try_coerce_result(values)
  File "/usr/lib/python3.6/site-packages/pandas/core/internals.py", line 2448, in _try_coerce_result
    result = result.reshape(len(result))
ValueError: cannot reshape array of size 24311 into shape (1,)

Problem description

msgpack of dataframe for replication:
https://www.dropbox.com/s/5skf6v8x2vg103o/dataframe?dl=0

I'm a beginner, so I can only guess at what is wrong, but fillna seems to be working on rows instead of columns. If I loop through df.columns and fill each Series individually, I get the expected output, so the problem does not appear to lie in any single column.
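The per-column workaround described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: the small frame is a hypothetical stand-in for the linked msgpack data (column names borrowed from the output above, including a tz-aware `time` column), and `Series.ffill()` is used as the equivalent of `fillna(method='pad')`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the reporter's frame (assumption: a tz-aware
# "time" column plus float columns, as in the df.head(5) output).
df = pd.DataFrame({
    "time": pd.to_datetime(
        ["2017-02-27 11:34:31", "2017-02-27 11:34:35", "2017-02-27 11:34:38"],
        utc=True,
    ),
    "bid": [148.0, np.nan, np.nan],
    "offer_depth": [14192.0, 14272.0, 14192.0],
})

# Forward-fill one column (Series) at a time instead of calling
# fillna on the whole DataFrame, then write the result back.
filled = df.copy()
for col in filled.columns:
    filled[col] = filled[col].ffill()

print(filled)
```

Filling column by column avoids the whole-frame code path shown in the traceback, which may be why it succeeds where `df.fillna(method='pad')` raises.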

Expected Output

Each NaN should be filled with the prior non-NaN value in the same column.
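To make the expected behaviour concrete, here is a small sketch contrasting a fill down each column (`axis=0`, the default) with a fill across each row (`axis=1`). The two-column frame and its values are made up for illustration and are not taken from the reporter's data.

```python
import numpy as np
import pandas as pd

# Illustrative all-float frame (hypothetical values).
df = pd.DataFrame({
    "bid": [148.0, 147.9, np.nan],
    "offer": [148.1, np.nan, np.nan],
})

# Expected behaviour: each NaN takes the previous value in its own column.
down_columns = df.ffill(axis=0)

# For contrast: filling along rows takes the value from the column to the left.
across_rows = df.ffill(axis=1)

print(down_columns)
print(across_rows)
```

With `axis=0`, `offer` becomes 148.1 in every row; with `axis=1`, the second row of `offer` is filled from `bid` (147.9) and the last row stays NaN because nothing lies to its left.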

Output of pd.show_versions()

commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.8-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.2.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: None
boto: None
pandas_datareader: None