DataFrame.apply adds a frequency to a freq=None DatetimeIndex as a side-effect · Issue #22150 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import numpy as np, pandas as pd

def sudden_frequency(num_columns):
    index = pd.DatetimeIndex(["1950-06-30", "1952-10-24", "1953-05-29"])
    columns = list(range(num_columns))
    df = pd.DataFrame(np.random.random((len(index), num_columns)), index, columns)
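    df.apply(lambda sr: sr)  # identity function; the result is discarded
    return index  # the original index object, which should be unchanged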

for num_columns in range(5):
    print(num_columns, "--", sudden_frequency(num_columns))

Output:

0 -- DatetimeIndex(['1950-06-30', '1952-10-24', '1953-05-29'], dtype='datetime64[ns]', freq=None)
1 -- DatetimeIndex(['1950-06-30', '1952-10-24', '1953-05-29'], dtype='datetime64[ns]', freq=None)
2 -- DatetimeIndex(['1950-06-30', '1952-10-24', '1953-05-29'], dtype='datetime64[ns]', freq='WOM-4FRI')
3 -- DatetimeIndex(['1950-06-30', '1952-10-24', '1953-05-29'], dtype='datetime64[ns]', freq='WOM-4FRI')
4 -- DatetimeIndex(['1950-06-30', '1952-10-24', '1953-05-29'], dtype='datetime64[ns]', freq='WOM-4FRI')

Problem description

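This particular index (found by hypothesis) suddenly gains a frequency when it is used in a DataFrame with >= 2 columns on which ".apply" is then called. The index is created with freq=None, and nothing in the sample sets a frequency explicitly, so the change is a side-effect of ".apply".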

Expected Output

n -- DatetimeIndex(['1950-06-30', '1952-10-24', '1953-05-29'], dtype='datetime64[ns]', freq=None)

for all n.
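
The expectation can also be checked programmatically rather than by reading the printed repr. The following is a minimal sketch, not part of the original report: the helper name `freq_before_and_after_apply` is hypothetical, and it only reads the `freq` attribute of the index before and after the call to `.apply`.

```python
import numpy as np
import pandas as pd


def freq_before_and_after_apply(num_columns):
    """Return (freq before .apply, freq after .apply) for the reported index.

    Hypothetical helper for illustration; it mirrors the code sample above.
    """
    index = pd.DatetimeIndex(["1950-06-30", "1952-10-24", "1953-05-29"])
    df = pd.DataFrame(
        np.random.random((len(index), num_columns)),
        index=index,
        columns=list(range(num_columns)),
    )
    before = index.freq      # freq=None: no frequency was requested
    df.apply(lambda sr: sr)  # identity function; the result is discarded
    after = index.freq       # expected to still be None
    return before, after


if __name__ == "__main__":
    for num_columns in range(5):
        before, after = freq_before_and_after_apply(num_columns)
        status = "unchanged" if after is before else "CHANGED"
        print(num_columns, "--", before, "->", after, status)
```

On a pandas version affected by this issue, the rows for 2 or more columns would presumably report a changed frequency; on an unaffected version every row should read `None -> None unchanged`.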

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.16.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C
LOCALE: None.None

pandas: 0.23.3
pytest: 3.6.4
pip: 18.0
setuptools: 39.2.0
Cython: None
numpy: 1.15.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.10
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None