String methods get() does not insert NaNs on negative index out of range · Issue #17704 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas as pd
df1 = pd.DataFrame([["1_2_3_4_5"], ["6_7_8_9_10"], ["11_12"]], columns=["test"])
access using positive index
df1["test"].str.split("_").str[2]
output:
0 3
1 8
2 NaN
access using negative index
df1["test"].str.split("_").str[-3]
output:
IndexError: list index out of range
Problem description
Accessing list elements with a negative index works when every item in the Series is long enough. A positive index that is out of range for some item does not raise; it outputs NaN for that item. I would expect the behaviour to be unified, so that a negative index also outputs NaN for any item in the Series that is shorter than expected, instead of raising an IndexError.
Expected Output
df1["test"].str.split("_").str[-3]
output:
0 3
1 8
2 NaN
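Until the behaviour is unified, the expected output can be approximated without relying on how .str[] treats negative indices. The following is a minimal sketch, not part of pandas: safe_get is a hypothetical helper introduced here for illustration, and it is applied element-wise with Series.apply. Note that the split values are strings, so the result holds "3" and "8" rather than integers.

```python
import numpy as np
import pandas as pd


def safe_get(items, index):
    """Return items[index], or NaN if the index is out of range.

    Hypothetical helper for illustration; not a pandas function.
    TypeError is also caught so that missing values (NaN) pass through as NaN.
    """
    try:
        return items[index]
    except (IndexError, TypeError):
        return np.nan


df1 = pd.DataFrame([["1_2_3_4_5"], ["6_7_8_9_10"], ["11_12"]], columns=["test"])

# Split each string into a list, then take the third element from the end.
parts = df1["test"].str.split("_")
result = parts.apply(lambda items: safe_get(items, -3))
print(result)
```

Because the helper catches the IndexError per element, a single short row no longer aborts the whole operation. The trade-off is that apply runs a Python function per row, which may be slower than the built-in string accessor on large Series.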
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-132-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.2.7
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.14
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None