HDFStore.select_column · Issue #17912 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

Let's select a column from a non-existent DataFrame in an HDFStore:

import pandas as pd

store = pd.HDFStore('test.hdf5', mode='w')
store.select_column('dummy', 'index')

Problem description

We get an AttributeError because get_storer returns None:

Traceback (most recent call last):
  File "pandas_hdf5.py", line 4, in <module>
    store.select_column('dummy', 'index')
  File "[...]/site-packages/pandas/io/pytables.py", line 778, in select_column
    return self.get_storer(key).read_column(column=column, **kwargs)
AttributeError: 'NoneType' object has no attribute 'read_column'

Is this intended?

Expected Output

The docstring says:

        """
        Exceptions
        ----------
        raises KeyError if the column is not found (or key is not a valid
            store)
        raises ValueError if the column can not be extracted individually (it
            is part of a data block)
        """

Shouldn't I expect a KeyError, then?

It could be just this simple patch:

    - return self.get_storer(key).read_column(column=column, **kwargs)
    + storer = self.get_storer(key)
    + if storer is None:
    +    raise KeyError('{} not in {}'.format(key, self))
    + return storer.read_column(column=column, **kwargs)

or should get_storer raise an exception in the first place?
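To make the expected behavior concrete, here is a minimal sketch of the guard from the patch above, applied to a toy stand-in rather than to pandas itself. ToyStore and ToyStorer are hypothetical names I made up for illustration; the only assumption carried over from the traceback is that get_storer returns None for an unknown key.

```python
class ToyStorer:
    """Hypothetical stand-in for a storer object; not pandas code."""

    def __init__(self, columns):
        self._columns = columns

    def read_column(self, column):
        return self._columns[column]


class ToyStore:
    """Hypothetical stand-in for HDFStore; not pandas code."""

    def __init__(self, tables=None):
        self._tables = dict(tables or {})

    def get_storer(self, key):
        # Mirrors the reported behavior: None for an unknown key.
        return self._tables.get(key)

    def select_column(self, key, column):
        storer = self.get_storer(key)
        if storer is None:
            # The guard proposed in the patch: fail with the documented KeyError.
            raise KeyError('{} not in store'.format(key))
        return storer.read_column(column=column)


store = ToyStore({'df': ToyStorer({'index': [0, 1, 2]})})
print(store.select_column('df', 'index'))

try:
    store.select_column('dummy', 'index')
except KeyError as exc:
    print('KeyError:', exc)
```

With the guard in place, a missing key fails with the exception the docstring promises, and callers never see the internal None.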

I'm new to Pandas/PyTables so I don't have the big picture.

From a caller's perspective, I could first check that the DataFrame is in the store:

store = pd.HDFStore('test.hdf5', mode='w')
if 'dummy' in store:
    store.select_column('dummy', 'index')

but I'd rather "ask forgiveness not permission",

store = pd.HDFStore('test.hdf5', mode='w')
try:
    store.select_column('dummy', 'index')
except AttributeError:
    [...]

so I would have to catch AttributeError, but I'm not sure that raising this exception is a design choice.
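In the meantime, a caller-side workaround could look like the sketch below. select_column_or_default is a hypothetical helper name of mine, not a pandas function, and _NoneStorerStore is a toy stand-in that only reproduces the failure mode from the traceback. The helper catches both the documented KeyError and the AttributeError I actually get with 0.20.3.

```python
def select_column_or_default(store, key, column, default=None):
    """Hypothetical helper: return `default` when `key` is not in the store."""
    try:
        return store.select_column(key, column)
    except (KeyError, AttributeError):
        # KeyError is what the docstring documents; AttributeError is what
        # is raised in this report when get_storer returns None.
        return default


class _NoneStorerStore:
    """Toy stand-in reproducing the reported failure mode; not pandas code."""

    def get_storer(self, key):
        return None

    def select_column(self, key, column):
        # Same shape as the line in the traceback: calls a method on None.
        return self.get_storer(key).read_column(column=column)


result = select_column_or_default(_NoneStorerStore(), 'dummy', 'index', default=[])
print(result)
```

The downside is that catching AttributeError this broadly could also hide unrelated bugs, which is another reason I would prefer a KeyError from the library.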

I hope I'm being constructive and I don't sound like I'm nitpicking.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8

pandas: 0.20.3
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: 0.19.1
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None