ENH: #10143 Function to walk the group hierarchy of a PyTables HDF5 file by stephenpascoe · Pull Request #10932 · pandas-dev/pandas
I've renamed walk() and included Series objects. However, I think you have a different API in mind. I am deliberately not yielding each Pandas object individually. Instead, each yield gives a PyTables group path together with lists of that group's contents. This follows the os.walk API, i.e. each yield is
(group_path, [subgroup_name, ...], [subobj_name, ...])
I think there are several advantages:
- The consumer can see the difference between groups and Pandas objects
- Future extension could allow pruning of the search space by mutating the yielded lists, as is possible with os.walk.
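For reference, this is how pruning already works with os.walk itself. The sketch below uses only the standard library; `walk_pruned` and the directory names are made up for illustration and are not part of this PR.

```python
import os
import tempfile


def walk_pruned(root, skip):
    """Yield (relative dirpath, dirnames, filenames), skipping directories named in `skip`."""
    for dirpath, dirnames, filenames in os.walk(root):  # top-down by default
        # Mutating the list in place tells os.walk not to descend into removed names.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        yield os.path.relpath(dirpath, root), list(dirnames), sorted(filenames)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one subtree we want, one we want to prune.
    os.makedirs(os.path.join(root, "keep", "inner"))
    os.makedirs(os.path.join(root, "skip_me", "hidden"))
    open(os.path.join(root, "keep", "data.txt"), "w").close()

    visited = [path for path, _, _ in walk_pruned(root, {"skip_me"})]

print(visited)
```

Neither "skip_me" nor anything beneath it is visited, because the name was removed from the yielded list before os.walk descended.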
Note also:
- Some testing of node type is necessary during walk because a Pandas object is also a group to PyTables.
- All non-pandas leaves are ignored. walk() will only yield groups and Pandas objects.
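To make the yield shape concrete, here is a toy model of the traversal. It is not the code in this PR and does not touch PyTables: the hierarchy is a nested dict, and the strings "pandas" and "other" are stand-ins I've assumed for "group holding a Pandas object" and "any other leaf". In a real file that distinction requires inspecting each node.

```python
def walk(tree, path=""):
    """Yield (group_path, [subgroup_name, ...], [subobj_name, ...]) for a toy hierarchy."""
    groups, leaves = [], []
    for name, child in tree.items():
        if isinstance(child, dict):
            groups.append(name)   # plain group: recurse into it later
        elif child == "pandas":
            leaves.append(name)   # stand-in for a stored Pandas object
        # any other leaf kind is ignored
    yield (path or "/", groups, leaves)
    # Iterating the same list after the yield is what would let a consumer prune it.
    for name in groups:
        yield from walk(tree[name], path + "/" + name)


# Hypothetical file layout, for illustration only.
tree = {
    "df1": "pandas",
    "raw_array": "other",
    "g1": {"s1": "pandas", "g2": {}},
}

result = list(walk(tree))
for entry in result:
    print(entry)
```

The "raw_array" leaf never appears in the output, while the empty group "g2" is still yielded with two empty lists.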
Please let me know what you think before I write something in whatsnew/0.17.0