Issue 45649: Add tarinfo.Path - Python tracker (original) (raw)

Created on 2021-10-28 15:59 by FFY00, last changed 2022-04-11 14:59 by admin.

Messages (6)
msg405194 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-10-28 15:59
It would be helpful to have a pathlib-compatible object in tarfile, similarly to zipfile.Path.
msg405206 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-10-28 17:23
I vaguely recall exploring this concept and finding that tarfiles don’t supply the requisite interface because they’re not random access. I’m only 10% confident in that recollection, so worth exploring.
msg405210 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-10-28 18:08
That is good to know. This isn't very high on my priority list, but I will try to explore when I have some time.
msg409479 - (view) Author: Barney Gale (barneygale) * Date: 2022-01-01 21:34
It's possible to do, but will be a little slow due to the nature of tar files. They're a big linked list of files, so you need to do a bunch of reads/seeks from the start to the end to enumerate all files. I'd ask that we try to get solved first. That would let us write: # tarfile.py class Path(pathlib.AbstractPath): def iterdir(self): ... def stat(self): ... We'd fill in a smallish number of abstract methods to get a full `Path`-compatible class with `read_text()`, `is_symlink()` etc methods.
msg409481 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2022-01-01 23:19
I'd recommend not to block on . It's not obvious to me that subclassing would be valuable. It depends on how it's implemented, but in my experience, zipfile.Path doesn't and cannot implement the full interface of pathlib.Path. Instead zipfile.Path attempts to implement a protocol. At the time, the protocol was undefined, but now there exists importlib.resources.abc.Traversable (https://docs.python.org/3/library/importlib.html#importlib.abc.Traversable), the interface needed by importlib.resources. I'd honestly just create a standalone class, see if it can implement Traversable, and only then consider if it should implement a more complicated interface (such as something with symlink support or perhaps even later subclassing from pathlib.Path).
msg409482 - (view) Author: Barney Gale (barneygale) * Date: 2022-01-02 00:01
If you're only aiming for Traversable compatibility, sure. The original bug description asks for something that's pathlib-compatible and similar to zipfile.Path, which goes beyond the Traversable interface in attempting to emulate pathlib.Path. The pathlib.Path interface is a good one - I see no reason it can't apply to zip and tar archives in full. Methods of Path objects already raise NotImplementedError if operations aren't supported (e.g. creating symlinks) Some prototyping from a couple years back, including a tar path implementation: https://github.com/barneygale/pathlab/tree/master/pathlab
History
Date User Action Args
2022-04-11 14:59:51 admin set github: 89812
2022-01-02 00:01:26 barneygale set messages: +
2022-01-01 23:19:11 jaraco set messages: +
2022-01-01 21:34:37 barneygale set nosy: + barneygalemessages: +
2021-10-28 20:20:39 ethan.furman set nosy: + ethan.furman
2021-10-28 18:08:11 FFY00 set messages: +
2021-10-28 17:23:48 jaraco set messages: +
2021-10-28 15:59:21 FFY00 create