Issue 21996: gettarinfo method does not handle files without text string names

It looks like if you pass a “fileobj” argument to “gettarinfo” without also passing “arcname”, it assumes that the file object's “name” attribute is a text string.

>>> import tarfile
>>> with tarfile.open("/dev/null", "w") as tar, open("/bin/sh", "rb") as file: tar.gettarinfo(fileobj=file)
...
<TarInfo 'bin/sh' at 0x7f13cc937f20>
>>> with tarfile.open("/dev/null", "w") as tar, open(b"/bin/sh", "rb") as file: tar.gettarinfo(fileobj=file)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/media/disk/home/proj/python/cpython/Lib/tarfile.py", line 1767, in gettarinfo
    arcname = arcname.replace(os.sep, "/")
TypeError: expected bytes, bytearray or buffer compatible object
>>> with tarfile.open("/dev/null", "w") as tar, open(0, "rb", closefd=False) as file: tar.gettarinfo(fileobj=file)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/media/disk/home/proj/python/cpython/Lib/tarfile.py", line 1766, in gettarinfo
    drv, arcname = os.path.splitdrive(arcname)
  File "Lib/posixpath.py", line 133, in splitdrive
    return p[:0], p
TypeError: 'int' object is not subscriptable
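The three outcomes in the session seem to track the type of the file object's “name” attribute, which mirrors whatever was passed to open(). The sketch below illustrates this. The helper name name_types is made up for this example, and it uses a temporary file rather than /bin/sh or descriptor 0 so that it runs anywhere.

```python
import os
import tempfile


def name_types():
    """Return the types of .name for files opened by str path, bytes path and fd."""
    fd, path = tempfile.mkstemp()  # scratch file; stands in for /bin/sh
    try:
        with open(path, "rb") as by_str, \
                open(os.fsencode(path), "rb") as by_bytes, \
                open(fd, "rb", closefd=False) as by_fd:
            # .name echoes the first argument given to open()
            return type(by_str.name), type(by_bytes.name), type(by_fd.name)
    finally:
        os.close(fd)
        os.remove(path)
```

Only the first of the three types is what the string handling in gettarinfo shown in the tracebacks expects.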

In my case, my code always sets the final TarInfo.name attribute later on, so the initial name does not matter. Perhaps at least the documentation should say that “fileobj.name” must be a real unencoded file name string unless “arcname” is also given. My workaround was to add a dummy arcname argument, a bit like this:

# Explicit dummy name to avoid using file name of bytes
tarinfo = self.tar.gettarinfo(fileobj=file, arcname="")
. . .
tarinfo.name = "{}/{}".format(self.pkgname, name)
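For anyone who wants to try the workaround outside my code, here is a minimal self-contained sketch of the same idea. The function names add_with_explicit_name and demo, and the member name "pkg/hello.txt", are invented for illustration. It writes to an in-memory archive and a temporary file instead of /dev/null and /bin/sh.

```python
import io
import os
import tarfile
import tempfile


def add_with_explicit_name(tar, fileobj, archive_name):
    """Add fileobj to tar under archive_name without consulting fileobj.name."""
    # A dummy arcname keeps gettarinfo() from processing fileobj.name,
    # which may be bytes or an integer file descriptor.
    tarinfo = tar.gettarinfo(fileobj=fileobj, arcname="")
    tarinfo.name = archive_name  # the real member name is set afterwards
    tar.addfile(tarinfo, fileobj)
    return tarinfo


def demo():
    """Archive a file opened via a bytes path and return the member names."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(b"hello")
        buffer = io.BytesIO()
        with tarfile.open(fileobj=buffer, mode="w") as tar, \
                open(os.fsencode(path), "rb") as file:
            add_with_explicit_name(tar, file, "pkg/hello.txt")
        buffer.seek(0)
        with tarfile.open(fileobj=buffer, mode="r") as tar:
            return tar.getnames()
    finally:
        os.remove(path)
```

Because arcname is given explicitly, the type of file.name should not matter here; the size and other metadata still come from the open file descriptor.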