Issue 1583880: tarfile.py: better use of TarInfo objects with longnames
When a TarInfo object with a long name is added to a TarFile, the .name attribute is garbled during the special processing involved with long names. This is true for both posix and gnu mode and has historical "design" reasons.
In posix mode, a long name is split in two. TarInfo's prefix attr gets the first part, the name attr the second one. In gnu mode, a long name is truncated to LENGTH_NAME (100) chars and stored in the TarInfo's name attr.
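To illustrate the two behaviours just described, here is a minimal sketch. It is not the code from tarfile.py: the helper names split_posix_name and truncate_gnu_name are hypothetical, the constants are redefined locally, and the 155-char prefix field size is taken from the ustar header layout rather than from this report.

```python
LENGTH_NAME = 100    # size of the name field in a tar header
LENGTH_PREFIX = 155  # size of the ustar prefix field (assumption: ustar layout)


def split_posix_name(name):
    """Hypothetical helper: split a long name into (prefix, name) at a '/'."""
    if len(name) <= LENGTH_NAME:
        return "", name
    # Walk left from the last position the prefix field can hold and take
    # the first '/' that leaves a non-empty remainder fitting the name field.
    for i in range(min(LENGTH_PREFIX, len(name) - 1), 0, -1):
        if name[i] == "/" and 0 < len(name) - i - 1 <= LENGTH_NAME:
            return name[:i], name[i + 1:]
    raise ValueError("name is too long to split")


def truncate_gnu_name(name):
    """Hypothetical helper: the cut applied to the name attr in gnu mode."""
    return name[:LENGTH_NAME]


long_name = "a" * 60 + "/" + "b" * 60
prefix, short = split_posix_name(long_name)
print(len(prefix), len(short), len(truncate_gnu_name(long_name)))
```

In both cases the value left in the name attr is no longer the name the user supplied, which is the root of the problems below.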
So, if you open a TarFile for writing, add a few files with long names to it and call the getnames() method for that still open file, the names returned are all cut. The getmember() method will not work, because all names have changed.
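A rough way to check for this symptom is sketched below. It assumes an interpreter where tarfile.open() accepts a format argument and exposes USTAR_FORMAT and GNU_FORMAT, which is newer than the version this report was filed against. The helper name names_after_add is made up for illustration. On an affected version the comparison would be expected to fail or getmember() to raise KeyError; with the fix the names should come back unchanged.

```python
import io
import tarfile


def names_after_add(long_name, fmt):
    """Add one empty member to an in-memory archive and list names before closing."""
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w", format=fmt)
    try:
        # A TarInfo has size 0 by default, so no file object is needed.
        tar.addfile(tarfile.TarInfo(long_name))
        names = tar.getnames()
        # Raises KeyError if the stored name no longer matches.
        tar.getmember(long_name)
    finally:
        tar.close()
    return names


long_name = "dir/" * 30 + "file.txt"  # 128 chars, longer than LENGTH_NAME
for fmt in (tarfile.USTAR_FORMAT, tarfile.GNU_FORMAT):
    print(names_after_add(long_name, fmt) == [long_name])
```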
On top of that, if a user adds a TarInfo object to a TarFile it is not copied. So, it is undefined what happens if the user uses the same TarInfo object several times with changed attributes. The problem described in bug #1583537 (now deleted) was partly caused by this.
The attached patch makes it possible to use the same TarInfo object several times by copying it in TarFile.addfile(), removes the (undocumented) TarInfo.prefix attr and leaves TarInfo.name alone.
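The reuse case can be sketched as follows. This is only an illustration of the intended behaviour, assuming an interpreter where addfile() copies the TarInfo it receives. The member names are arbitrary.

```python
import io
import tarfile

buf = io.BytesIO()
info = tarfile.TarInfo("first.txt")  # empty member, size 0

with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(info)
    # Reuse the same object with a changed attribute.
    info.name = "second.txt"
    tar.addfile(info)
    names = tar.getnames()

# If addfile() stores a copy, both names are kept; if it stored the
# object itself, the first entry would also read "second.txt".
print(names)
```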
I think this should be backported to 2.5 as well.