tarfile sets the gzip FNAME field to the path given by the user: Lib/tarfile.py:424

It writes the full path instead of just the basename if the user specifies an absolute path. Some archive viewers, such as 7-Zip, may process the file incorrectly. It is also a security issue, because anyone who receives the archive can learn the directory structure of the system, the username, or other personal information.

You can reproduce this by running the lines below in the Python interpreter. Tested on Windows and Linux.

Python 3.8.2 (default, Apr 27 2020, 15:53:34)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import tarfile
>>> open("somefile.txt", "w").write("sometext")
8
>>> tar = tarfile.open("/home/bulgakovas/file.tar.gz", "w|gz")
>>> tar.add("somefile.txt")
>>> tar.close()
>>> open("file.tar.gz", "rb").read()[:50]
b'\x1f\x8b\x08\x08cE\x10_\x02\xff/home/bulgakovas/file.tar\x00\xed\xd3M\n\xc20\x10\x86\xe1\xac=EO\x90'

You can see the full path to file.tar (/home/bulgakovas/file.tar) in the FNAME field. If you write just tarfile.open("file.tar.gz", "w|gz"), FNAME will be file.tar.

RFC1952 says about FNAME: This is the original name of the file being compressed, with any directory components removed.

So tarfile must remove directory names from FNAME and write only the basename of the file.
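To make the header layout concrete, here is a minimal sketch that reads FNAME out of the first bytes of a gzip stream, following the field order in RFC 1952. The helper name gzip_fname is hypothetical and not part of tarfile or gzip; the sample bytes are the header from the session above.

```python
import os
import struct

FEXTRA = 0x04  # FLG bit: optional extra field present
FNAME = 0x08   # FLG bit: original file name present


def gzip_fname(data: bytes):
    """Return the FNAME field of a gzip header, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # fixed part: ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
    if flags & FEXTRA:
        # The extra field comes before FNAME: 2-byte length, then payload.
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if not flags & FNAME:
        return None
    end = data.index(b"\x00", pos)  # FNAME is zero-terminated
    return data[pos:end].decode("latin-1")  # RFC 1952 specifies ISO 8859-1


# Header bytes copied from the interpreter session in this report.
sample = b"\x1f\x8b\x08\x08cE\x10_\x02\xff/home/bulgakovas/file.tar\x00"

stored = gzip_fname(sample)           # what tarfile wrote
expected = os.path.basename(stored)   # what RFC 1952 asks for
print(stored, "->", expected)
```

With the sample header, stored is the full path and expected is only file.tar, which is the value the report asks tarfile to write.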
Hi, if I understand correctly, the name you are using inside the tar is the basename of the file. I haven't tested it yet, but will this PR remove the possibility of creating a file in the tar that keeps the source tree folder? Maybe we could think about implementing a parameter similar to arcname in Zipfile? What do you think? Cheers!
Hi. My PR doesn't remove the possibility of adding a directory tree to a tar file. It only fixes the header for gzip compression. Any data after this header is not affected. You can test this by creating two archives with the same data, one with my patch and one without. All bytes after the header are equal.
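The comparison described above could be sketched roughly as follows. This is not part of the PR: gzip_header_len, same_payload and make_gzip are hypothetical helpers, and the two streams are built by hand with an illustrative path (/home/user/file.tar) rather than produced by a patched and an unpatched tarfile.

```python
import gzip
import struct
import zlib


def gzip_header_len(data: bytes) -> int:
    """Length of the gzip member header at the start of data (RFC 1952)."""
    if data[:3] != b"\x1f\x8b\x08":
        raise ValueError("not a deflate-compressed gzip stream")
    flags = data[3]
    pos = 10  # fixed part: ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
    if flags & 0x04:  # FEXTRA: 2-byte length, then payload
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if flags & 0x08:  # FNAME: zero-terminated
        pos = data.index(b"\x00", pos) + 1
    if flags & 0x10:  # FCOMMENT: zero-terminated
        pos = data.index(b"\x00", pos) + 1
    if flags & 0x02:  # FHCRC: 2-byte header checksum
        pos += 2
    return pos


def same_payload(a: bytes, b: bytes) -> bool:
    """True if two gzip streams are byte-identical after their headers."""
    return a[gzip_header_len(a):] == b[gzip_header_len(b):]


def make_gzip(fname: bytes, payload: bytes) -> bytes:
    """Build a gzip stream by hand so that only FNAME varies (illustration)."""
    header = b"\x1f\x8b\x08\x08" + struct.pack("<L", 0) + b"\x02\xff" + fname + b"\x00"
    comp = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)  # raw deflate
    body = comp.compress(payload) + comp.flush()
    trailer = struct.pack("<LL", zlib.crc32(payload), len(payload) & 0xFFFFFFFF)
    return header + body + trailer


payload = b"sometext" * 64
unpatched = make_gzip(b"/home/user/file.tar", payload)  # full path in FNAME
patched = make_gzip(b"file.tar", payload)               # basename only

print(same_payload(unpatched, patched))
print(gzip.decompress(unpatched) == gzip.decompress(patched))
```

For real archives, read both files in binary mode and pass their contents to same_payload; this assumes both archives were built from the same input files, since the tar data itself records each member's metadata.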