
[Python-Dev] Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

Brett Cannon brett at python.org
Mon Feb 23 22:02:09 CET 2015


On Mon Feb 23 2015 at 3:51:18 PM Paul Moore <p.f.moore at gmail.com> wrote:

On 23 February 2015 at 18:40, Brett Cannon <brett at python.org> wrote:
> Couldn't you just keep it in memory as bytes and then write directly over
> the file? I realize that's a bit wasteful memory-wise but it is possible.
> The docs could mention the memory cost is something to watch out for when
> doing an in-place replacement. Heck the code could even make it an
> io.BytesIO instance so the rest of the code doesn't have to care about this
> special case.

The real problem with overwriting is if there's a failure during the overwrite you lose the original file. My original API had overwrite as the default, but I think the risk makes that a bad idea.

Couldn't you catch the exception, write the original file back out, and then re-raise the exception?
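A minimal sketch of that catch, restore, and re-raise idea, using only the standard library. The helper name overwrite_with_restore is hypothetical and not part of the proposed module, and the restore is best-effort: it can fail for the same reason the first write did.

```python
def overwrite_with_restore(path, new_data):
    """Overwrite path with new_data; on failure, write the old bytes back and re-raise.

    Hypothetical helper for illustration only.
    """
    # Keep the original contents in memory so they can be written back.
    with open(path, "rb") as f:
        original = f.read()
    try:
        with open(path, "wb") as f:
            f.write(new_data)
    except BaseException:
        # Best-effort restore; this can itself fail (for example, disk full).
        with open(path, "wb") as f:
            f.write(original)
        raise
```

This narrows the window for losing the original but does not close it, since a crash of the process itself leaves nothing to run the restore.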

One option would be to allow outputs (TARGET in pack() and NEWARCHIVE in setinterpreter()) to be open files (open for write in bytes mode) as well as filenames[1]. Then the caller has the choice of how to manage the output. The docs could include an example of overwriting via a BytesIO object, and point out the risk.
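A rough sketch of what such a dual-mode output could look like, following the str check described in footnote [1]. The names write_output and overwrite_in_place are hypothetical and are not the proposed pack() or set_interpreter() functions; the second helper shows the BytesIO overwrite pattern together with the risk it carries.

```python
import io


def write_output(data, target):
    """Write data to target: a filename (str) or a file open for binary writing.

    Hypothetical helper for illustration only.
    """
    if isinstance(target, str):
        # A filename: open and close the file ourselves.
        with open(target, "wb") as f:
            f.write(data)
    else:
        # Anything else is assumed to be a writable binary file object;
        # the caller stays responsible for closing it.
        target.write(data)


def overwrite_in_place(path, transform):
    """Replace the contents of path with transform(old_bytes), buffering in memory."""
    buffer = io.BytesIO()
    with open(path, "rb") as f:
        write_output(transform(f.read()), buffer)
    # Risk: a failure during this final write can leave path truncated,
    # and the whole result is held in memory until then.
    with open(path, "wb") as f:
        f.write(buffer.getvalue())
```

With this shape the library never decides to overwrite anything; the caller who passes a BytesIO and then writes it back over the source has chosen that risk explicitly.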

That sounds like a good idea. No reason to do the file opening on someone's behalf when opening files is so easy and keeps layering abstractions at a good level. Would this extend also to the archive being read to be consistent?

I should mention I originally thought of extending this to the 'main' argument of pack(), but realized that passing in the function itself would require tools to import the code they are simply trying to pack, and that would be the wrong thing to do.

BTW, while I was looking at the API, I realised I don't like the order of arguments in pack(). I'm tempted to make it pack(directory, target=None, interpreter=None, main=None) where a target of None means "use the name of the source directory with .pyz tacked on", exactly as for the command line API. What do you think? The change would be no more than a few minutes' work if it's acceptable.
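For illustration, the "target of None" rule could be sketched as below. The helper resolve_target is hypothetical and only shows the defaulting step, not the proposed pack() itself; it assumes that "tacked on" means appending .pyz to the directory name rather than replacing an existing suffix.

```python
import os


def resolve_target(directory, target=None):
    """Return the output filename, defaulting to the source directory name plus .pyz.

    Hypothetical helper for illustration only.
    """
    if target is not None:
        return target
    # normpath drops a trailing separator, so "myapp/" gives "myapp.pyz"
    # rather than a hidden ".pyz" file inside the directory.
    return os.path.normpath(directory) + ".pyz"
```

Under that assumption, resolve_target("myapp") gives "myapp.pyz", while an explicit target such as resolve_target("myapp", "out.pyz") is returned unchanged.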

+1 from me.

-Brett

Paul

[1] What's the standard practice for such dual-mode arguments? ZipFile tests if the argument is a str instance and assumes a file if not. I'd be inclined to follow that practice here.


