On Mon Feb 23 2015 at 3:51:18 PM Paul Moore <p.f.moore@gmail.com> wrote:
On 23 February 2015 at 18:40, Brett Cannon <brett@python.org> wrote:
> Couldn't you just keep it in memory as bytes and then write directly over
> the file? I realize that's a bit wasteful memory-wise but it is possible.
> The docs could mention the memory cost is something to watch out for when
> doing an in-place replacement. Heck the code could even make it an
> io.BytesIO instance so the rest of the code doesn't have to care about this
> special case.

The real problem with overwriting is if there's a failure during the
overwrite you lose the original file. My original API had overwrite as
the default, but I think the risk makes that a bad idea.

Couldn't you catch the exception, write the original file back out, and then re-raise the exception?
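
A minimal sketch of that catch, restore, and re-raise idea, assuming the original file fits in memory. The names `overwrite_in_place` and `write_new` are hypothetical and not part of any proposed API:

```python
def overwrite_in_place(path, write_new):
    """Overwrite `path` using `write_new(fileobj)`; restore the original on failure.

    Hypothetical helper for illustration only. `write_new` is any callable
    that writes the replacement content to a binary file object.
    """
    # Keep the original contents in memory so they can be written back.
    with open(path, "rb") as f:
        original = f.read()
    try:
        with open(path, "wb") as f:
            write_new(f)
    except BaseException:
        # The file is closed by now; put the original bytes back, then
        # re-raise so the caller still sees the failure.
        with open(path, "wb") as f:
            f.write(original)
        raise
```

This only narrows the window. If the process is killed mid-write, or the restore itself fails (a full disk, say), the original is still lost.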

One option would be to allow outputs (TARGET in pack() and NEW_ARCHIVE
in set_interpreter()) to be open files (open for write in bytes mode)
as well as filenames[1]. Then the caller has the choice of how to
manage the output. The docs could include an example of overwriting
via a BytesIO object, and point out the risk.
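
As a rough illustration of what such a docs example might look like: `rewrite_shebang` below is a hypothetical stand-in for set_interpreter() that takes an open binary file as its output, and the UTF-8 encoding of the interpreter line is an assumption.

```python
import io


def rewrite_shebang(archive_path, out, interpreter):
    """Copy the archive to `out` with a new '#!' line (hypothetical stand-in).

    `out` must be a file object open for writing in bytes mode.
    """
    with open(archive_path, "rb") as src:
        data = src.read()
    if data.startswith(b"#!"):
        # Drop the existing interpreter line, keeping everything after it.
        _, _, data = data.partition(b"\n")
    out.write(b"#!" + interpreter.encode("utf-8") + b"\n")
    out.write(data)


def overwrite_via_bytesio(archive_path, interpreter):
    """Build the new archive in memory, then write it over the original."""
    buffer = io.BytesIO()
    rewrite_shebang(archive_path, buffer, interpreter)
    # Risk: if this write fails part-way, the original archive is gone.
    with open(archive_path, "wb") as f:
        f.write(buffer.getvalue())
```

The whole archive sits in memory while this runs, which is the cost Brett mentions above.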

That sounds like a good idea. There's no reason to open files on someone's behalf when opening files is so easy, and it keeps the abstraction layers at a good level. Would this also extend to the archive being read, for consistency?

I should mention I originally thought of extending this to pack() for 'main', but realized that passing in the function to set would require tools to import the code they are simply trying to pack, which would be the wrong thing to do.

BTW, while I was looking at the API, I realised I don't like the order
of arguments in pack(). I'm tempted to make it pack(directory,
target=None, interpreter=None, main=None) where a target of None means
"use the name of the source directory with .pyz tacked on", exactly as
for the command line API.
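
A sketch of just the default-target rule, under the assumption that "tacked on" means appending `.pyz` to the directory name rather than replacing an existing suffix. The helper name `resolve_target` is hypothetical:

```python
from pathlib import Path


def resolve_target(directory, target=None):
    """Return the output path that pack(directory, target=None, ...) would use.

    Hypothetical helper: a target of None means the source directory's
    name with '.pyz' appended.
    """
    if target is not None:
        return Path(target)
    source = Path(directory)
    # Append rather than replace, so 'my.app' becomes 'my.app.pyz'.
    return source.with_name(source.name + ".pyz")
```

With this rule, `resolve_target("build/myapp")` gives `build/myapp.pyz`, and an explicit target is returned unchanged.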

What do you think? The change would be no more than a few minutes'
work if it's acceptable.

+1 from me.

-Brett
Paul

[1] What's the standard practice for such dual-mode arguments? ZipFile
tests if the argument is a str instance and assumes a file if not. I'd
be inclined to follow that practice here.
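
One way the str-instance test could look, as a sketch only. `open_output` is a hypothetical helper, and it assumes that anything which is not a str is an already-open binary file owned by the caller:

```python
import contextlib


@contextlib.contextmanager
def open_output(target):
    """Yield a writable binary file for `target` (filename or open file).

    Hypothetical helper: a str is treated as a filename and opened (and
    closed) here; anything else is assumed to be an open file object.
    """
    if isinstance(target, str):
        with open(target, "wb") as f:
            yield f
    else:
        # The caller opened it, so the caller is responsible for closing it.
        yield target
```

A str-only check like this would treat a pathlib.Path as an open file, so path objects would need an extra case if they were to be supported.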