
On Sun, Mar 11, 2012 at 14:08, Guido van Rossum <guido@python.org> wrote:

On Sun, Mar 11, 2012 at 12:26 PM, Thomas Wouters <thomas@python.org> wrote:
> Thanks for the suggestions (Antoine too), but that's not really the topic I
> want to discuss here (but if you guys move to Google I'll happily discuss
> all the stuff we have to deal with.) The question is really whether Python
> wants to actually support zipped stdlibs or not.

I do want to support it; that's why we put the facilities you found
there in the first place. Unfortunately nobody actually did the
necessary second step of trying to bundle the stdlib and trying to
make the tests pass. So I think it would be great if we addressed the
issues you found, or at least started prioritizing them.

I'm not sure if you're saying that you're hitting the 2 GB limit *with
just the stdlib* in a zipfile, or if you're hitting this after you've
added a bunch of Google code to it as well.

No, not with just the stdlib, but in a Google binary that embeds Python -- the 32-bit-unsigned numbers in zipfiles are file offsets, so in a Google binary (which, as you know, is typically a completely statically linked binary) the offsets for a zipfile embedded in the binary can be bigger than that. (If you were thinking of PAR files, those don't use zipimport themselves, but their own PEP-302 importer written in Python with the zipfile module, so it's okay.)
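To illustrate the offset limit mentioned above, here is a minimal sketch. It assumes only that the offset field is a 4-byte unsigned little-endian integer, as in the classic (non-ZIP64) zip structures; the helper name `fits_in_32bit_offset` is made up for this example and is not part of zipimport or zipfile.

```python
import struct


def fits_in_32bit_offset(offset):
    """Return True if `offset` can be stored in a 4-byte unsigned field.

    Hypothetical helper: it only demonstrates the size limit of a
    32-bit unsigned offset, not how zipimport itself reads archives.
    """
    try:
        struct.pack("<L", offset)  # "<L" is a 4-byte unsigned little-endian int
    except struct.error:
        return False
    return True


# The largest representable offset is 2**32 - 1; anything beyond that
# (for example an archive embedded far into a huge binary) does not fit.
largest_ok = fits_in_32bit_offset(2**32 - 1)
too_big_ok = fits_in_32bit_offset(2**32)
```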

I'm also not sure that
it's worth the effort to make *all* the tests in the stdlib pass --
some tests may just be testing filesystem things that make no sense
when the stdlib is in a zipfile. I see you frowning already about my
lax attitude...

Hah, no, I wasn't frowning when I read that :) I don't care about making all tests pass, but I do want them to not fail -- a test should only fail if the tested thing doesn't work, not if the test can't run. For what it's worth, the vast majority of tests work fine, there's just a couple that make what I would call unwarranted assumptions. For example, the zipfile test wants to exercise the writepy method, so it needs a module and a package to bundle in the zipfile. It could make a bunch of tempfiles (as most other tests do) into a package, but instead it uses email.__file__ to find the email package and uses that.
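As a rough sketch of the tempfile approach (not the actual test code), something like the following would give writepy a package to bundle without touching the stdlib on disk. The package name `fakepkg`, its contents, and the function name are invented for this example.

```python
import os
import tempfile
import zipfile


def bundle_temp_package():
    """Build a throwaway package on disk and bundle it with writepy.

    Sketch only: "fakepkg" and its modules are hypothetical stand-ins
    for whatever package a test would want to bundle.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = os.path.join(tmp, "fakepkg")
        os.mkdir(pkg_dir)
        # An __init__.py makes the directory a package for writepy.
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write("# throwaway package for the test\n")
        with open(os.path.join(pkg_dir, "mod.py"), "w") as f:
            f.write("VALUE = 42\n")

        archive = os.path.join(tmp, "out.zip")
        with zipfile.PyZipFile(archive, "w") as zf:
            # writepy compiles the package's modules and adds the results.
            zf.writepy(pkg_dir)
            return sorted(zf.namelist())


names = bundle_temp_package()
```

Because everything lives in a temporary directory, this works the same whether the stdlib itself is on the filesystem or in a zipfile.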

The only failing test I remember that wasn't of the kind of using the stdlib source out of laziness is test_pyclbr, which runs pyclbr over a whole bunch of large stdlib modules. It also does other tests, so I don't think skipping the test for a zipped stdlib is a big deal, but even that could be fixed by using PEP 302's interface for getting the source. Of course, we also have to consider that the zipped stdlib may contain just .pyc files :)
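For reference, a minimal sketch of getting source through the PEP 302 loader interface instead of reading a file path. It uses zipimport's get_source directly; the archive and the module name `zmod_demo` are made up for this example, and this is not how pyclbr currently works.

```python
import os
import tempfile
import zipfile
import zipimport


def source_from_zip(archive_path, module_name):
    """Fetch a module's source through the loader, not the filesystem.

    get_source returns None when the module is present but the archive
    holds no source for it (for example, only .pyc files).
    """
    loader = zipimport.zipimporter(archive_path)
    return loader.get_source(module_name)


def demo():
    # Build a tiny archive holding one hypothetical module, then read it back.
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "lib.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("zmod_demo.py", "X = 1\n")
        return source_from_zip(archive, "zmod_demo")


source = demo()
```

A caller would still have to handle the None case, which is exactly the .pyc-only situation mentioned above.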

So it's definitely possible to fix most tests, possibly all of them, without too much effort.

So let me add that all non-test code should definitely
work, and quite possibly the only way to ensure that this is the case
is to make all the tests pass. The issue with needing os.py outside
the zipfile is a good thing to try to fix.

I forgot to include a link to http://bugs.python.org/issue12919 that makes that a little less confusing (to me, although others apparently disagreed :)

--
Thomas Wouters <thomas@python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!