
On Tue, Feb 16, 2016 at 9:42 PM, Gregory P. Smith <greg@krypto.org> wrote:

On Tue, Feb 16, 2016 at 9:00 PM Mike Kaplinskiy <mike.kaplinskiy@gmail.com> wrote:
Hey folks,

I hope this is the right list for this sort of thing (python-ideas seemed more far-fetched).

For some context: there is currently an issue with pex that causes sys.modules lookups to stop working for \_\_main\_\_. In turn, this makes unittest.run() & pkg\_resources.resource\_\* fail. The root cause is that pex uses runpy.run\_module with alter\_sys=False. The fix should be to just pass alter\_sys=True, but that changes sys.argv\[0\] and various existing pex files depend on that being the pex file. You can read more at https://github.com/pantsbuild/pex/pull/211 .
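To make the trade-off concrete, here is a minimal sketch (not pex's actual code) that runs a throwaway module through `runpy.run_module` with both settings and records what the module observes. The module name `alter_sys_demo_module` and the helper `run_demo` are made up for illustration.

```python
import importlib
import os
import runpy
import sys
import tempfile

MODULE_NAME = "alter_sys_demo_module"  # hypothetical module name, for illustration only

# The module records what it can see while it is being executed.
MODULE_SOURCE = (
    "import sys\n"
    "argv0_seen = sys.argv[0]\n"
    "main_lookup_works = (\n"
    "    getattr(sys.modules.get(__name__), '__dict__', None) is globals()\n"
    ")\n"
)


def run_demo(alter_sys):
    """Run the demo module as __main__ and return (argv0_seen, main_lookup_works)."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
            f.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            ns = runpy.run_module(MODULE_NAME, run_name="__main__", alter_sys=alter_sys)
        finally:
            sys.path.remove(tmp)
    return ns["argv0_seen"], ns["main_lookup_works"]


if __name__ == "__main__":
    # alter_sys=False: argv[0] is untouched, but sys.modules["__main__"] is not the running module.
    print(run_demo(alter_sys=False))
    # alter_sys=True: sys.modules["__main__"] is the running module, but argv[0] is the module's file.
    print(run_demo(alter_sys=True))
```

With `alter_sys=False` the module sees the caller's `sys.argv[0]` but cannot find itself under `sys.modules["__main__"]`; with `alter_sys=True` both are swapped for the duration of the run and restored afterwards.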

Conservatively, I'd like to propose adding an argument to disable this behavior. The current behavior breaks a somewhat reasonable invariant that you can restart your program via \`os.execv(\[sys.executable\] + sys.argv)\`.

I don't know enough about pex to really dig into what it is trying to do so this is tangential to answering your question but:

Sorry about that - a pex file is kind of like a relocatable virtualenv in one zip file. When it runs, it first executes some pex-specific code to extract packages (.egg, .whl) and add them to sys.path before running the actual user code. It's conceptually similar to a fat .jar file in JVM projects - all you need is \`python\`/\`java\` and all the code is in one file.

sys.executable may be None. ex: If you're an embedded Python interpreter there is no Python executable. It cannot be blindly used to re-execute the current process.
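As a rough sketch of what a non-blind version could look like, the helper below (the name `restart_command` is made up here) builds the restart command only when `sys.executable` is usable and otherwise returns `None`, so the caller has to decide what to do.

```python
import sys


def restart_command():
    """Return the argument list for re-running this process, or None if unknown.

    sys.executable can be an empty string or None when Python cannot
    determine its own executable (for example, when embedded).
    """
    if not sys.executable:
        return None
    return [sys.executable] + list(sys.argv)


if __name__ == "__main__":
    cmd = restart_command()
    if cmd is None:
        print("cannot restart: no Python executable available")
    else:
        # A real restart would be: os.execv(cmd[0], cmd)
        print("would re-execute:", cmd)
```

This only addresses the missing-executable case; it still assumes `sys.argv` reflects how the process was started, which is the invariant under discussion.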

sys.argv represents the C main() argv array. Your inclination (in the linked to bug above) to leave sys.argv\[0\] alone is a good one.

I was originally going to argue for getting rid of the feature entirely, but if runpy is to live up to the promise of being exactly the same as \`python -m XXX yyy zzz\`, it needs to be there. IMO it's bad form to depend on sys.argv\[0\] for anything but presentation purposes - usage messages and the like. It's hard to justify breaking compatibility for that though - unfortunately the runpy interface isn't pliable enough to really reimplement or unimplement this feature, and doing it by hand is...painful. You can also make an argument that \`python runmodule.py module a b\` and \`python -m module a b\` should be \_exactly\_ the same output, especially if \`runmodule.py\` is implementing something pass-through like profiling, coverage or tracing. A nicer interface might be some sort of callback to "do whatever you want before the module is executed", but that might be overkill.

-gps

Moreover it might be user-friendly to add an \`argv=sys.argv\[1:\]\` argument to set & restore the full arguments to the module, where \`argv=None\` disables argv\[0\] switching.
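For illustration, here is one possible reading of that proposal, done by hand outside runpy. The function `run_module_keep_argv0` and its `argv` parameter are hypothetical and not an existing runpy API; the sketch only handles plain top-level modules (no packages or `__main__` submodules) and assumes the loader provides `get_code`.

```python
import importlib.util
import sys
import types


def run_module_keep_argv0(mod_name, run_name="__main__", argv=None):
    """Hypothetical sketch: run a module with sys.modules updated, argv optional.

    argv=None leaves sys.argv completely untouched; otherwise sys.argv is
    temporarily set to [module file] + argv and restored afterwards.
    """
    spec = importlib.util.find_spec(mod_name)
    if spec is None:
        raise ImportError("No module named %r" % mod_name)
    code = spec.loader.get_code(mod_name)  # assumes an InspectLoader-style loader

    module = types.ModuleType(run_name)
    module.__dict__.update(__file__=spec.origin, __spec__=spec,
                           __loader__=spec.loader, __package__=spec.parent)

    missing = object()
    saved_module = sys.modules.get(run_name, missing)
    saved_argv = list(sys.argv)
    sys.modules[run_name] = module
    if argv is not None:
        sys.argv[:] = [spec.origin] + list(argv)
    try:
        exec(code, module.__dict__)
    finally:
        # Restore both pieces of global state, even if the module raised.
        sys.argv[:] = saved_argv
        if saved_module is missing:
            del sys.modules[run_name]
        else:
            sys.modules[run_name] = saved_module
    return dict(module.__dict__)
```

A call such as `run_module_keep_argv0("some_module")` would keep `sys.argv[0]` pointing at the pex file while still making `sys.modules["__main__"]` resolve to the running module; passing `argv=["a", "b"]` would mimic `python -m some_module a b`.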

What do you think?

Mike.

\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev