[Python-Dev] Re: Can we limit the effects of module execution to sys.modules? (was Fix import errors to have data)

Tim Peters tim.peters at gmail.com
Sat Jul 31 06:17:15 CEST 2004


[Tim, wants to keep insane modules out of sys.modules]

[Jim Fulton]

I sympathize with your frustration with this problem, but I think that the problem is bigger than just sys.modules. For better or worse, importing a module may have side effects that extend beyond sys.modules. For example, in some applications, objects get registered into registries that exist in already-imported modules. Perhaps we want to declare this to be poor style. If a module has an impact beyond new modules added to sys.modules, then removing all modules imported into sys.modules as a result of attempting the import would produce bugs even more subtle than what we have now.
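To make the registry concern concrete, here is a minimal sketch. The names `demo_registry`, `demo_plugin`, and `handlers` are hypothetical stand-ins, and the "import" is simulated with `exec` rather than the real import machinery. It shows that undoing the sys.modules entry alone does not undo a registration made in another module.

```python
import sys
import types

# Hypothetical already-imported module that holds a registry.
registry_mod = types.ModuleType("demo_registry")
registry_mod.handlers = {}
sys.modules["demo_registry"] = registry_mod

# Hypothetical module body: registers itself elsewhere, then fails.
PLUGIN_SOURCE = '''
import demo_registry
demo_registry.handlers["plugin"] = object()  # effect outside this module
raise RuntimeError("initialization failed after registering")
'''

def import_plugin():
    """Simulate an import that rolls back only sys.modules on failure."""
    module = types.ModuleType("demo_plugin")
    sys.modules["demo_plugin"] = module
    try:
        exec(PLUGIN_SOURCE, module.__dict__)
    except Exception:
        sys.modules.pop("demo_plugin", None)  # the only thing rolled back
        raise
    return module

try:
    import_plugin()
except RuntimeError:
    pass

plugin_in_sys_modules = "demo_plugin" in sys.modules
stale_registration = "plugin" in registry_mod.handlers
print(plugin_in_sys_modules, stale_registration)
```

The failed module is gone from sys.modules, yet the registry still refers to an object it created.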

I wouldn't want to remove all, just the modules that failed. For example,

A imports B
    B imports C # no problem
    B imports D # and that raises an exception not caught by B

C is fine, I only want to nuke D and B.
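As an illustration only, the sketch below plays the role of A on a present-day CPython 3 interpreter, which (as far as I know) drops a module from sys.modules when its body raises. The module names `demo_b`, `demo_c`, and `demo_d` are made up for the demo and are written to a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

NAMES = ("demo_b", "demo_c", "demo_d")

def run_demo():
    """Import B, which imports C (fine) and D (raises); report what survives."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "demo_c.py").write_text("VALUE = 1\n")
        (root / "demo_d.py").write_text("1 / 0\n")  # raises at import time
        (root / "demo_b.py").write_text("import demo_c\nimport demo_d\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            try:
                import demo_b  # noqa: F401
            except ZeroDivisionError:
                pass  # B did not catch D's error, so it reaches us
            return {name: name in sys.modules for name in NAMES}
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in NAMES:
                sys.modules.pop(name, None)

result = run_demo()
print(result)
```

On such an interpreter C stays in sys.modules while B and D do not.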

As to style, in my own code I strive to make modules "reload safe". So, for example, I wouldn't even consider doing one of these things as a side effect of merely importing a module:

Now that said, I've only seen imports wrapped in a try in two ways:

1.  try:
        import X
    except ImportError:
        something

That's invariably trying to check for the availability of X, though, not also trying to check for whether something X imports doesn't exist. If you pursue a saner way to write that, I'll always use it.

2.  try:
        import X
    except:
        something

That one is almost always a mistake, as a bare "except" is almost always a mistake in any context. The author almost always intended the same thing as #1, but was too lazy or inexperienced to write that. Bug in, bugs out. That a later attempt to import X doesn't also fail is a bug magnifier.

I've never seen something like

    try:
        import X
    except ZeroDivisionError:
        something

If, as I suspect, nobody (and "almost nobody" is the same to me) intends to catch an error from an import other than ImportError, then import errors other than ImportError are fatal soon after in practice, and then there's nothing much to worry about.

Catching ImportError still leaves insane modules around, though, and that does cause real problems. You've convinced me I'd rather have a better way to spell "does X exist?" than catching an ImportError from an attempt to import X.
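One way to spell "does X exist?" without running X, sketched with `importlib.util.find_spec` from the present-day standard library (it did not exist when this message was written). The helper name `module_exists` is hypothetical. Note the caveat that for a dotted name, `find_spec` imports the parent packages, so this is only side-effect free for top-level names.

```python
import importlib.util
import sys

def module_exists(name):
    """Return True if `name` looks importable, without executing it.

    Caveat: for a dotted name such as "pkg.sub", find_spec imports
    "pkg" first, so parent package code does run.
    """
    if name in sys.modules:
        return True
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing or broken parent package ends up here.
        return False

print(module_exists("json"))
print(module_exists("surely_no_such_module_xyz"))
```

A True answer says only that a loader was found; the module body can still fail when it is actually imported.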

Do you think it's practical to limit the effects of module import to sys.modules, even by convention?

I'm sure you didn't intend that to be so extreme -- like surely a module is allowed to initialize its own module-level variables. If I read "effects" as "effects visible outside the module", then that's what you said.

Could we say that it is a bug for code executed during module import to mutate other modules, including mutating objects contained in those other modules? I would support this myself.

It's hard to spell the intent precisely. "reload safe" covers a world of non-local (wrt the module in question) effects that both are and aren't problematic. For example, calling random.random() during module initialization should be fine, but it certainly mutates state, and irrevocably so, inside the random module. Because it's hard to be precise here, best practice is likely to remain more a matter of good judgment than of legislation.
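A quick check of the random.random() point, using the module's documented `getstate()`: a single call changes the shared generator state that lives inside the random module.

```python
import random

before = random.getstate()   # snapshot of the module-level generator
random.random()              # an "innocent" call, e.g. at import time
after = random.getstate()

state_changed = before != after
print(state_changed)
```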

If it is possible to limit the effects of import (even by convention), then I think it would be practical to roll back changes to sys.modules. If it's not practical to limit the effects of module import then I think the problem is effectively unsolvable, short of making Python transactional.

There we don't agree -- I think it's already practical, given that virtually no Python application intends to catch errors from imports other than ImportError, so that almost all "real bugs" in module initialization are intended to stop execution. In turn, in the cases where ImportErrors are intentionally caught now, they generally occur in "import blocks" near the starts of all modules in the failing import chain, and so none of the modules involved have yet done any non-trivial initialization -- they're all still trying to import the stuff they need to start doing the meat of their initialization. If some modules happen to import successfully along the way, fine, they should stay in sys.modules, and then importing them again later won't run their initialization code again. IOW, once a module has announced its sanity by importing successfully, I want that to "stick" no matter what happens later.

Personally I'm inclined to consider errors that occur while executing a module to be pretty much fatal. If a module has begun executing, I really don't know what state it's in or what state it might have left other modules in. I'd rather report the error and get some human to fix it.

I think that's a widespread belief too. Heck, if Zope doesn't violate it, who else would be so perverse?

OTOH, I'm happy to recover from the inability to find a module as long as no module code has been executed.

Having a clearer way to determine module availability/existence would be a real help.

FWIW, in Zope, we generally limit non-transactional state changes to program startup. For that reason, we make little or no attempt to survive startup errors.

I've never tried to survive a startup error myself either, nor have any Python projects I'm aware of attached to any of my previous employers. Anyone else?


