[Python-Dev] Context management patterns
Nick Coghlan ncoghlan at gmail.com
Sat Oct 19 08:38:39 CEST 2013
On 18 October 2013 03:25, Glenn Linderman <v+python at g.nevcal.com> wrote:
First, thanks for the education. What you wrote is extremely edifying about more than just context managers, and I really appreciate the visionary understanding you reported from BrisPy and further elucidated on, regarding the educational pattern of using things before you learn how they work... that applies strongly in arenas other than programming as well:
- you learn how to walk before you understand the musculoskeletal physics
- you learn how to turn on/off the lights before you understand how electricity works
- you learn how to drive before you learn how/why a vehicle works
- you learn how to speak before you understand how grammar works
- you learn how to locate the constellations before you understand interplanetary gravitational forces
- many, many, many, many more things
And of course, many people never reach the understanding of how or why for many things they commonly use, do, or observe. That's why some people make things happen, some people watch what happens, and some people wonder "What happened?"
What it doesn't do, though, is address the dubious part of the whole construct, which is composition.
However, it's important to be clear as to whether the composition problems are specifically with the context manager form or if they also apply to the underlying exception handling pattern.
Barry raised a good point the other day about how context managers encapsulate exception handling patterns, and that the level shift happening with contextlib.suppress vs previous standard library usage is that it actually takes advantage of the fact that the "with" statement is a control flow construct that supports suppressing raised exceptions. By contrast, the previously extracted patterns encountered in the standard library are all about correct resource handling and use either the try-finally or the try-except-raise patterns that don't impact control flow.
Here are the clearest patterns I've personally noticed in the time since the with statement was added:
deterministic resource management (not a control flow pattern)
    try:
        ...
    finally:
        x.close() # Or otherwise clean up
- aside from calling close() methods, generally resource specific
- closing files, sockets, etc
- dropping memoryview buffer references
- contextlib.closing
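As a minimal sketch of this first pattern, the written-out try/finally and contextlib.closing can be compared side by side. The Resource class and both function names are hypothetical stand-ins invented for illustration; only contextlib.closing is a real standard library API.

```python
from contextlib import closing


class Resource:
    """Hypothetical stand-in for anything that exposes a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def use_with_try_finally(x):
    try:
        raise RuntimeError("simulated failure while using the resource")
    finally:
        x.close()  # Runs whether or not the body raised


def use_with_closing(x):
    # closing() calls x.close() on exit and does not suppress the exception
    with closing(x):
        raise RuntimeError("simulated failure while using the resource")
```

In both cases the RuntimeError still reaches the caller, which is why this is not a control flow pattern.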
deterministic state management (not a control flow pattern)
    try:
        ...
    finally:
        ...
- specific to the state being managed
- lock acquisition/release pairs
- decimal context management
- monkey patching (including standard stream redirection)
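A short sketch of the state management pattern, using decimal context management as the example. The two function names are made up for illustration; decimal.getcontext() and decimal.localcontext() are the real APIs.

```python
import decimal


def third_with_try_finally(prec):
    ctx = decimal.getcontext()
    saved = ctx.prec
    ctx.prec = prec  # Enter the temporary state
    try:
        return decimal.Decimal(1) / decimal.Decimal(3)
    finally:
        ctx.prec = saved  # Always restore the previous state


def third_with_localcontext(prec):
    # localcontext() installs a copy of the current context and
    # restores the original one when the block exits
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        return decimal.Decimal(1) / decimal.Decimal(3)
```

Either way, the precision seen by the rest of the program is unchanged once the function returns.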
transaction management (not a control flow pattern)
    try:
        ...
    except:
        ...
        raise
    else:
        ...
- database sessions
- conditional resource cleanup (i.e. only in failure case)
- specific to the kind of transaction being managed
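One way the transaction pattern might be packaged as a context manager is sketched below. The Session class and the transaction() helper are hypothetical and only record which call was made; a real database API would differ.

```python
from contextlib import contextmanager


class Session:
    """Hypothetical stand-in for a database session."""

    def __init__(self):
        self.events = []

    def commit(self):
        self.events.append("commit")

    def rollback(self):
        self.events.append("rollback")


@contextmanager
def transaction(session):
    try:
        yield session
    except BaseException:
        session.rollback()  # Clean up only in the failure case
        raise  # Re-raise: the failure is never swallowed here
    else:
        session.commit()
```

A body that completes normally ends in commit(); a body that raises ends in rollback() and the exception continues to propagate.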
logging unhandled exceptions (not a control flow pattern)
    try:
        ...
    except Exception as exc: # Exception is a good default, but needs to be configurable
        log.exception(exc)
        raise # This is important, since suppressing the exception is a separate decision
- specific to a logging framework
- in practice, usually just written out and combined with exception suppression in long running server processes
- not commonly seen in scripts, since those rely on the unhandled exception display in the interpreter
- stdlib logging module could possibly offer a "logging.log_unhandled" context manager, permitting things like:
    with suppress(Exception):
        with log_unhandled():
            ...
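No log_unhandled helper exists in the logging module; the following is only a sketch of what one could look like, with an assumed signature (an explicit logger argument plus a configurable exception type). The run_job function is likewise hypothetical and shows the combination with contextlib.suppress.

```python
import logging
from contextlib import contextmanager, suppress


@contextmanager
def log_unhandled(logger, exc_type=Exception):
    """Hypothetical helper: log exceptions of exc_type, then re-raise them."""
    try:
        yield
    except exc_type:
        logger.exception("Unhandled exception")
        raise  # Suppression stays a separate, explicit decision


def run_job(job, logger):
    # The outer manager decides to carry on; the inner one only logs
    with suppress(Exception):
        with log_unhandled(logger):
            job()
```

Keeping the two managers separate mirrors the point above: logging and suppression are independent decisions that happen to compose.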
suppressing expected exceptions that aren't errors (control flow pattern!)
    try:
        ...
    except ExpectedError: # placeholder for whichever exception(s) are expected
        pass
- general pattern, now explicitly named as contextlib.suppress
- indicates exceptions that aren't really "exceptions" in that specific context, but just a different acceptable result
- once converted to the class based implementation, will be stateless and reusable
- consider if, instead of accepting things like "ignore_errors" flags and/or error callbacks, iteration constructs like shutil.rmtree instead accepted a context manager to wrap around the innermost calls where exceptions are anticipated.
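To make the "stateless and reusable" point concrete, here is a rough class-based sketch. The names suppress_sketch, ignore_missing and remove_all are invented for this example and are not the standard library implementation; remove_all illustrates the idea of an iteration helper accepting a context manager in place of an "ignore_errors" flag.

```python
import os


class suppress_sketch:
    """Illustrative class-based manager: suppress the given exception types."""

    def __init__(self, *exceptions):
        self._exceptions = exceptions  # Fixed at construction, never mutated

    def __enter__(self):
        pass

    def __exit__(self, exc_type, exc_value, tb):
        # Returning a true value tells the with statement to swallow the exception
        return exc_type is not None and issubclass(exc_type, self._exceptions)


# No per-use state is stored, so one instance can be shared and reused
ignore_missing = suppress_sketch(FileNotFoundError)


def remove_all(paths, guard=ignore_missing):
    """Hypothetical helper: the caller chooses how failures are handled."""
    for path in paths:
        with guard:  # Wraps only the innermost call where errors are expected
            os.remove(path)
```

A caller wanting different behaviour would pass a different manager as guard rather than toggling a flag.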
There are now two more control flow patterns that I'm considering adding to contextlib. However, I'd experiment with them in contextlib2 before adding them to the standard library's contextlib module:
delayed failure handling
    try:
        i = data.index(target)
    except ValueError:
        i = None
    # later
    if i is None:
        ...  # Handle the "not found" case
    else:
        ...  # Do something with the value
this could be rewritten as (I believe credit is due to RDM for the name):
    with catch(ValueError) as missing:
        i = data.index(target)
    # later
    if missing:
        # Handle the "not found" case
        # The caught exception would be available as missing.exception
        ...
    else:
        ...  # Do something with the value
- unittest.TestCase.assertRaises is an existing construct along these lines
- can use a similarly stateless and reusable class-based implementation to that which will be used for suppress
- such a "catch" decorator could also take care of saving the exception and calling traceback.clear_frames() on it for easy introspection without inadvertently keeping vast swathes of local objects alive in CPython
- unfortunately, warnings.catch_warnings is misnamed - it's really a state management context manager for the warning filter state, with an option to record warnings via monkeypatching. Alas, I understand these concepts far better now than I did back when we extracted that API from the 'check_warnings' helper in the test suite, so I didn't realise the name was wrong until long after it had been published. If we add contextlib.catch in 3.5, it would probably be worth going through the deprecation dance needed to rename catch_warnings to something more sensible like "warning_context".
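A possible shape for the proposed "catch" construct is sketched below. This is not an existing contextlib API, and for brevity the sketch stores the caught exception on the manager itself, so unlike the stateless design described above each instance should be used only once. The find() helper is hypothetical; note that list.index raises ValueError when the item is missing.

```python
import traceback


class catch:
    """Hypothetical sketch: record a matching exception instead of raising it."""

    def __init__(self, *exceptions):
        self._exceptions = exceptions
        self.exception = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None or not issubclass(exc_type, self._exceptions):
            return False  # Nothing to catch; other exceptions propagate
        self.exception = exc_value
        # Clear locals of finished frames so the saved exception
        # does not keep them alive
        traceback.clear_frames(tb)
        return True

    def __bool__(self):
        return self.exception is not None


def find(data, target):
    with catch(ValueError) as missing:
        i = data.index(target)
    if missing:
        return None  # Handle the "not found" case
    return i
```

The failure is detected at one point and handled later, without a sentinel value such as None having to stand in for "not found" in between.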
constrained jumps
    # Search loop
    for item in data:
        if is_desired_result(item):
            result = item
            break
    else:
        ...  # Handle the "not found" case
    # Do something with "result"
using a suitable context manager, this can be rewritten as:
    # Search loop
    with exit_label() as found:
        for item in data:
            if is_desired_result(item):
                found.exit(item)
    # Later
    if found:
        ...  # Do something with "found.value"
    else:
        ...  # Handle the "not found" case
- this is the exit_label() idea I posted earlier, but without the ability to specify the exception type (since I realised that's a separate pattern, better handled as the distinct "catch" construct)
- rather than replacing exception handling constructs, it replaces break/else search loops with something that's hopefully easier to understand
- it's also a generalisation of the SystemExit/GeneratorExit pattern, so the exception types it uses for internal flow control would inherit directly from BaseException
- this is the pattern on the list that gets the closest to "goto" like behaviour, since it permits arbitrary "bail out now" behaviour (by passing the exit label to other operations), but the fact every label uses a custom exception type derived directly from BaseException would mean it still ends up being quite heavily constrained (since the only thing it would be able to catch is the exception thrown by calling the exit() method on the label).
- this pattern would be stateful and explicitly not reusable (acting as a further constraint on abuse)
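The exit_label() idea could be prototyped roughly as follows. Everything here is an assumption made for illustration (attribute names, the RuntimeError on reuse, the first_match() helper); the sketch only tries to honour the constraints listed above: a per-label exception type derived directly from BaseException, and a stateful, single-use object.

```python
class exit_label:
    """Hypothetical sketch of a single-use 'jump out of this block' label."""

    def __init__(self):
        # Each label gets its own exception type, so a label can only
        # ever catch the exception raised by its own exit() method
        self._exit_exc = type("_ExitLabel", (BaseException,), {})
        self._entered = False
        self._triggered = False
        self.value = None

    def exit(self, value=None):
        self.value = value
        self._triggered = True
        raise self._exit_exc()

    def __enter__(self):
        if self._entered:
            raise RuntimeError("exit_label objects are not reusable")
        self._entered = True
        return self

    def __exit__(self, exc_type, exc_value, tb):
        return exc_type is self._exit_exc  # Swallow only our own exit

    def __bool__(self):
        return self._triggered


def first_match(data, is_desired_result):
    with exit_label() as found:
        for item in data:
            if is_desired_result(item):
                found.exit(item)
    if found:
        return found.value
    return None  # The "not found" case
```

Because the exception type is private to each label, an inner label lets an outer label's exit pass straight through, which keeps the jump targets explicit.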
On 10/17/2013 8:26 AM, Nick Coghlan wrote:
And even a two line version:
    with suppress(FileNotFoundError): os.remove("somefile.tmp")
    with suppress(FileNotFoundError): os.remove("someotherfile.tmp")
The above example, especially if extended beyond two files, begs to be used in a loop, like your 5 line version:
    for name in ("somefile.tmp", "someotherfile.tmp"):
        with suppress(FileNotFoundError):
            os.remove(name)
which would be fine, of course. But to some with less education about the how and why, it is not clear why it couldn't be written like:
    with suppress(FileNotFoundError):
        for name in ("somefile.tmp", "someotherfile.tmp"):
            os.remove(name)
yet to the cognoscenti, it is obvious there are seriously different semantics.
However, that's a confusion about exception handling in general, not about the suppress context manager in particular. The same potential for conceptual confusion exists between:
    for name in ("somefile.tmp", "someotherfile.tmp"):
        try:
            os.remove(name)
        except FileNotFoundError:
            pass
and:
    try:
        for name in ("somefile.tmp", "someotherfile.tmp"):
            os.remove(name)
    except FileNotFoundError:
        pass
At the syntactic level, when composing compound statements, the order of nesting always matters. The with/for and for/with constructs are different, just as if/for and for/if are different. If a student makes it through an introductory Python course without learning that much, I'd have grave doubts about that course :)
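The difference between the two nestings can be observed directly with a small experiment. The function names below are invented for this illustration; contextlib.suppress and os.remove are used as in the examples above.

```python
import os
from contextlib import suppress


def remove_each(names):
    # for/with: a missing file is skipped and the loop carries on
    for name in names:
        with suppress(FileNotFoundError):
            os.remove(name)


def remove_until_first_missing(names):
    # with/for: the first missing file ends the whole loop,
    # so any later names are never processed
    with suppress(FileNotFoundError):
        for name in names:
            os.remove(name)
```

Given a missing file followed by an existing one, the second function leaves the existing file in place while the first removes it, exactly as with the equivalent try/except spellings.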
In my own code, I have a safedelete function to bundle the exception handling and the os.remove, and when factored that way, the temptation to nest the loop inside the suppress is gone. With suppress available, though, and if used, the temptation to factor it, either correctly or incorrectly, appears. How many cut-n-paste programmers will get it right and how many will get it wrong, is the serious question here, I think, and while suppress is a slightly better term than ignore, it still hides the implications to the control flow when an exception is actually raised within the block.
A lot of written out exception handling is better abstracted away into a helper function. However, the body of the try block can't always be factored out without creating swiss-army functions with more knobs and dials than subprocess.Popen, and those are the cases where using a context manager instead really shines.
I'm still dubious that this simpler construct, while an interesting composition of powerful underlying constructs, has sufficient benefit to outweigh the naïve user's potential for misusing it (exacerbated by a name that doesn't imply control flow), or even the extra cost in performance per the microbenchmark someone published.
In the case of contextlib.suppress, my interest is mostly in giving this particular pattern a name, although it also allows for finer granularity in deciding exactly which exceptions to ignore. For example, most typical "safe_remove" functions ignore all OSErrors, while the contextlib.suppress example in the docs is deliberately constrained to only ignore FileNotFoundError (since "I don't care if it's already gone" is certainly a reasonable thing to say, while "I don't care if I don't have permission to remove it" is more dubious).
This inherent ability to suppress exceptions means that "with" statements are a control flow construct, and always have been since we added them in PEP 343. However, this also gets back to Barry's point about there being a category shift here: contextlib.suppress is the first standard library context manager to actually make use of the fact that with statements are a control flow construct just as much as try/except/else/finally statements are (just a more constrained one at the point of use, which makes them easier to understand).
Your more complex examples for future versions may have greater merit because they provide a significantly greater reduction in complexity to offset the significantly greater learning curve required to use and understand them. But even those look like an expensive form of goto (of course, goto is considered harmful, and I generally agree with the reasons why, but have coded them in situations where they are more useful than harmful in languages which support them).
I imagine that everyone on python-dev is aware that most of the control flow constructs in structured programming (which is a subset of OO) are to control the context of the CPU's "instruction pointer" without the use of "goto". The real problem with "goto" is not that the instruction pointer is changed non-sequentially, but that arbitrary changes can easily violate poorly documented preconditions of the target location. Hence, structured programming is really an attempt to avoid writing documentation, a laudable goal as the documentation is seldom sufficient at that level of detail... or if sufficient, is repetitive and overwhelming to create, maintain, and comprehend. It achieves that by making control flow constructs that are "higher level" than goto, that have meanings that can be understood and explained in educational texts, which then are implicit documentation for those control flow aspects of a particular program.
OO builds on structured programming to make neat packages of state and control flow, to isolate state into understandable chunks so that larger programs can be comprehended, as the BrisPy presenter enlightened us, without understanding all the details of how each object and function within it works. Programmers raised on OO and GUI toolkits are building more and more systems out of more complex parts, which increases productivity, and that is good, although when they fail to fully understand the parts, some "interesting" performance characteristics can result.
ignore/suppress seems to me to be a sledge hammer solution for driving a tack. The tack may be driven successfully, but the potential for damage to the surroundings (by misunderstanding the control flow implications) is sufficient to make me dubious regarding its overall value. Adequate documentation may help (if it is both provided and read), but the best constructs are those that are self-documenting, or well documented in existing "programming 101" books.
I haven't seen this construct in other languages, nor has such a comparison been made in this thread, so I consider the potential for misuse large. My conclusion: suppress considered harmful, hidden goto within :)
I believe your underlying concerns are actually with the non-local flow control possibilities that are inherent in a language that offers both exceptions and the ability to suppress them. Since I'm firmly convinced that offering more structured exception handling is a vastly better solution to that problem than alternatives like Go's retreat to C-style return codes, I'll be continuing down the path of trying to extract and formalise particular patterns that constitute reasonable and common patterns for try/except/else/finally (or break/else loops!) in Python.
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia