[Python-Dev] [PEP 3148] futures - execute computations asynchronously

Brian Quinlan brian at sweetapp.com
Sat Mar 6 11:32:50 CET 2010


On 6 Mar 2010, at 17:50, Phillip J. Eby wrote:

At 01:19 AM 3/6/2010, Jeffrey Yasskin wrote:

On Fri, Mar 5, 2010 at 10:11 PM, Phillip J. Eby <pje at telecommunity.com> wrote:

I'm somewhat concerned that, as described, the proposed API ... [creates] yet another alternative (and mutually incompatible) event loop system in the stdlib ...

Futures are a blocking construct; they don't involve an event loop.

And where they block is in a loop, waiting for events (completed promises) coming back from other threads or processes. The Motivation section of the PEP also stresses avoiding reinvention of such loops, and points to the complication of using more than one at a time as a justification for the mechanism. It seems relevant to at least address why wrapping multiprocessing and multithreading is appropriate, but not dealing with any other form of sync/async boundary, or composition of futures.

On which subject, I might add, the PEP is silent on whether executors are reentrant to the called code. That is, can I call a piece of code that uses futures, using the futures API? How will the called code know what executor to use? Must I pass it one explicitly? Will that work across threads and processes, without explicit support from the API?

Executors are reentrant but deadlock is possible. There are two
deadlock examples in the PEP.
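
For instance, the reentrant-but-deadlock-prone case looks roughly like the following sketch (in the spirit of the PEP's examples, not copied from them; it uses the draft-era max_threads spelling seen later in this mail):

    from futures import ThreadPoolExecutor  # draft-era module name

    executor = ThreadPoolExecutor(max_threads=1)

    def wait_on_future():
        # Reentrant submission from inside a running task is allowed...
        f = executor.submit(pow, 5, 2)
        # ...but this blocks forever: the pool's only worker thread is
        # the one executing wait_on_future, so f is never scheduled.
        return f.result()

    executor.submit(wait_on_future)

With more worker threads this particular example would complete; the deadlock comes from waiting, inside a worker, on work that only the same saturated pool can run.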

IOW, as far as I can tell from the PEP, it doesn't look like you can compose futures without global knowledge of the application... and in and of itself, this seems to negate the PEP's own motivation to prevent duplication of parallel execution handling! That is, if I use code from module A and module B that both want to invoke tasks asynchronously, and I want to invoke A and B asynchronously, what happens? Based on the design of the API, it appears there is nothing you can do except refactor A and B to take an executor as a parameter, instead of creating their own.

A and B could both use their own executor instances. You would need to
refactor A and B if you wanted to manage thread and process counts
globally.
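
A minimal sketch of both options, with do_a_work standing in for module A's real work function (hypothetical, not from the PEP):

    from futures import ThreadPoolExecutor  # draft-era module name

    # Option 1: module A owns its executor. Composes fine, but the
    # thread count is a local decision.
    def run_a_tasks(items):
        with ThreadPoolExecutor(max_threads=5) as executor:
            fs = [executor.submit(do_a_work, i) for i in items]
            return [f.result() for f in fs]

    # Option 2: refactored to accept an executor, so the caller can
    # manage thread counts globally across A and B.
    def run_a_tasks(items, executor):
        fs = [executor.submit(do_a_work, i) for i in items]
        return [f.result() for f in fs]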

It seems therefore to me that either the proposal does not define its scope/motivation very well, or it is not well-equipped to address the problem it's setting out to solve. If it's meant to be something less ambitious -- more like a recipe or example -- it should properly motivate that scope. If it's intended to be a robust tool for composing different pieces of code, OTOH, it should absolutely address the issue of writing composable code... since that seems to be what it says the purpose of the API is. (I.e., composing code to use a common waiting loop.)

My original motivation when designing this module was having to deal
with a lot of code that looks like this:

def get_some_user_info(user):
    x = make_ldap_call1(user)
    y = make_ldap_call2(user)
    z = [make_db_call(user, i) for i in something]
    # Do some processing with x, y, z and return a result

Doing these operations serially is too slow. So how do I parallelize
them? Using the threading module is the obvious choice, but having to
create my own work/result queue every time I encounter this pattern is
annoying (a sketch of that hand-rolled version appears after the
futures example below). The futures module lets you write this as:

def get_some_user_info(user):
    with ThreadPoolExecutor(max_threads=10) as executor:
        x_future = executor.submit(make_ldap_call1, user)
        y_future = executor.submit(make_ldap_call2, user)
        z_futures = [executor.submit(make_db_call, user, i)
                     for i in something]
        finished, _ = wait([x_future, y_future] + z_futures,
                           return_when=FIRST_EXCEPTION)
    for f in finished:
        if f.exception():
            raise f.exception()
    x = x_future.result()
    y = y_future.result()
    z = [f.result() for f in z_futures]
    # Do some processing with x, y, z and return a result
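
For contrast, the hand-rolled work/result-queue version alluded to above looks roughly like this (a sketch only, using the same hypothetical make_ldap_call1/make_ldap_call2/make_db_call helpers; exception propagation is omitted, which is exactly the bookkeeping futures does for you):

    import threading
    import queue  # spelled Queue on the Python 2 of this era

    def get_some_user_info(user):
        results = queue.Queue()

        def run(key, fn, *args):
            # Each worker drops a (key, result) pair on the queue.
            results.put((key, fn(*args)))

        calls = [('x', make_ldap_call1, user),
                 ('y', make_ldap_call2, user)]
        calls += [(('z', i), make_db_call, user, i) for i in something]

        threads = [threading.Thread(target=run, args=c) for c in calls]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        collected = {}
        while not results.empty():
            key, value = results.get()
            collected[key] = value

        x = collected['x']
        y = collected['y']
        z = [collected[('z', i)] for i in something]
        # Do some processing with x, y, z and return a result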

And, existing Python async APIs (such as Twisted's Deferreds) actually address this issue of composition; the PEP does not. Hence my comments about not looking at existing implementations for API and implementation guidance. (With respect to what the API needs, and how it needs to do it, not necessarily directly copying actual APIs or implementations. Certainly some of the Deferred API naming has a rather, um, "twisted" vocabulary.)

Using Twisted (or any other asynchronous I/O framework) forces you to
rewrite your I/O code. Futures do not.
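
That is, an existing blocking function can be submitted as-is; only the call site changes. A sketch, with fetch_profile as a hypothetical stand-in for existing blocking code:

    from futures import ThreadPoolExecutor  # draft-era module name

    def fetch_profile(user):
        # Ordinary blocking I/O, written with no knowledge of futures.
        return make_ldap_call1(user)

    with ThreadPoolExecutor(max_threads=2) as executor:
        future = executor.submit(fetch_profile, 'someuser')  # unchanged
        profile = future.result()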

Cheers, Brian


