[Python-Dev] Status of PEP 3145 - Asynchronous I/O for subprocess.popen

Josiah Carlson josiah.carlson at gmail.com
Fri Mar 28 06:09:47 CET 2014


By digging into the internals of a subprocess produced by Popen(), you can write in a blocking manner to the stdin pipe, and read in a blocking manner from the stdout/stderr pipe(s). For scripting most command-line operations, having timeouts and being able to stop trying to read are as important as being able to spawn an external process. Their absence kind-of kills that side of the usefulness of Python as a tool for scripting.
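As a rough illustration of the blocking approach described above, here is a minimal sketch that talks to a child process through the Popen pipes directly. The child program (a one-line Python echo) and the function name blocking_round_trip are made up for the example; the point is that readline() has no timeout, so it waits for as long as the child stays silent.

```python
import subprocess
import sys


def blocking_round_trip():
    # Hypothetical child: reads one line from stdin and prints it upper-cased.
    proc = subprocess.Popen(
        [sys.executable, "-c", "print(input().upper())"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Blocking write to the child's stdin; flush so the child actually sees it.
    proc.stdin.write(b"hello\n")
    proc.stdin.flush()
    # Blocking read: there is no timeout parameter here, so this call
    # would hang indefinitely if the child never produced a full line.
    line = proc.stdout.readline()
    proc.stdin.close()
    proc.wait()
    proc.stdout.close()
    return line
```

This works only because the child is known to answer promptly. With an interactive program that may or may not print a prompt, the same readline() call is where a script would get stuck.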

The question is not whether or not a user of Python can dig into the internals, make some calls, then get it to be non-blocking - the existence of two different patches to do so (the most recent of which is from 4 1/2 years ago) shows that it can be done. The question is whether or not the desire for the functionality warrants having functions or methods to perform these operations in the standard library.

I and others have claimed that it should go into the standard library. Heck, there was enough of a push that Eric got paid to write his version of the functionality for a GSoC project in 2009. There has even been activity on the bug itself unrelated to deferring discussions as recently as May 2012 (after which activity seems to have paused for reasons I don't know). Some people have raised reasonable questions about the API and implementation, but no one is willing to offer an alternative API that they think would be better, so discussions about implementation of a non-existent API for inclusion are moot.

But honestly, I have approximately zero faith that what I say or do will lead to the inclusion of any changes to the subprocess module. Which is why I'm offering to write a short example that uses asyncio for inclusion in the docs. It's not what I've wanted for almost 9 years, but at least it has a chance of actually happening. I'll take a chance at updating the docs instead of a 3 to 9 month bikeshedding just to lead to rejection any day.

So yeah. Someone want to make a decision? Tell me to write the docs, I will. Tell me to go take a long walk off a short pier, I'll thank you for your time and leave you alone.

On Thu, Mar 27, 2014 at 7:18 PM, Terry Reedy <tjreedy at udel.edu> wrote:

On 3/27/2014 9:16 PM, Josiah Carlson wrote:

You don't understand the point because you don't understand the feature request or PEP. That is probably my fault for not communicating the intent better in the past. The feature request and PEP were written to offer something like the below (or at least enough that the below could be built with minimal effort):

    def dologin(...):
        proc = subprocess.Popen(...)
        current = proc.recv(timeout=5)
        lastline = current.rstrip().rpartition('\n')[-1]
        if lastline.endswith('login:'):
            proc.send(username)
            if proc.readline(timeout=5).rstrip().endswith('password:'):
                proc.send(password)
                if 'welcome' in proc.recv(timeout=5).lower():
                    return proc
        proc.kill()

The API above can be very awkward (as shown :P ), but that's okay. From those building blocks a (minimally) enterprising user would add functionality to suit their needs. The existing subprocess module only offers two methods for any amount of communication over pipes with the subprocess: check_output() and communicate(), only the latter of which supports sending data (once, limited by system-level pipe buffer lengths).

Neither allow for nontrivial interactions from a single subprocess.Popen() invocation.

According to my reading of the doc, one should (in the absence of deadlocks, and without having timeouts) be able to use proc.stdin.write and proc.stdout.read. Do those not actually work?

The purpose was to be able to communicate in a bidirectional manner with a subprocess without blocking, or practically speaking, blocking with a timeout. That's where the "async" term comes from. Again, there was never any intent to have the functionality be part of asyncore or any other asynchronous sockets framework, which is why there are no handle_*() methods, readable(), writable(), etc.

Your next questions will be: But why bother at all? Why not just build the piece you need inside asyncio? Why does this need anything more? The answer to those questions are wants and needs. If I'm a user that needs interactive subprocess handling, I want to be able to do something like the code snippet above. The last thing I need is to have to rewrite the way my application/script/whatever handles everything just because a new asynchronous IO library has been included in the Python standard library - it's a bit like selling you a $300 bicycle when you need a $20 wheel for your scooter.

That there now exists the ability to have async subprocesses as part of asyncio is a fortunate happenstance, as the necessary underlying tools for building the above now exist in the standard library. It's a matter of properly embedding the asyncio-related bits inside a handful of functions to provide something like the above, which is what I was offering to write.

But why not keep working on the subprocess module? Yep. Tried that. Coming up on 9 years since I created the feature request and original Activestate recipe. To go that route is going to be 2-3 times as much work as has already been dedicated to get somewhere remotely acceptable for inclusion in Python 3.5, but more likely, subsequent rejection for similar reasons why it has been in limbo.

But here's the thing: I can build enough using asyncio in 30-40 lines of Python to offer something like the above API. The problem is that it really has no natural home. It uses asyncio, so makes no sense to put in subprocess. It doesn't fit the typical asyncio behavior, so doesn't make sense to put in asyncio. The required functionality isn't big enough to warrant a submodule anywhere. Heck, it's even way too small to toss into an external PyPI module.

But in the docs? It would show an atypical, but not wholly unreasonable use of asyncio (the existing example already shows what I would consider to be an atypical use of asyncio). It would provide a good starting point for someone who just wants/needs something like the snippet above. It is yet another use-case for asyncio. And it could spawn a larger library for offering a more fleshed-out subprocess-related API, though that is probably more wishful thinking on my part than anything.

- Josiah
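To make the "30-40 lines on top of asyncio" idea above more concrete, here is one possible sketch, not the code that was actually proposed for the docs. The class name TimedProcess, its method signatures, the choice to return empty bytes on timeout, and the toy login child are all assumptions made for illustration; only the asyncio calls (create_subprocess_exec, wait_for, the stream read/readline/drain methods) are real standard library API.

```python
import asyncio
import sys


class TimedProcess:
    """Hypothetical wrapper: blocking send/recv/readline with timeouts."""

    def __init__(self, *args):
        # A private event loop hides asyncio from the calling script.
        self._loop = asyncio.new_event_loop()
        self._proc = self._loop.run_until_complete(
            asyncio.create_subprocess_exec(
                *args,
                stdin=asyncio.subprocess.PIPE,
                stdout=asyncio.subprocess.PIPE,
            )
        )

    def _run(self, coro, timeout):
        # Run one coroutine to completion; give up after `timeout` seconds.
        try:
            return self._loop.run_until_complete(asyncio.wait_for(coro, timeout))
        except asyncio.TimeoutError:
            return b""  # assumption: a timeout is reported as "no data"

    def send(self, data, timeout=None):
        self._proc.stdin.write(data)
        self._run(self._proc.stdin.drain(), timeout)

    def recv(self, maxsize=4096, timeout=None):
        # Returns whatever is available (up to maxsize) once any data arrives.
        return self._run(self._proc.stdout.read(maxsize), timeout)

    def readline(self, timeout=None):
        return self._run(self._proc.stdout.readline(), timeout)

    def close(self):
        if self._proc.returncode is None:
            try:
                self._proc.kill()
            except ProcessLookupError:
                pass  # the child already exited on its own
        # communicate() drains the pipes and waits for the child to exit.
        self._loop.run_until_complete(self._proc.communicate())
        self._loop.close()


# Toy child used only for this demo: prompts for a login, then greets.
CHILD = (
    "import sys\n"
    "sys.stdout.write('login: '); sys.stdout.flush()\n"
    "name = sys.stdin.readline().strip()\n"
    "print('welcome ' + name, flush=True)\n"
)


def demo_login():
    proc = TimedProcess(sys.executable, "-c", CHILD)
    try:
        prompt = proc.recv(timeout=5)
        if prompt.rstrip().endswith(b"login:"):
            proc.send(b"alice\n", timeout=5)
            return proc.readline(timeout=5)
        return b""
    finally:
        proc.close()
```

The design choice here is that the caller never sees a coroutine: each method drives the private loop just long enough to finish one operation. That keeps scripts synchronous, at the cost of not composing with an application that already runs its own event loop.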

On Thu, Mar 27, 2014 at 4:24 PM, Victor Stinner <victor.stinner at gmail.com> wrote:

2014-03-27 22:52 GMT+01:00 Josiah Carlson <josiah.carlson at gmail.com>:
> * Because it is example docs, maybe a multi-week bikeshedding discussion
> about API doesn't need to happen (as long as "read line", "read X bytes",
> "read what is available", and "write this data" - all with timeouts - are
> shown, people can build everything else they want/need)

I don't understand this point. Using asyncio, you can read and write a single byte or a whole line. Using functions like asyncio.wait_for(), it's easy to add a timeout on such operation.

Victor
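For reference, the asyncio approach Victor describes might look roughly like the following. This is a minimal sketch assuming a trivial echo child; the function name read_line_with_timeout and the child program are invented for the example, while asyncio.wait_for() and asyncio.create_subprocess_exec() are the standard library calls.

```python
import asyncio
import sys


async def read_line_with_timeout(timeout=5.0):
    # Hypothetical child: reads one line from stdin and prints it upper-cased.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print(input().upper())",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    proc.stdin.write(b"hello\n")
    await proc.stdin.drain()
    try:
        # wait_for() cancels the read if no full line arrives in time.
        line = await asyncio.wait_for(proc.stdout.readline(), timeout)
    except asyncio.TimeoutError:
        proc.kill()
        line = None
    proc.stdin.close()
    await proc.wait()
    return line
```

A script would call it with asyncio.run(read_line_with_timeout()). Note that the whole calling code has to be a coroutine for this to work, which is the restructuring cost the thread is arguing about.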

-- Terry Jan Reedy

