Make get_event_loop() return the current loop if called from coroutines/callbacks by 1st1 · Pull Request #452 · python/asyncio
IMHO the workaround you mentioned is not robust, in that one may not control the main run loop (such as when running an aiohttp server). In my example I run an aiohttp server that dynamically launches processes from a worker pool to process tens of thousands of S3 requests.
Forking from within a running HTTP server coroutine is a very bad idea. It's a blocking operation that might take a non-trivial amount of time. You'd better pre-fork, or have a master process that controls its child processes and creates new ones when needed.
You can also use loop.subprocess_exec to safely run your S3 logic.
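A minimal sketch of that approach, using the higher-level asyncio.create_subprocess_exec wrapper around loop.subprocess_exec (the echo command is a stand-in for real S3 worker logic, and the modern asyncio.run entry point is assumed):

```python
import asyncio
import sys

async def run_worker(payload: str) -> str:
    # Spawn a fresh child process (fork+exec) instead of calling
    # os.fork() inside the running loop.  The child here just echoes
    # its argument back; a real worker would hold the S3 logic.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import sys; print(sys.argv[1])", payload,
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return out.decode().strip()

print(asyncio.run(run_worker("s3://bucket/key")))
```

Because the child is exec'd rather than forked in place, it never inherits the parent's live event loop state.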
Further, I worry about the heavy-handedness of stopping the event loop in complicated applications that have many tasks blocked waiting on input, timers, and various other events.
The event loop can safely be stopped and resumed, no matter how many timers/tasks you have. If you are worried about timeouts being triggered because you pause the loop to fork -- the same will happen when you use os.fork() (again, it's not a cheap-and-fast operation).
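The stop/resume behaviour can be seen in a small sketch (the fork itself is omitted; the point is only that pending tasks survive a loop.stop() and finish when the loop runs again):

```python
import asyncio

events: list[int] = []

async def ticker() -> None:
    for i in range(3):
        events.append(i)
        await asyncio.sleep(0)

loop = asyncio.new_event_loop()
task = loop.create_task(ticker())
loop.call_soon(loop.stop)        # pause the loop after the first pass
loop.run_forever()               # returns once stop() takes effect
# ... this is the window where a fork could happen ...
loop.run_until_complete(task)    # resume; the pending task completes
loop.close()
```

Nothing is cancelled by the stop; the task simply stays scheduled until the loop is driven again.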
IMO if uvloop crashes, that is a problem in its run-loop implementation and should be addressed separately, not affect the BaseEventLoop implementation.
True (and that segfault will soon be fixed).
But there are problems even with forking pure Python asyncio programs. For instance, epoll is fundamentally not fork-safe. If you continue to use the same event loop in the forked child process you will encounter bugs/crashes/wrong behaviour.
One solution is to use the multiprocessing module, which is supported by both asyncio and uvloop, or to do fork+exec manually.
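As a sketch of the multiprocessing route: the child process creates its own fresh event loop instead of reusing the parent's (the function names here are illustrative, not from the thread):

```python
import asyncio
import multiprocessing

def worker(key, queue):
    # Runs in the child: build a FRESH event loop here rather than
    # reusing the parent's, whose epoll fd is not fork-safe.
    async def job():
        await asyncio.sleep(0)      # stand-in for real async work
        return key.upper()
    queue.put(asyncio.run(job()))

def launch(key):
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=worker, args=(key, queue))
    proc.start()
    result = queue.get()            # blocks until the child replies
    proc.join()
    return result

if __name__ == "__main__":
    print(launch("s3-object"))
```

multiprocessing takes care of the start method (fork, spawn, or forkserver), so the parent's loop state never leaks into the child's loop.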
FYI, here is another example that just uses asyncio to fork:
[..] ProcessPoolExecutor example [..]
You can also use the run_in_executor API with process pools -- that is also fully supported (because concurrent.futures uses multiprocessing).
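A short sketch of that pattern, assuming the modern asyncio.run / get_running_loop API (the blocking_work function is a made-up placeholder):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def blocking_work(n: int) -> int:
    # Runs in a worker process; concurrent.futures manages the
    # process lifecycle safely on the event loop's behalf.
    return sum(range(n))

async def main() -> int:
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Offload CPU-bound work without ever forking the loop itself.
        return await loop.run_in_executor(pool, blocking_work, 10)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The coroutine never touches os.fork directly; the executor owns the worker processes and hands results back as futures the loop can await.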
Calling bare os.fork() is fundamentally unsafe; you simply should not use it. It's a low-level syscall, and the asyncio event loop sits at least one level above it. What you are asking for is a fix to one particular aspect of the API, so that you can continue to use something that worked by accident. But even if we make get_event_loop work, you will eventually have other problems with os.fork.
gevent, for instance, monkey-patches os.fork to make it work safely. As I said in my previous comment, we might want to add a specialized asyncio.fork method, but bare os.fork is very unlikely to ever be fully supported.
I think a case needs to be made as to whether Python's base modules are fork-safe or not. If asyncio is not fork-safe, that seems to go against the other Python base modules. What work would be required to make BaseEventLoop fork-safe?
Strictly speaking, any network application (blocking or non-blocking) is not os.fork-friendly. You have to do the forking with extra care, and generally people only do fork+exec. The officially recommended way to do multiprocessing is to use the multiprocessing package.
You also say forking from a running loop was never officially supported, but by the same argument it was never officially not-supported either, from what I gather from the docs (https://docs.python.org/3/library/asyncio-eventloop.html), and it worked.
True, this is something we will hopefully fix very soon.