[Python-Dev] microthreading vs. async io

Armin Rigo arigo at tunes.org
Sun Feb 25 18:18:46 CET 2007


Hi Adam,

On Thu, Feb 15, 2007 at 06:17:03AM -0700, Adam Olsen wrote:

> > E.g. have a wait(events = [], timeout = -1) method would be sufficient
> > for most cases, where an event would specify
>
> I agree with everything except this. A simple function call would have O(n) cost, thus being unacceptable for servers with many open connections. Instead you need it to maintain a set of events and let you add or remove from that set as needed.

I just realized that this is not really true in the present context. If the goal is to support programs that "look like" they are multi-threaded, i.e. don't use callbacks, as I think is Joachim's goal, then most of the time the wait() function would be called with only a single event, rarely two or three, never more. Indeed, in this model a large server is implemented with many microthreads: at least one per client. Each of them blocks in a separate call to wait(). In each such call, only the events relevant to that client are mentioned.

In other words, the cost is O(n), but n is typically 1 or 2. It is not the total number of events that the whole application is currently waiting on. Indeed, the scheduler code doing the real OS call (e.g. to select()) can collect the events in internal dictionaries, or in Poll objects, or whatever, and update these dictionaries or Poll objects with the 1 or 2 new events that a call to wait() introduces. In this respect, the act of calling wait() already means "add these events to the set of all events that need waiting for", without the need for a separate API for doing that.
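As a rough illustration of the point above, here is a minimal sketch of such a scheduler. It rests on several assumptions that are not part of the original discussion: plain Python generators stand in for microthreads, a `yield` of a list of `(socket, mask)` pairs stands in for the call to wait(), and the `Scheduler` class with its `spawn()` and `run()` methods is a hypothetical name chosen for the example. The standard `selectors` module plays the role of the "Poll object". Each wait costs work proportional to the 1 or 2 events it mentions, while the single OS-level call covers everything registered so far.

```python
import selectors
import socket
from collections import deque

READ, WRITE = selectors.EVENT_READ, selectors.EVENT_WRITE


class Scheduler:
    """Toy scheduler; generators stand in for microthreads (an assumption)."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()  # the global event set
        self.ready = deque()    # (task, value to send into it)
        self.blocked = set()    # tasks currently parked in a wait

    def spawn(self, task):
        self.ready.append((task, None))

    def _step(self, task, value):
        try:
            events = task.send(value)  # run until the task's next wait
        except StopIteration:
            return                     # the task has finished
        # Waiting *is* the "add to the global set" step. The cost is
        # proportional to len(events), typically 1 or 2.
        # Toy limitation: a socket may be waited on by one task at a time.
        for sock, mask in events:
            self.selector.register(sock, mask, (task, events))
        self.blocked.add(task)

    def run(self):
        while self.ready or self.blocked:
            while self.ready:
                self._step(*self.ready.popleft())
            if not self.blocked:
                break
            # One OS-level call on behalf of every blocked task.
            for key, mask in self.selector.select():
                task, events = key.data
                if task not in self.blocked:
                    continue  # already woken by another of its events
                self.blocked.discard(task)
                for sock, _ in events:
                    self.selector.unregister(sock)
                self.ready.append((task, (key.fileobj, mask)))


received = []


def reader(sock):
    yield [(sock, READ)]        # "wait(events=[...])" with a single event
    received.append(sock.recv(100))


def writer(sock):
    yield [(sock, WRITE)]
    sock.send(b"hello")


a, b = socket.socketpair()
sched = Scheduler()
sched.spawn(reader(a))
sched.spawn(writer(b))
sched.run()
a.close()
b.close()
sched.selector.close()
```

Note that the tasks never touch the selector directly: there is no separate add/remove API, only the wait itself.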

[Actually, I think that the simplicity of the wait(events=[]) interface over any add/remove/callback APIs is an argument in favor of the "microthread-looking" approach in general, though I know that it's a very subjective topic.]

[I have experimented myself with a greenlet-based system giving wrapper functions for os.read()/write() and socket.recv()/send(), and in this style of code we tend to simply spawn new greenlets all the time. Each one looks like an infinite loop doing a single simple job: read some data, process it, write the result somewhere else, start again. (The loops are not really infinite; e.g. if sockets are closed, an exception is generated, and it causes the greenlet to exit.) So far I've managed to always wait on a single event in each greenlet, but sometimes it was a bit contrived and being able to wait on 2-3 events would be handy.]
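The "infinite loop doing a single simple job" style might look roughly like the sketch below. This is not the greenlet-based system described above: to stay within the standard library, generators with `yield from` are swapped in for greenlets, and the names `recv()`, `send()`, `Closed`, `upper_pipe()` and `run()` are all hypothetical. Each task waits on exactly one event at a time, and a closed socket raises an exception that ends the loop.

```python
import select
import socket


class Closed(Exception):
    """Hypothetical: raised by the wrappers when the peer has closed."""


def recv(sock, n):
    yield ("r", sock)          # park this task until sock is readable
    data = sock.recv(n)
    if not data:
        raise Closed
    return data


def send(sock, data):
    while data:
        yield ("w", sock)      # park this task until sock is writable
        sent = sock.send(data)
        data = data[sent:]


def upper_pipe(src, dst):
    # Looks like an infinite loop: read, process, write, start again.
    # It ends when recv() raises Closed.
    while True:
        data = yield from recv(src, 1024)
        yield from send(dst, data.upper())


def run(tasks):
    readers, writers = {}, {}  # socket -> the one task waiting on it
    runnable = list(tasks)
    while True:
        for task in runnable:
            try:
                kind, sock = next(task)
            except (StopIteration, Closed):
                continue       # the task's loop has ended
            (readers if kind == "r" else writers)[sock] = task
        if not (readers or writers):
            break
        r, w, _ = select.select(list(readers), list(writers), [])
        runnable = [readers.pop(s) for s in r] + [writers.pop(s) for s in w]


client, server_in = socket.socketpair()
server_out, sink = socket.socketpair()
client.sendall(b"hello")
client.close()                 # makes the task's second recv() raise Closed

run([upper_pipe(server_in, server_out)])
result = sink.recv(1024)

for s in (server_in, server_out, sink):
    s.close()
```

Waiting on 2-3 events at once would need the wrappers to yield a list of events instead of a single pair, which this sketch deliberately leaves out.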

A bientot,

Armin.


