[Python-ideas] Tulip / PEP 3156 - subprocess events

Guido van Rossum guido at python.org
Sun Jan 20 05:35:04 CET 2013

On Sat, Jan 19, 2013 at 5:51 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> But the trade-off in separating protocol creation from notification of the connection is that it means every protocol has to be written to handle the "no connection yet" gap between __init__ and the call to connection_made.

That doesn't strike me as a problematic design. I've seen it plenty of times.

> However, if we instead delay the call to the protocol factory until after the connection is made, then most protocols can be written assuming they always have a connection (at least until connection_lost is called). A persistent protocol that spanned multiple connect/reconnect cycles could be written such that you passed "myprotocol.connection_made" as the protocol factory, while normal protocols (that last only the length of a single connection) would pass "MyProtocol" directly.

Well, almost. connection_made() would have to return self to make this work. But we could certainly add some other method that did that.

(At first I thought it would be harder to pass other parameters to the constructor for the non-reconnecting case, but the solution is about the same as before -- use a partial function or a lambda that takes a protocol and calls the constructor with that and whatever other parameters it wants to pass.)
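A minimal sketch of that parenthetical, assuming the alternative design in which the factory receives the transport. FakeTransport, EchoProtocol and the greeting parameter are hypothetical names used only for illustration; none of them come from the PEP or from Tulip.

```python
import functools


class FakeTransport:
    """Hypothetical stand-in for a connected transport; it only records writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class EchoProtocol:
    """Hypothetical protocol whose constructor takes the transport plus one extra parameter."""

    def __init__(self, transport, greeting):
        self.transport = transport
        self.transport.write(greeting)


# Either form yields a one-argument callable usable as protocol_factory(transport).
factory_partial = functools.partial(EchoProtocol, greeting=b"hello\n")
factory_lambda = lambda transport: EchoProtocol(transport, b"hello\n")

transport = FakeTransport()
protocol = factory_partial(transport)
```

Both callables hide the extra constructor argument from whoever invokes the factory, which is the same trick as before with one more positional parameter.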

> At the transport layer, the two states "has a protocol" and "has a connection" could then be collapsed into one - if there is a connection, then there will be a protocol, and vice-versa. This differs from the current status in PEP 3156, where it's possible for a transport to have a protocol without a connection if it calls the protocol factory well before calling connection_made.

This doesn't strike me as important. The code I've written for Tulip puts most of the connection-making code outside the transport, and the transport constructor is completely private. Every transport implementation is completely free in how it works, and every event loop implementation is free to put as much or as little of the connection set-up in the transport as it wants to. The same is true for transports written by users (and there will be some of these). The only thing we care about for transports is that the thing passed to the protocol's connection_made() has the methods specified by the PEP (write(), writelines(), pause(), resume(), and a few more). Also, it does not matter one iota whether it is the transport or some other entity that calls the protocol's methods (connection_made(), data_received(), etc.) -- the only thing that matters is the order in which they are called.

IOW, even though a transport may "have" a protocol without a connection, nobody should care about that state, and nobody should be calling its methods (again, write() etc.) in that state. In fact, nobody except event loop internal code should ever have a reference to a transport in that state. (The transport that is returned by create_connection() is fully connected to the socket (or whatever might take its place) as well as to the protocol.)

I think we can make the same assumptions for transports implemented by user code.

> Now, it may be that there's a good reason why conflating "has a protocol" and "has a connection" at the transport layer is a bad idea, and thus we actually need the "protocol creation" and "protocol association with a connection" events to be distinct. However, the PEP currently doesn't explain why it's necessary to separate the two, hence the confusion for at least Greg, Ben and myself.

So, your whole point here seems to be that you'd rather see the PEP specify that the sequence when a connection is made is

protocol = protocol_factory(transport)

rather than

protocol = protocol_factory()
protocol.connection_made(transport)
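To make the comparison concrete, here is a rough sketch of the two sequences side by side. DummyTransport, both protocol classes and the two connect_* helpers are made up for illustration and are not taken from the PEP or from Tulip.

```python
class DummyTransport:
    """Hypothetical transport; its methods are irrelevant to the comparison."""


class FactoryTakesTransport:
    """Proposed variant: the transport arrives through the constructor."""

    def __init__(self, transport):
        self.transport = transport


class PepStyleProtocol:
    """Current PEP variant: built unconnected, told about the transport later."""

    def __init__(self):
        self.transport = None  # the "no connection yet" gap

    def connection_made(self, transport):
        self.transport = transport


def connect_proposed(protocol_factory, transport):
    # One step: the factory call is also the connection notification.
    return protocol_factory(transport)


def connect_pep(protocol_factory, transport):
    # Two steps: create the protocol, then notify it.
    protocol = protocol_factory()
    protocol.connection_made(transport)
    return protocol
```

Once either helper returns, the protocol is in the same state; the difference is only whether an unconnected instance ever exists.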

I looked in the Tulip code to see whether this would cause any problems. I think it could be done, but the solution would feel a little awkward to me, because currently the protocol's connection_made() method is not called directly by the transport: it is called indirectly via the event loop's call_soon() method. So using your approach the transport wouldn't have a protocol attribute until this callback is called -- or we'd have to change things to call it directly rather than via call_soon(). Now I'm pretty sure I can prove that nothing will be referencing the protocol before the connection_made() call is actually made, and also that directly calling it instead of using call_soon() is fine. But nevertheless the transport code would feel a little harder to reason about.
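A rough sketch of the awkwardness described above, under heavy assumptions: TinyLoop is a toy queue standing in for an event loop's call_soon(), and the two transport classes are hypothetical shapes, not the actual Tulip code.

```python
from collections import deque


class TinyLoop:
    """Toy stand-in for an event loop: call_soon() queues, run_pending() drains."""

    def __init__(self):
        self._ready = deque()

    def call_soon(self, callback, *args):
        self._ready.append((callback, args))

    def run_pending(self):
        while self._ready:
            callback, args = self._ready.popleft()
            callback(*args)


class Protocol:
    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport


class DeferredTransport:
    """Current shape: the protocol exists at once, connection_made() is scheduled."""

    def __init__(self, loop, protocol_factory):
        self.protocol = protocol_factory()
        loop.call_soon(self.protocol.connection_made, self)


class FactoryDeferredTransport:
    """Proposed shape kept behind call_soon(): no protocol until the callback runs."""

    def __init__(self, loop, protocol_factory):
        self.protocol = None
        loop.call_soon(self._make_protocol, protocol_factory)

    def _make_protocol(self, protocol_factory):
        self.protocol = protocol_factory(self)
```

In the second class the protocol attribute stays None until the loop runs the callback, which is the extra state the transport author would have to keep in mind.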

> Given that new protocol implementations should be significantly more common than new transport implementations, there's a strong case to be made for pushing any required complexity into the transports.

TBH I don't see the protocol implementation getting any simpler because of this. There is some protocol initialization code that doesn't depend on the transport, and some that does. Using your approach, these all go in __init__(). Using the PEP's current proposal, the latter go in a separate method, connection_made(). But using your approach, writing the lambda or partial function that calls the constructor with the right arguments (to be passed as protocol_factory) becomes a tad more complex, since now it must take a transport argument. On the third hand, rigging things so that a pre-existing protocol instance can be reused becomes a little harder to figure out, since you have to write a helper method that takes a transport and returns the protocol (i.e., self).
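The helper for reusing a pre-existing instance could look roughly like this, assuming the factory-takes-transport design. ReconnectingProtocol and its attach() method are hypothetical names chosen for the sketch.

```python
class ReconnectingProtocol:
    """Hypothetical protocol that outlives individual connections."""

    def __init__(self, name):
        # Transport-independent initialization happens once.
        self.name = name
        self.transport = None
        self.connections = 0

    def attach(self, transport):
        # Transport-dependent initialization; returns self so that the bound
        # method can serve as protocol_factory(transport).
        self.transport = transport
        self.connections += 1
        return self


persistent = ReconnectingProtocol("client")
protocol_factory = persistent.attach  # pass this where a factory is expected

first = protocol_factory(object())   # first connection
second = protocol_factory(object())  # reconnect reuses the same instance
```

The bound method plays the role of the factory, so each new connection gets the same protocol object back with a fresh transport.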

All in all I see it as six of one, half a dozen of the other, and I am happy with Glyph's testimony that the Twisted design works well in practice.

--
--Guido van Rossum (python.org/~guido)