(original) (raw)

On Wed, Dec 28, 2016 at 3:57 PM, Ivan Pozdeev <vano@mail.mipt.ru> wrote:

On 25.12.2016 17:21, Giampaolo Rodola' wrote:

From what I understand the problem should be fixed
in urllib so that it always closes the FTP connection. I
understand this is what happens in recent 3.x versions but not
on 2.7.

urllib in 2.7 should indeed be fixed to close FTP connections rather
than leave them dangling, no question about that.

To me it looks like this is the only issue (currently tracked in http://bugs.python.org/issue28931).

I tried to fix urllib so that it's guaranteed to call voidresp(),
and failed. Too complicated, effectively reimplementing a sizable
part of FTP client logic is required, perhaps even monkey-patching
ftplib to be able to set flags in the right places.

Why did you fail? Why urllib can't close() -> voidresp() the FTP session once it's done with it?

With simple commands (voidcmd) and
self-contained transfer commands (retrlines/retrbinary), ftplib does
incapsulate statefulness, by handling them atomically. But with
transfercmd(), it returns control to the user halfway through.

That's by design. ftplib's transfercmd() just works like that: it's a low level method returning a "data" socket and it's just up to you to cleanly close the session (close() -> voidresp()) once you're done with it. Most of the times you don't even want to use transfercmd() in the first place, as you just use stor* and retr* methods.

At this point, if ftplib doesn't itself keep track that the control
connection is currently in invalid state for the next command, the
user is forced to do that themselves instead.

Hence why I suggested a docfix.

And guess what -
urllib has to use ftplib through a wrapper class that does exactly
that. That's definitely a "stock logic part" 'cuz it's an integral
part of FTP rather than something specific to user logic.

That wrapper should just cleanly close the session.

The other "stock logic part" currently missing is error handling
during transfer. If the error's cause is local (like a local
exception that results in closing the data socket prematurely, or
the user closing it by hand, or an ABOR they sent midway), the error
code from the end-of-transfer response should be ignored, otherwise,
it shouldn't.

Absolutely not. Base ftplib should NOT take any deliberate decision such as ignoring an error.

I can even come up with use cases: use ftplib to test FTP servers so that they *do* reply with an error code in certain conditions. Here's a couple examples in pyftpdlib:

https://github.com/giampaolo/pyftpdlib/blob/17bebe3714752b01e43ab60d2771308f4594bd99/pyftpdlib/test/test_functional.py#L1354-L1369

https://github.com/giampaolo/pyftpdlib/blob/17bebe3714752b01e43ab60d2771308f4594bd99/pyftpdlib/test/test_functional.py#L1973-L1980

Other than making ftplib keep track of the session's internal state
while the user has control and providing the custom error handling
logic to them, another solution is to never release control to the
user midway, i.e. ban transfercmd() entirely and only provide
self-contained retr*()/dir()/whatever, but let the callback signal
them to abort the transfer. That way, the state would be kept
implicitly - in the execution pointer.

Banning transfercmd() means renaming it to _transfercmd() which is a no-no for backward compatibility. Also, as I've shown above, transfercmd() does have a use case which relies on it behaving like that (return control midway).

Giampaolo - http://grodola.blogspot.com