[Python-Dev] Asking for feedback about fixing `ftplib' (issue25458) (original) (raw)

Giampaolo Rodola' g.rodola at gmail.com
Wed Dec 28 13:02:49 EST 2016


On Wed, Dec 28, 2016 at 3:57 PM, Ivan Pozdeev <vano at mail.mipt.ru> wrote:

On 25.12.2016 17:21, Giampaolo Rodola' wrote:

From what I understand the problem should be fixed in urllib so that it always closes the FTP connection. I understand this is what happens in recent 3.x versions but not on 2.7. urllib in 2.7 should indeed be fixed to close FTP connections rather than leave them dangling, no question about that.

To me it looks like this is the only issue (currently tracked in http://bugs.python.org/issue28931).

I tried to fix urllib so that it's guaranteed to call voidresp(), and failed. Too complicated, effectively reimplementing a sizable part of FTP client logic is required, perhaps even monkey-patching ftplib to be able to set flags in the right places.

Why did you fail? Why urllib can't close() -> voidresp() the FTP session once it's done with it?

With simple commands (voidcmd) and self-contained transfer commands (retrlines/retrbinary), ftplib does incapsulate statefulness, by handling them atomically. But with transfercmd(), it returns control to the user halfway through.

That's by design. ftplib's transfercmd() just works like that: it's a low level method returning a "data" socket and it's just up to you to cleanly close the session (close() -> voidresp()) once you're done with it. Most of the times you don't even want to use transfercmd() in the first place, as you just use stor* and retr* methods.

At this point, if ftplib doesn't itself keep track that the control connection is currently in invalid state for the next command, the user is forced to do that themselves instead.

Hence why I suggested a docfix.

And guess what - urllib has to use ftplib through a wrapper class that does exactly that. That's definitely a "stock logic part" 'cuz it's an integral part of FTP rather than something specific to user logic.

That wrapper should just cleanly close the session.

The other "stock logic part" currently missing is error handling during transfer. If the error's cause is local (like a local exception that results in closing the data socket prematurely, or the user closing it by hand, or an ABOR they sent midway), the error code from the end-of-transfer response should be ignored, otherwise, it shouldn't.

Absolutely not. Base ftplib should NOT take any deliberate decision such as ignoring an error. I can even come up with use cases: use ftplib to test FTP servers so that they do reply with an error code in certain conditions. Here's a couple examples in pyftpdlib:

https://github.com/giampaolo/pyftpdlib/blob/17bebe3714752b01e43ab60d2771308f4594bd99/pyftpdlib/test/test_functional.py#L1354-L1369

https://github.com/giampaolo/pyftpdlib/blob/17bebe3714752b01e43ab60d2771308f4594bd99/pyftpdlib/test/test_functional.py#L1973-L1980

Other than making ftplib keep track of the session's internal state while the user has control and providing the custom error handling logic to them, another solution is to never release control to the user midway, i.e. ban transfercmd() entirely and only provide self-contained retr*()/dir()/whatever, but let the callback signal them to abort the transfer. That way, the state would be kept implicitly - in the execution pointer.

Banning transfercmd() means renaming it to _transfercmd() which is a no-no for backward compatibility. Also, as I've shown above, transfercmd() does have a use case which relies on it behaving like that (return control midway).

-- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20161228/ee564c2e/attachment.html>



More information about the Python-Dev mailing list