Message 276795 - Python tracker
I finally found the actual problem causing the second download to fail. urlretrieve() uses FTP in PASV mode, and in PASV mode, after sending the file to the client, the FTP server sends an ACK (the '226 Transfer complete.' reply) confirming that the file has been transferred. After the earlier fix, the socket was being closed without receiving this ACK.
Now, when a user tries to download the same file or another file from the same directory, the key (host, port, dirs) remains the same, so open_ftp() skips FTP initialization. Because of this, the previous FTP connection is reused, and when new commands are sent to the server, the first reply read is the stale ACK from the previous transfer. This causes a domino effect: each response is delayed by one, and we get an exception from parse227().
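To make the reuse concrete, here is a minimal sketch of a connection cache keyed on (host, port, dirs). It is only an illustration: ToyFTPConnection, ftp_cache, this open_ftp() and the host and paths are hypothetical stand-ins, not urllib's actual code.

```python
# Hypothetical stand-in for the connection cache; not urllib's actual code.
ftp_cache = {}


class ToyFTPConnection:
    def __init__(self, host, port, dirs):
        self.host, self.port, self.dirs = host, port, dirs
        self.transfers = 0  # number of files requested over this connection


def open_ftp(host, port, path):
    """Return the cached connection for the file's directory, creating it if needed."""
    *dirs, filename = path.strip("/").split("/")
    key = (host, port, "/".join(dirs))
    if key not in ftp_cache:
        # Initialization only happens the first time a key is seen.
        ftp_cache[key] = ToyFTPConnection(host, port, dirs)
    conn = ftp_cache[key]
    conn.transfers += 1
    return conn, filename


# Host and paths are made up for the example.
first, _ = open_ftp("ftp.example.org", 21, "/pub/dists/Contents-udeb-ppc64el.gz")
second, _ = open_ftp("ftp.example.org", 21, "/pub/dists/Contents-udeb-ppc64el.gz")
other, _ = open_ftp("ftp.example.org", 21, "/pub/other/README")
```

Here first and second are the same object, so anything left unread on that connection after the first transfer is still there when the second one starts.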
Expected response:
*cmd* 'RETR Contents-udeb-ppc64el.gz'
*resp* '150 Opening BINARY mode data connection for Contents-udeb-ppc64el.gz (26555 bytes).'
*resp* '226 Transfer complete.'
*cmd* 'TYPE I'
*resp* '200 Switching to Binary mode.'
*cmd* 'PASV'
*resp* '227 Entering Passive Mode (130,239,18,173,137,59).'
Actual response:
*cmd* 'RETR Contents-udeb-ppc64el.gz'
*resp* '150 Opening BINARY mode data connection for Contents-udeb-ppc64el.gz (26555 bytes).'
*cmd* 'TYPE I'
*resp* '226 Transfer complete.'
*cmd* 'PASV'
*resp* '200 Switching to Binary mode.'
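The shift in the two traces can be reproduced with a small simulation. This is a sketch under simple assumptions: the client reads exactly one reply per command, and replay() and try_parse() are hypothetical helpers. Only ftplib.parse227() and ftplib.error_reply are real ftplib names.

```python
from collections import deque
from ftplib import error_reply, parse227


def replay(commands, server_replies):
    """Pair each command with the next unread reply, one reply per command."""
    pending = deque(server_replies)
    return [(cmd, pending.popleft()) for cmd in commands]


def try_parse(resp):
    """Run parse227() and report the error instead of raising."""
    try:
        return parse227(resp)
    except error_reply as exc:
        return f"error_reply: {exc}"


commands = ["TYPE I", "PASV"]

# Replies the server sends for TYPE I and PASV on the second download.
fresh = [
    "200 Switching to Binary mode.",
    "227 Entering Passive Mode (130,239,18,173,137,59).",
]
# Same replies, but the 226 from the previous transfer was never read.
stale = ["226 Transfer complete."] + fresh

expected = replay(commands, fresh)
actual = replay(commands, stale)

for cmd, resp in actual:
    print(cmd, "->", resp)
print(try_parse(expected[1][1]))  # PASV paired with its own 227 reply
print(try_parse(actual[1][1]))    # PASV paired with the reply meant for TYPE I
```

With the stale 226 at the front of the queue, PASV is paired with '200 Switching to Binary mode.', which parse227() rejects because the reply does not start with 227.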
I am attaching a new patch (urllib.patch) which fixes this problem by clearing the FTP server responses first if an existing connection is being used to download a file. Please review and let me know if it looks good.
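For readers without the attachment, the idea of clearing leftover replies before reuse can be sketched on a toy model. This is not the code in urllib.patch: ToyControlChannel and its methods are hypothetical, and a real connection would have to read the pending reply from the socket instead (for example, ftplib's FTP.voidresp() reads one reply, and blocks if none is pending).

```python
from collections import deque


class ToyControlChannel:
    """Hypothetical stand-in for an FTP control connection."""

    def __init__(self):
        self.unread = deque()  # replies sent by the server but not yet read

    def server_sends(self, reply):
        self.unread.append(reply)

    def getresp(self):
        return self.unread.popleft()

    def drain(self):
        """Discard replies left over from an earlier transfer."""
        leftovers = list(self.unread)
        self.unread.clear()
        return leftovers

    def sendcmd(self, cmd, reply):
        # The toy server answers every command immediately with `reply`.
        self.server_sends(reply)
        return self.getresp()


chan = ToyControlChannel()
chan.server_sends("226 Transfer complete.")  # left unread by the first download

leftovers = chan.drain()  # clear stale replies before reusing the connection
type_resp = chan.sendcmd("TYPE I", "200 Switching to Binary mode.")
pasv_resp = chan.sendcmd("PASV", "227 Entering Passive Mode (130,239,18,173,137,59).")
```

After the drain, each command is paired with its own reply again, which matches the expected trace above.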