msg21873 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2004-07-31 01:07 |
This looks like a bug to me:

>>> import socket, httplib
>>> socket.setdefaulttimeout(10)
>>> httplib.HTTPConnection("www.python.org", 9999).connect()
[.... 10 second delay ....]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "e:\src\python-cvs\lib\httplib.py", line 548, in connect
    raise socket.error, msg
socket.timeout: timed out
>>>

On Linux, there is no significant delay, and the traceback reads:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.3/httplib.py", line 548, in connect
    raise socket.error, msg
socket.error: (111, 'Connection refused')

The Linux result is what I expected on Windows. Sockets aren't my strong point, so I'd prefer that someone confirm it is a real bug before I burn too much time on it. |
|
|
msg21874 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2004-08-01 23:38 |
Guido - it looks like this change was made by you in Rev 1.257. Can you please confirm that the new behaviour is not correct, and I will try to dig a little deeper. |
|
|
msg21875 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2004-08-02 00:31 |
I can confirm that Guido certainly didn't intend for a refused connection to wait for the timeout on Windows.

A problem is that the attempt to connect here isn't returning WSAECONNREFUSED on Windows, it's returning WSAEWOULDBLOCK. If you set the default timeout back to None, the attempt to connect *does* return WSAECONNREFUSED on Windows. But for whatever reason, the Windows implementation of sockets appears to turn that into WSAEWOULDBLOCK if (and only if) the socket is in non-blocking mode.

The problem then is trying to guess some way to figure out whether WSAEWOULDBLOCK on a Windows non-blocking socket connect *means* "there's no chance this will ever succeed" or "I can't connect immediately, but maybe I can later". It appears to mean both things.

Note this:

>>> s = socket.socket()
>>> s.setblocking(0)
>>> s.connect(("www.python.org", 9999))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1, in connect
socket.error: (10035, 'The socket operation could not complete without blocking')

Now at this point, the code essentially does this:

>>> select.select([], [s], [], 10.0)
([], [], [])
>>>

and select waits 10 seconds before returning. However, if we do this instead (I'm adding a non-empty "error/exception" list argument):

>>> select.select([], [s], [s], 10.0)
([], [], [<socket._socketobject object at 0x008EBA80>])
>>>

then it returns immediately, with the socket in the exception list. So that's a clue. How can we tell *what* error occurred? Hmm.

For the exception list, MS select docs say a socket will appear there when:

"If processing a connect call (nonblocking), connection attempt failed"

So the behavior so far matches the docs. Later it says

"""
If a socket is processing a connect call (nonblocking), failure of the connect attempt is indicated in exceptfds (application must then call getsockopt SO_ERROR to determine the error value to describe why the failure occurred). This document does not define which other errors will be included.
"""

So there you go: we have to add the socket to the select call's exception set. Then the select call won't wait forever. When it comes back, and there is an exception, we have to call getsockopt() with SO_ERROR to determine the cause. |
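
For readers following along, the approach described above can be sketched in Python. This is only an illustration under stated assumptions, not the C change made to socketmodule.c: the helper name `connect_with_timeout` and the `IN_PROGRESS` set are hypothetical, and the code is written for Python 3 rather than the 2.3-era interpreter used in the session above.

```python
import errno
import select
import socket

# Codes that mean "connect still in progress". WSAEWOULDBLOCK is only present
# in the errno module on Windows, hence the getattr fallback (an assumption
# made for portability of this sketch).
IN_PROGRESS = {
    errno.EINPROGRESS,
    errno.EWOULDBLOCK,
    getattr(errno, "WSAEWOULDBLOCK", errno.EWOULDBLOCK),
}


def connect_with_timeout(address, timeout):
    """Hypothetical probe: return 0 on success or an errno value on failure.

    Raises socket.timeout if nothing happens within `timeout` seconds.
    The socket is closed before returning; only the outcome is reported.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    try:
        err = s.connect_ex(address)
        if err == 0 or err not in IN_PROGRESS:
            return err  # connected, or failed, immediately
        # Listing the socket in the exception set as well lets select return
        # as soon as the connect fails on Windows, instead of sleeping for
        # the whole timeout.
        _, writable, failed = select.select([], [s], [s], timeout)
        if not writable and not failed:
            raise socket.timeout("timed out")
        # SO_ERROR is 0 after a successful connect, otherwise the reason.
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        s.close()
```

With a helper like this, a refused connection should come back promptly as an error code such as `errno.ECONNREFUSED` rather than surfacing as a timeout.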
|
|
msg21876 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2004-08-02 10:07 |
Thanks Tim! It looks like this patch works. |
|
|
msg21877 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2004-08-02 14:29 |
I suspect the patch is close but not quite there yet. I believe select will return 1 now if the socket is in *either* of the writable or exception sets upon select's return, so that the patch loses the distinction between "ok, we finally connected" and "oops -- we can't connect". If so, to untangle that we need to pass in *distinct* sets to select, and when the return is > 0 it's an error case if and only if FD_ISSET then says the socket is in the exception set. If the socket is in the writable set instead, then the connect succeeded. |
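
The distinction between the two sets can be illustrated with a small Python sketch. The function name `connect_outcome` and its string results are hypothetical, chosen only for this example. One assumption is added beyond what the message says: on POSIX systems a failed non-blocking connect is typically reported through the writable set rather than the exception set, so the sketch also consults SO_ERROR in the writable case to stay correct there.

```python
import select
import socket


def connect_outcome(sock, timeout):
    """Hypothetical classifier for a pending non-blocking connect on `sock`.

    Returns "connected", "failed", or "timeout".
    """
    # Separate result lists play the role of the distinct fd_sets passed to
    # select in C; membership tests stand in for FD_ISSET.
    _, writable, failed = select.select([], [sock], [sock], timeout)
    if sock in failed:
        # Windows reports a failed connect attempt in the exception set.
        return "failed"
    if sock in writable:
        # Assumption for portability: POSIX reports failure as "writable",
        # so SO_ERROR decides between success and failure here.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return "connected" if err == 0 else "failed"
    return "timeout"
```

A caller would put the socket in non-blocking mode, start the connect with `connect_ex`, and then pass the socket to `connect_outcome` along with the remaining timeout.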
|
|
msg21878 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2004-08-03 01:18 |
It is true that select does return 1 for either the "can't connect" or "connected" cases. In the "connected" case, the getsockopt() returns 0 - hence the function returns 0, and WSASetLastError(0) has been called. That is, as far as I can see and test, the code works for both success and failure. However, I do agree that the MS docs don't explicitly state anywhere that what I am doing is OK, so I'm attaching a new patch as you suggest, which also seems to work. |
|
|
msg21879 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2004-08-03 02:42 |
Ya, it's always a good idea to blindly do what I say. Marked accepted and assigned to you -- please check it in. One nit: the comment saying the socket is in the readable set should say that the socket is in the writable set. We passed NULL for the readable set. I don't know what's worse -- the socket API or the Outlook API, eh? |
|
|
msg21880 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2004-08-03 05:08 |
Thanks Tim!

Checking in socketmodule.c;
new revision: 1.297; previous revision: 1.296

Do we want this on the 2.3 branch? |
|
|
msg21881 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2004-08-03 14:03 |
Closed the report. It *is* a bug, so I hope the patch said "bugfix candidate". If not, you'll have to backport it yourself. |
|
|