msg15843 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-02 23:23 |
I have a situation where sock.connect() blocks in getaddrinfo(), but since the GIL isn't released around getaddrinfo() calls, other threads also block. |
|
|
msg15844 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2003-05-03 09:03 |
Logged In: YES user_id=21627 This is not a bug. getaddrinfo isn't thread-safe on some systems, so we must protect it from multiple access. |
|
|
msg15845 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-03 10:09 |
Logged In: YES user_id=92689 Isn't that an orthogonal issue? In that we should then use a lock for getaddrinfo(), yet do release the GIL. I find it unacceptable that my GUI main thread can hang for several seconds (or more) just by doing a sock.connect() in a different thread. Short of that, can we find out on _which_ platforms getaddrinfo() isn't thread-safe and work around that? |
|
|
msg15846 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2003-05-03 12:18 |
Logged In: YES user_id=21627 See python.org/sf/604210. Yes, using a separate lock would be possible. At a minimum, the BSD implementation and the getaddrinfo emulation are not thread-safe; there may be more. As I write in 604210: it would be fine if a separate lock is used only on the known-bad platforms, since the RFC states getaddrinfo is thread-safe. Would you like to work on a patch? |
|
|
msg15847 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2003-05-03 12:30 |
Logged In: YES user_id=14198 How about simply not releasing the GIL on these platforms? Seems simpler than a new lock. |
|
|
msg15848 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-03 13:40 |
Logged In: YES user_id=92689 Since my platform is BSD-ish, I want a lock ;-) Yes, I would like to work on a patch, except I will need some hand-holding, as I've never done anything with threads from C. Can you point me to examples of such a lock in the Python source code? |
|
|
msg15849 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-03 13:56 |
Logged In: YES user_id=92689 (Never mind, I found plenty of examples in sockemodule.c itself.) |
|
|
msg15850 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2003-05-03 14:00 |
Logged In: YES user_id=21627 Mark: the original complaint came from BSD users. So if we only release the GIL on a "good" platform, the BSD users aren't helped. Likewise, unless compiled by VC7.x, Windows users are not helped, since Python would use the getaddrinfo emulation. So if this as attempted, I think we need a best-effort solution for all systems. You could look at the Tcl lock. I recommend you look at the 2.2 sources, since the Tcl lock logic got more complicated recently (as Tcl now also has a multi-threaded mode of operation). If your platform is Mac OS X, don't trust that it is too BSDish in this respect. If both FreeBSD and NeXT had some function, Apple apparently has always chosen the NeXT version. getaddrinfo was one such case. But then, it appears that the NeXT version is even worse than the FreeBSD version. |
|
|
msg15851 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-03 18:02 |
Logged In: YES user_id=92689 I've attached a first cut of a patch. It implements wrappers for getaddrinfo and getnameinfo, to take the burden of dealing with the lock and the GIL away from the caller. As suggested by MvL in prvt mail, I've simply reused the existing gethostbyname lock, but renamed it to netdb_lock. What needs to be reviewed carefully is when exactly USE_NETDB_LOCK (formerly known as USE_GETHOSTBYNAME_LOCK) is defined. It happens to be defined for my platform, so it "works for me". |
|
|
msg15852 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2003-05-03 21:45 |
Logged In: YES user_id=21627 The patch is incorrect: usage of locked getaddrinfo depends on whether gethostbyname_r is supported on the platform. However, this is unrelated: A platform may have gethostbyname_r, and still not provide a thread-safe getaddrinfo, or it may not have gethostbyname_r, and yet provide a thread-safe getaddrinfo. I suggest you separate issues: 1. Should the lock variable be declared and initialized? You have two options: a) only create if needed (see below), or b) always create if Python is threaded 2. Should getaddrinfo use the lock? In general, it shouldn't, except for a few known-bad systems. 3. Should gethostbyname use the lock? Yes, unless the reentrant versions of this function are used. Notice that for Windows, gethostbyname itself is thread-safe, as it allocates a thread-local hostent. I think the lock management code is also incorrect: There is a certain moment after getaddrinfo returns where neither the GIL nor the getaddrinfo lock is held, yet Python accesses the getaddrinfo result afterwards. If getaddrinfo returns a pointer to global data, a second call could modify the result even though we have not processed it yet. I'm not sure how to solve this. It is probably safest to keep the getaddrinfo lock until freeaddrinfo has been called. Also, for systems that don't lock getaddrinfo, I would prefer if py_getaddrinfo was a define for getaddrinfo. |
|
|
msg15853 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-06 20:51 |
Logged In: YES user_id=92689 Here's a new attempt. Martin and I went back and forth a few times, and what I now ended up with isn't all _that_ different from the initial patch, except it no longer uses wrapped functions. |
|
|
msg15854 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2003-05-09 07:44 |
Logged In: YES user_id=21627 The patch looks fine, and works on Linux for me. Please apply it. |
|
|
msg15855 - (view) |
Author: Just van Rossum (jvr) *  |
Date: 2003-05-09 07:54 |
Logged In: YES user_id=92689 Thanks! Applied as socketmodule.c rev. 1.265. |
|
|