Issue 731644: GIL not released around getaddrinfo() (original) (raw)

Created on 2003-05-02 23:23 by jvr, last changed 2022-04-10 16:08 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
getaddrinfo_gil.patch jvr,2003-05-03 18:02 getaddrinfo vs. the GIL path #1
getaddrinfo_gil.patch jvr,2003-05-06 20:51 getaddrinfo vs. the GIL path #2
Messages (13)
msg15843 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-02 23:23
I have a situation where sock.connect() blocks in getaddrinfo(), but since the GIL isn't released around getaddrinfo() calls, other threads also block.
msg15844 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 09:03
Logged In: YES user_id=21627 This is not a bug. getaddrinfo isn't thread-safe on some systems, so we must protect it from multiple access.
msg15845 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 10:09
Logged In: YES user_id=92689 Isn't that an orthogonal issue? In that we should then use a lock for getaddrinfo(), yet do release the GIL. I find it unacceptable that my GUI main thread can hang for several seconds (or more) just by doing a sock.connect() in a different thread. Short of that, can we find out on _which_ platforms getaddrinfo() isn't thread-safe and work around that?
msg15846 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 12:18
Logged In: YES user_id=21627 See python.org/sf/604210. Yes, using a separate lock would be possible. At a minimum, the BSD implementation and the getaddrinfo emulation are not thread-safe; there may be more. As I write in 604210: it would be fine if a separate lock is used only on the known-bad platforms, since the RFC states getaddrinfo is thread-safe. Would you like to work on a patch?
msg15847 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2003-05-03 12:30
Logged In: YES user_id=14198 How about simply not releasing the GIL on these platforms? Seems simpler than a new lock.
msg15848 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 13:40
Logged In: YES user_id=92689 Since my platform is BSD-ish, I want a lock ;-) Yes, I would like to work on a patch, except I will need some hand-holding, as I've never done anything with threads from C. Can you point me to examples of such a lock in the Python source code?
msg15849 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 13:56
Logged In: YES user_id=92689 (Never mind, I found plenty of examples in sockemodule.c itself.)
msg15850 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 14:00
Logged In: YES user_id=21627 Mark: the original complaint came from BSD users. So if we only release the GIL on a "good" platform, the BSD users aren't helped. Likewise, unless compiled by VC7.x, Windows users are not helped, since Python would use the getaddrinfo emulation. So if this as attempted, I think we need a best-effort solution for all systems. You could look at the Tcl lock. I recommend you look at the 2.2 sources, since the Tcl lock logic got more complicated recently (as Tcl now also has a multi-threaded mode of operation). If your platform is Mac OS X, don't trust that it is too BSDish in this respect. If both FreeBSD and NeXT had some function, Apple apparently has always chosen the NeXT version. getaddrinfo was one such case. But then, it appears that the NeXT version is even worse than the FreeBSD version.
msg15851 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 18:02
Logged In: YES user_id=92689 I've attached a first cut of a patch. It implements wrappers for getaddrinfo and getnameinfo, to take the burden of dealing with the lock and the GIL away from the caller. As suggested by MvL in prvt mail, I've simply reused the existing gethostbyname lock, but renamed it to netdb_lock. What needs to be reviewed carefully is when exactly USE_NETDB_LOCK (formerly known as USE_GETHOSTBYNAME_LOCK) is defined. It happens to be defined for my platform, so it "works for me".
msg15852 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 21:45
Logged In: YES user_id=21627 The patch is incorrect: usage of locked getaddrinfo depends on whether gethostbyname_r is supported on the platform. However, this is unrelated: A platform may have gethostbyname_r, and still not provide a thread-safe getaddrinfo, or it may not have gethostbyname_r, and yet provide a thread-safe getaddrinfo. I suggest you separate issues: 1. Should the lock variable be declared and initialized? You have two options: a) only create if needed (see below), or b) always create if Python is threaded 2. Should getaddrinfo use the lock? In general, it shouldn't, except for a few known-bad systems. 3. Should gethostbyname use the lock? Yes, unless the reentrant versions of this function are used. Notice that for Windows, gethostbyname itself is thread-safe, as it allocates a thread-local hostent. I think the lock management code is also incorrect: There is a certain moment after getaddrinfo returns where neither the GIL nor the getaddrinfo lock is held, yet Python accesses the getaddrinfo result afterwards. If getaddrinfo returns a pointer to global data, a second call could modify the result even though we have not processed it yet. I'm not sure how to solve this. It is probably safest to keep the getaddrinfo lock until freeaddrinfo has been called. Also, for systems that don't lock getaddrinfo, I would prefer if py_getaddrinfo was a define for getaddrinfo.
msg15853 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-06 20:51
Logged In: YES user_id=92689 Here's a new attempt. Martin and I went back and forth a few times, and what I now ended up with isn't all _that_ different from the initial patch, except it no longer uses wrapped functions.
msg15854 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-09 07:44
Logged In: YES user_id=21627 The patch looks fine, and works on Linux for me. Please apply it.
msg15855 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-09 07:54
Logged In: YES user_id=92689 Thanks! Applied as socketmodule.c rev. 1.265.
History
Date User Action Args
2022-04-10 16:08:33 admin set github: 38435
2003-05-02 23:23:59 jvr create