[Python-Dev] The endless GIL debate: why not remove thread support instead?
Sturla Molden sturla at molden.no
Fri Dec 12 02:13:13 CET 2008
Last month there was a discussion on Python-Dev about removing reference counting in order to remove the GIL. I hope you will forgive me for continuing the debate.
I think reference counting is a good feature. It prevents huge piles of garbage from building up. It makes the interpreter run more smoothly. It is not just important for games and multimedia applications, but also servers under high load. Python does not pause to look for garbage like Java or .NET. It only pauses to look for dead reference cycles. This can be safely turned off temporarily; it can be turned off completely if you do not create reference cycles. With Java and .NET, no garbage is ever reclaimed except by the intermittent garbage collection. Python always reclaims an object when the reference count drops to zero – whether the GC is enabled or not. This makes Python programs well-behaved. For this reason, I think removing reference counting is a genuinely bad idea. Even if the GIL is evil, this remedy is even worse.
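The behaviour described above can be checked with a minimal sketch, assuming CPython (other implementations may reclaim objects at a different time). The class name `Resource` and the variable names are made up for the illustration; only the `gc` and `weakref` standard-library modules are used.

```python
import gc
import weakref


class Resource:
    """Hypothetical placeholder object; only its lifetime matters here."""


gc.disable()  # switches off the automatic cycle detector only
try:
    obj = Resource()
    probe = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj                   # the reference count drops to zero here
    reclaimed_without_gc = probe() is None

    # A reference cycle, by contrast, waits for the cycle detector.
    a, b = Resource(), Resource()
    a.other, b.other = b, a
    cycle_probe = weakref.ref(a)
    del a, b
    cycle_alive_before_collect = cycle_probe() is not None
    gc.collect()  # an explicit collection still works while automatic GC is off
    cycle_alive_after_collect = cycle_probe() is not None
finally:
    gc.enable()

print(reclaimed_without_gc, cycle_alive_before_collect, cycle_alive_after_collect)
```

On CPython the first object is gone as soon as its last reference is deleted, while the two objects in the cycle stay around until `gc.collect()` is called.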
I am not a Python core developer; I am a research scientist who uses Python because Matlab is (or used to be) a bad programming language, albeit a good computing environment. As most people who have worked with scientific computing know, there are better paradigms for concurrency than threads. In particular, there are message-passing systems like MPI and Erlang, and there are autovectorizing compilers for OpenMP and Fortran 90/95. There are special LAPACK, BLAS and FFT libraries for parallel computer architectures. There are fork-join systems like Cilk and java.util.concurrent. Threads seem to be used only because mediocre programmers don't know what else to use.
I genuinely think the use of threads should be discouraged. It leads to code that is full of bugs and difficult to maintain - race conditions, deadlocks, and livelocks are common pitfalls. Very few developers are capable of implementing efficient load-balancing by hand. Multi-threaded programs tend to scale badly because they are badly written. If the GIL discourages the abuse of threads, it serves a purpose, even if it is as evil as the Linux kernel's BKL.
Python could be better off doing what Tcl does. Allow each process to embed multiple interpreters; run each interpreter in its own thread. Implement a fast message-passing system between the interpreters (e.g. copy-on-write by making communicated objects immutable), and Python would be closer to Erlang than Java.
I thus think the main offender is the thread and threading modules - not the GIL. Without thread support in the interpreter, there would be no threads. Without threads, there would be no need for a GIL. Both sources of evil can be removed by just removing thread support from the Python interpreter. In addition, it would make Python faster at executing linear code. Just copy the concurrency model of Erlang instead of Java and get rid of those nasty threads. In the meantime, I'll continue to experiment with multiprocessing.
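As a rough illustration of the message-passing style with multiprocessing, here is a minimal sketch. It is not an Erlang-like runtime and not multiple interpreters in one process; it only shows two processes that share nothing and exchange immutable tuples over queues. The names `worker` and `sum_of_squares` are hypothetical.

```python
import multiprocessing as mp


def worker(inbox, outbox):
    """Reply with the sum of squares of each tuple received; stop on None."""
    for chunk in iter(inbox.get, None):
        outbox.put(sum(x * x for x in chunk))


def sum_of_squares(chunks):
    """Send each chunk in the list to one worker process and collect the replies."""
    inbox, outbox = mp.Queue(), mp.Queue()
    proc = mp.Process(target=worker, args=(inbox, outbox))
    proc.start()
    for chunk in chunks:
        inbox.put(tuple(chunk))  # messages are immutable and copied, never shared
    inbox.put(None)              # sentinel: no more work
    results = [outbox.get() for _ in chunks]
    proc.join()
    return results
```

Calling `sum_of_squares([(1, 2, 3), (4, 5)])` under an `if __name__ == "__main__":` guard returns `[14, 41]`. With a single worker the replies come back in the order the chunks were sent; with several workers the messages would need a tag to be matched up again.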
Removing reference counting to encourage the use of threads is like shooting ourselves in the leg twice. That’s my two cents on this issue.
There is another issue to note as well: If you can endure a 200x loss of efficiency by using Python instead of Fortran, scalability on dual or quad-core processors may not be that important. Just move the bottlenecks out of Python and you are much better off.
Regards, Sturla Molden