[Python-Dev] Summing up
David Beazley dave at dabeaz.com
Wed May 19 02:35:48 CEST 2010
Antoine,
This is a pretty good summary that mirrors my thoughts on the GIL matter as well. In the big picture, I do think it's desirable for Python to address the multicore performance issue--namely to not have the performance needlessly thrashed in that environment. The original new GIL addressed this.
The I/O convoy effect problem is more subtle. Personally, I think it's an issue that at least merits further study because trying to overlap I/O with computation is a known programming technique that might be useful for people using Python to do message passing, distributed computation, etc. As an example, the multiprocessing module uses threads as part of its queue implementation. Is it impacted by convoying? I honestly don't know. I agree that getting some more real-world experience would be useful.
Cheers, Dave
From: Antoine Pitrou <solipsis at pitrou.net>
Ok, this is a good opportunity to try to sum up, from my point of view. The main problem of the old GIL, which was evidenced in Dave's original study (not this year's, but the previous one) is fixed unless someone demonstrates otherwise. It should be noted that witnessing a slight performance degradation on a multi-core machine is not enough to demonstrate such a thing. The degradation could be caused by other factors, such as thread migration, bad OS behaviour, or even locking peculiarities in your own application, which are not related to the GIL. A good test is whether performance improves if you play with sys.setswitchinterval().
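A minimal sketch of the "play with sys.setswitchinterval()" test mentioned above, assuming Python 3.2 or later (where the new GIL and this function exist). The names spin and timed_run, the thread count, the loop size, and the interval values are illustrative assumptions, not taken from any of the patches or studies discussed here. If the timings change markedly with the interval, the GIL is a plausible suspect; if they do not, the degradation more likely has another cause.

```python
import sys
import threading
import time


def spin(n):
    # Pure-Python CPU-bound loop; it only releases the GIL at switch points.
    while n > 0:
        n -= 1


def timed_run(interval, nthreads=2, n=200_000):
    """Run nthreads CPU-bound threads under the given switch interval
    and return the elapsed wall-clock time in seconds."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(interval)
    try:
        threads = [threading.Thread(target=spin, args=(n,))
                   for _ in range(nthreads)]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.perf_counter() - start
    finally:
        # Always restore the previous interval so other code is unaffected.
        sys.setswitchinterval(old)


if __name__ == "__main__":
    for interval in (0.0005, 0.005, 0.05):
        print(f"switch interval {interval:g}s: {timed_run(interval):.3f}s")
```

The same comparison is more meaningful when run against the real application workload rather than a synthetic loop like this one.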
Dave's newer study regards another issue, which I must stress is also present in the old GIL algorithm, and therefore must have affected, if it is serious, real-world applications in 2.x. And indeed, the test I recently added to ccbench evidences the huge drop in socket I/Os per second when there's a background CPU thread; this test exercises the same situation as Dave's demos, only with a less trivial CPU workload:

    == CPython 2.7b2+.0 (trunk:81274M) ==
    == x86_64 Linux on 'x86_64' ==

    --- I/O bandwidth ---

    Background CPU task: Pi calculation (Python)

    CPU threads=0: 23034.5 packets/s.
    CPU threads=1: 6.4 ( 0 %)
    CPU threads=2: 15.7 ( 0 %)
    CPU threads=3: 13.9 ( 0 %)
    CPU threads=4: 20.8 ( 0 %)

(note: I've just changed my desktop machine, so these figures are different from what I've posted weeks or months ago)

Regardless of the fact that apparently no one reported it in real-world conditions, we could decide that the issue needs fixing. If we decide so, Nir's approach is the most rigorous one: it tries to fix the problem thoroughly, rather than graft an additional heuristic. Nir also has tested his patch on a variety of machines, more so than Dave and I did with our own patches; he is obviously willing to go forward.

Right now, there are two problems with Nir's proposal:

- first, what Nick said: the difficulty of having reliable high-precision cross-platform time sources, which are necessary for the BFS algorithm. Ironically, timestamp counters have their own problems on multi-core machines (they can go out of sync between CPUs). gettimeofday() and clock_gettime() may be precise enough on most Unices, though.

- second, the BFS algorithm is not that well-studied, since AFAIK it was refused for inclusion in the Linux kernel; someone in the python-dev community would therefore have to make sense of, and evaluate, its heuristic.
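For readers who want to try the kind of I/O measurement quoted above, here is a rough sketch: it counts UDP round trips per second on localhost while a number of CPU-bound threads run in the background. This is not ccbench's code. It keeps client and server in one process, and the names cpu_task, echo_server and packets_per_second, the workload, and the timeouts are all assumptions chosen for illustration. Absolute numbers will differ from the figures above; only the trend between zero and one or more CPU threads is of interest.

```python
import socket
import threading
import time


def cpu_task(stop):
    # Pure-Python busy loop standing in for a CPU-bound background workload.
    x = 0
    while not stop.is_set():
        x = (x * 31 + 7) % 1000003


def echo_server(sock, stop):
    # Echo each datagram back to its sender until asked to stop.
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(1024)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def packets_per_second(cpu_threads, duration=1.0):
    """Return UDP round trips per second on localhost while
    cpu_threads CPU-bound threads run in the background."""
    stop = threading.Event()
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # let the OS pick a free port
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1.0)
    workers = [threading.Thread(target=echo_server, args=(server, stop))]
    workers += [threading.Thread(target=cpu_task, args=(stop,))
                for _ in range(cpu_threads)]
    for w in workers:
        w.start()
    count = 0
    try:
        deadline = time.perf_counter() + duration
        while time.perf_counter() < deadline:
            client.sendto(b"x", server.getsockname())
            try:
                client.recvfrom(1024)
                count += 1
            except socket.timeout:
                pass  # a lost or very late reply is simply not counted
    finally:
        stop.set()
        for w in workers:
            w.join()
        server.close()
        client.close()
    return count / duration


if __name__ == "__main__":
    for n in range(3):
        print(f"CPU threads={n}: {packets_per_second(n):.1f} packets/s")
```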
I also don't consider my own patch a very satisfactory "solution", although it has the reassuring quality of being simple and short (and easy to revert!).

That said, most of us are programmers and we love to invent ways of fixing technical issues. It sometimes leads us to consider some things issues even when they are mostly theoretical. This is why I am lukewarm on this. I think interested people should focus on real-world testing (rather than Dave's and my synthetic tests) of the new GIL, with or without the various patches, and share the results.

Otherwise, Dj Gilcrease's suggestion of waiting for third-party reports is also a very good one.

Regards

Antoine.