[Python-Dev] Should standard library modules optimize for CPython?

Stefan Behnel stefan_ml at behnel.de
Sun Jun 1 11:02:56 CEST 2014


Steven D'Aprano, 01.06.2014 10:11:

> Briefly, I have a choice of algorithm for the median function in the statistics module. If I target CPython, I will use a naive but simple O(N log N) implementation based on sorting the list and returning the middle item. (That's what the module currently does.) But if I target PyPy, I will use an O(N) algorithm which knocks the socks off the naive version even for smaller lists. In CPython that's typically 2-5 times slower; in PyPy it's typically 3-8 times faster, and the bigger the data set the more the advantage.
>
> For the specific details, see http://bugs.python.org/issue21592
>
> My feeling is that the CPython standard library should be written for CPython, that is, it should stick to the current naive implementation of median, and if PyPy wants to speed the function up, they can provide their own version of the module.

Note that if you compile the module with Cython, CPython benefits heavily from the new implementation too, by a factor of 2-5. So there isn't really a reason to choose between the two implementations because of the two runtimes: just use the new one for both and compile it for CPython. I added the necessary bits to the ticket.
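
For anyone who wants to try the comparison themselves, here is a minimal sketch of the two kinds of algorithm under discussion: a sort-based median and a selection-based median with expected O(N) running time. The function names and the particular quickselect variant are assumptions made for illustration; this is not the code attached to the ticket, and it ignores corner cases such as NaN values.

```python
import random


def median_sorted(data):
    """Naive O(N log N) median: sort, then take the middle item(s)."""
    values = sorted(data)
    n = len(values)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2


def select(values, k):
    """Return the k-th smallest item (0-based) in expected O(N) time.

    Hypothetical helper: a simple quickselect with a random pivot that
    builds new lists instead of partitioning in place.
    """
    values = list(values)
    while True:
        pivot = random.choice(values)
        lows = [v for v in values if v < pivot]
        highs = [v for v in values if v > pivot]
        equal = len(values) - len(lows) - len(highs)  # items equal to pivot
        if k < len(lows):
            values = lows
        elif k < len(lows) + equal:
            return pivot
        else:
            k -= len(lows) + equal
            values = highs


def median_select(data):
    """Median via selection; avoids sorting the whole data set."""
    values = list(data)
    n = len(values)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return select(values, mid)
    return (select(values, mid - 1) + select(values, mid)) / 2
```

Both functions should return the same result for the same input, so timing them against each other on each runtime (for instance with the timeit module) is a fair way to reproduce the kind of measurement quoted above.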

Stefan


