[Python-Dev] explanations for more pybench slowdowns
Guido van Rossum guido@digicool.com
Fri, 18 May 2001 17:58:25 -0400
The scary thing about BuiltinFunctionCalls is that the profiler shows it spending almost 30% of its time in PyArg_ParseTuple(). It certainly is a shame that we have this complicated, slow run-time parsing mechanism to deal with a static property of the code, namely how many arguments it takes and what their types are.
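To make the cost concrete, here is a minimal sketch of a typical builtin-style function (the demo_add name and method table are invented for illustration). The "ii" format string is re-interpreted on every single call, even though it describes a property of the function that never changes:

    #include <Python.h>

    /* Hypothetical builtin: the "ii" format string below is parsed
     * character by character on every call, even though the arity and
     * types it encodes are fixed when the C code is written. */
    static PyObject *
    demo_add(PyObject *self, PyObject *args)
    {
        int a, b;

        if (!PyArg_ParseTuple(args, "ii:add", &a, &b))
            return NULL;                /* wrong arity or types */
        return PyInt_FromLong((long)a + (long)b);
    }

    static PyMethodDef demo_methods[] = {
        {"add", demo_add, METH_VARARGS, "add(a, b) -> a + b"},
        {NULL, NULL, 0, NULL}
    };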
I would love to see a mechanism whereby the signature of a C function could be stored as part of the static info about it, in an extension of the PyMethodDef structure: this would serve as documentation, allow for introspection, etc. I'm sure Ping would love this for pydoc and his inspect module.
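For illustration only -- the PyMethodDefEx name and ml_signature field below are invented, not a worked-out proposal -- such an extension might look like:

    /* Invented sketch: PyMethodDef plus a machine-readable signature
     * string, parsed once at module-init time instead of on every
     * call, and available to pydoc/inspect for introspection. */
    typedef struct {
        const char  *ml_name;       /* method name, as today */
        PyCFunction  ml_meth;       /* implementation, as today */
        int          ml_flags;      /* calling convention, as today */
        const char  *ml_doc;        /* docstring, as today */
        const char  *ml_signature;  /* NEW: e.g. "ii" for demo_add */
    } PyMethodDefEx;

    static PyMethodDefEx demo_methods_ex[] = {
        {"add", demo_add, METH_VARARGS, "add(a, b) -> a + b", "ii"},
        {NULL, NULL, 0, NULL, NULL}
    };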
But I'm not sure how much we can speed things up, unless we give up on the tuple interface (an argc/argv API could be much faster since usually the arguments are already on the frame's stack in this form).
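Roughly -- with an invented PyCFunctionFast type, since no such calling convention exists today -- the handler side could look like this, with the interpreter passing a pointer into its value stack plus a count instead of building a tuple:

    /* Invented argc/argv-style convention: args points into the
     * frame's value stack, nargs is the argument count; no argument
     * tuple is ever allocated or unpacked. */
    typedef PyObject *(*PyCFunctionFast)(PyObject *self,
                                         PyObject **args, int nargs);

    static PyObject *
    demo_add_fast(PyObject *self, PyObject **args, int nargs)
    {
        if (nargs != 2 || !PyInt_Check(args[0]) || !PyInt_Check(args[1])) {
            PyErr_SetString(PyExc_TypeError, "add() expects two ints");
            return NULL;
        }
        return PyInt_FromLong(PyInt_AS_LONG(args[0]) +
                              PyInt_AS_LONG(args[1]));
    }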
A few of the other tests, SimpleComplexArithmetic and CreateStringsWithConcat, are slower because of the new coercion logic. I didn't spend much time on SimpleComplexArithmetic, but I did look at CreateStringsWithConcat in some detail. The basic problem is that "ab" + "cd" gets compiled to BINARY_ADD, which in turn calls PyNumber_Add("ab", "cd"). This function tries all sorts of different ways to coerce the strings into addable numbers before giving up and trying sequence concat.
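In outline -- this is a simplified sketch, not the literal ceval.c/abstract.c code, and try_number_add_with_coercions is a stand-in for the real coercion machinery -- the dispatch order is:

    /* Sequence concat is only reached after the numeric coercion
     * machinery has given up, which is the wrong order for
     * string-heavy code. */
    static PyObject *
    binary_add_sketch(PyObject *v, PyObject *w)
    {
        PyObject *x;

        x = try_number_add_with_coercions(v, w);  /* hypothetical helper */
        if (x != NULL)
            return x;
        PyErr_Clear();
        if (v->ob_type->tp_as_sequence != NULL &&
            v->ob_type->tp_as_sequence->sq_concat != NULL)
            return (*v->ob_type->tp_as_sequence->sq_concat)(v, w);
        PyErr_SetString(PyExc_TypeError,
                        "unsupported operand types for +");
        return NULL;
    }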
It looks like the new coercion rules have optimized number ops at the expense of string ops. If you're writing programs with lots of numbers, you probably think that's peachy. If you're parsing HTML, perhaps you don't :-). I looked at the test suite to see how often PyNumber_Add() is called with non-number arguments. The answer is 77% of the time, but almost all of those calls come from test_unicodedata. If that one test is excluded, the majority of the calls (~90%) are with numbers. But the majority of those calls just come from a few tests -- test_pow, test_long, test_mutants, test_strftime. If I were to do something about the coercions, I would see if there was a way to quickly determine that PyNumber_Add() ain't gonna have any luck. Then we could bail out to things like string_concat more quickly.
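One cheap test -- a sketch, assuming that looking at the tp_as_number slots is a good-enough predictor, which classic instances with __coerce__ would complicate -- might be:

    /* Hypothetical early-out: if neither operand has any number
     * protocol at all, the coercion dance can't succeed, so we could
     * jump straight to sequence concatenation. */
    static int
    cannot_be_number_add(PyObject *v, PyObject *w)
    {
        return (v->ob_type->tp_as_number == NULL &&
                w->ob_type->tp_as_number == NULL);
    }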
There's already a special case for int+int in the BINARY_ADD opcode (otherwise you would probably see more numbers). Maybe another special case for str+str would help here?
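By analogy with that int+int shortcut -- this is a sketch of the eval loop's BINARY_ADD case, with the existing int branch abbreviated (the real one also checks for overflow):

    case BINARY_ADD:
        w = POP();
        v = POP();
        if (PyInt_Check(v) && PyInt_Check(w)) {
            /* existing fast path, abbreviated */
            x = PyInt_FromLong(PyInt_AS_LONG(v) + PyInt_AS_LONG(w));
        }
        else if (PyString_Check(v) && PyString_Check(w)) {
            /* proposed fast path: concatenate directly, skipping
             * PyNumber_Add's coercion attempts entirely */
            x = PySequence_Concat(v, w);
        }
        else
            x = PyNumber_Add(v, w);
        Py_DECREF(v);
        Py_DECREF(w);
        PUSH(x);
        if (x != NULL) continue;
        break;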
I also looked at SmallLists. It seems that the only significant change since 1.5.2 is the garbage collection. This test spends a lot more time deallocating lists than it used to, and the only change I see in the code is the GC. I assume, but haven't checked, that the story is similar for SmallTuples.
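The extra cost is easy to see in outline -- this sketch abbreviates the real listobject.c code and uses the GC API names loosely, so take the details with a grain of salt:

    /* Every GC-tracked container now has to unlink itself from the
     * collector's doubly-linked list of tracked objects before its
     * memory is released -- bookkeeping that 1.5.2 didn't do. */
    static void
    list_dealloc_sketch(PyListObject *op)
    {
        int i;

        PyObject_GC_UnTrack((PyObject *)op);    /* new cost vs. 1.5.2 */
        for (i = 0; i < op->ob_size; i++)
            Py_XDECREF(op->ob_item[i]);
        if (op->ob_item != NULL)
            PyMem_FREE(op->ob_item);
        PyObject_GC_Del((PyObject *)op);        /* GC-aware free */
    }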
So the primary things that have slowed down since 1.5.2 seem to be: comparisons, coercion, and memory management for containers. These also seem to be the things that have improved the most in terms of features, completeness, etc. Looks like we need to revisit them and sort out the performance issues.
Thanks for doing all this work, Jeremy!
I just hope that these performance hacks won't have to be redone when I'm done with healing the types/class split. I'm expecting that things can become a lot simpler if everything inherits from Object, sequences inherit from Sequence, and so on. But since I'm currently going slow on this work, I won't complain too much if the existing code gets optimized first. The stuff you already checked in looks good!
--Guido van Rossum (home page: http://www.python.org/~guido/)