[Python-Dev] Python startup time

Terry Reedy tjreedy at udel.edu
Thu Jul 20 02:20:39 EDT 2017

On 7/19/2017 10:05 AM, Nick Coghlan wrote:

> P.S. I'll also note that we're not actually limited to resolving such conflicts in public venues (even though I think that's a good default habit for us to retain): as long as we report the outcome of any mutual agreements about design priorities back to the relevant public venue (e.g. a tracker issue), there's nothing wrong with shifting our attempts to better understand each other's perspectives to private email, IRC, video chat, etc.

I expect and hope that there will be discussion of this issue at the core developer sprint in September, with summary reports back here on pydev.

> It can even make sense to reach out to other core devs for help, since it's almost always easier for someone not caught in the midst of an argument to see both sides of it, and potentially spot a core of agreement amidst various surface level disagreements :)

I always understood the Python development process, both for core and users, to be "Make it right; then make it faster", with the second clause conditioned on 'while keeping it right' and, especially for core development, perhaps also on 'if significantly slow'. (People can rightly work on speed of personal code for other reasons.) I believe we pretty much agree on the principles. The disagreement seems to be on whether a particular case is 'significantly slow'. I believe that the burden of proof is with those who propose a change.

The burden of proof depends on the final qualification: 'without adding unnecessary or extreme complexity'. If there is no added complication, the burden is slight. Otherwise, we will likely disagree about the complexity and its tradeoff with speed.

About 'keeping it right': It has been mentioned that more complicated code generally makes it harder to 'see' that the code is (basically) correct. The second line of defense is the automated test suite. I think, for instance, that someone interested in changing namedtuple (to a faster and presumably more complicated implementation) should check the coverage of the current code, with branches checked both ways. Then, bring the coverage up to 100% if it is not already there, and carefully check the tests for possible missing cases.

A small static set of test cases cannot cover everything. The third test of an implementation is accumulated user experience. A new implementation starts at 0. One way to increase that is to test the implementation with third-party code. Another, I think, is through randomized testing.

Proposal 1: Depending on our confidence in a new implementation, simulate user experience with randomized tests, perhaps running for hours. Example: we develop a random (unicode) identifier generator that starts with any of the legal initial codepoints and continues with a random number of legal follow codepoints. Then test the old and new namedtuple with random class names and a random number of random field names. A developer could also use third-party packages, like hypothesis. Code and a summary could be uploaded to bpo. A summary could even go in the code file.
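A rough sketch of what such a generator and test loop might look like follows. It is only an illustration under stated assumptions: random_identifier, check_factory and run_trials are hypothetical names, the generator picks codepoints by rejection sampling with str.isidentifier(), and the checks are a small sample of namedtuple behavior rather than a full specification. A candidate implementation would be passed in as the factory argument in place of collections.namedtuple.

```python
import keyword
import random
import sys
import unicodedata
from collections import namedtuple


def random_identifier(rng, max_len=6):
    """Return a random Unicode identifier usable as a namedtuple name (hypothetical helper)."""
    while True:
        length = rng.randint(1, max_len)
        name = ""
        while len(name) < length:
            ch = chr(rng.randrange(sys.maxunicode + 1))
            # Rejection sampling: keep ch only if the result is still an identifier.
            if (name + ch).isidentifier():
                name += ch
        # The compiler NFKC-normalizes identifiers, so normalize up front
        # and re-validate the normalized form.
        name = unicodedata.normalize("NFKC", name)
        if (name.isidentifier() and not keyword.iskeyword(name)
                and not name.startswith("_")):
            return name


def check_factory(factory, rng, max_fields=10):
    """Build one random class with factory and check a few basic properties."""
    typename = random_identifier(rng)
    fields = []
    for _ in range(rng.randint(1, max_fields)):
        name = random_identifier(rng)
        if name not in fields:  # field names must be unique
            fields.append(name)
    cls = factory(typename, fields)
    values = [rng.random() for _ in fields]
    obj = cls(*values)
    assert cls.__name__ == typename
    assert cls._fields == tuple(fields)
    assert tuple(obj) == tuple(values)
    for name, value in zip(fields, values):
        assert getattr(obj, name) == value
    assert obj._asdict() == dict(zip(fields, values))


def run_trials(factory, seed, trials=1000):
    """Run repeated random checks; the seed makes any failure reproducible."""
    rng = random.Random(seed)
    for _ in range(trials):
        check_factory(factory, rng)
    return trials


run_trials(namedtuple, seed=2017, trials=200)
```

Recording the seed with the summary lets anyone rerun exactly the inputs that were tested.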

Note 1: Tim Peters did something like this when developing timsort. He provided a nice summary of test cases and time results.

Note 2: Randomized tests require that either a) randomized inputs are verified by property or predicate, rather than by hard-coded values, or b) inputs are generated from outputs, where either the output or the inverse generation is randomized. Tests of sorting can use either is_sorted(list(sorted(random_input))) or list(sorted(random_shuffle(output))) == output.
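The two strategies in Note 2 could be sketched as below. The names is_sorted, check_by_property and check_by_inverse are illustrative, not existing library functions. The permutation check with Counter is an addition of this sketch: is_sorted alone would accept a "sort" that drops elements.

```python
import random
from collections import Counter


def is_sorted(items):
    """True if items are in non-decreasing order."""
    return all(a <= b for a, b in zip(items, items[1:]))


def check_by_property(rng, sort=sorted, max_len=50):
    """Strategy (a): random input, output verified by predicate."""
    data = [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]
    result = list(sort(data))
    assert is_sorted(result)
    assert Counter(result) == Counter(data)  # same elements, same counts


def check_by_inverse(rng, sort=sorted, max_len=50):
    """Strategy (b): build a known sorted output, shuffle it, sort it back."""
    expected = []
    value = rng.randint(-100, 100)
    for _ in range(rng.randint(0, max_len)):
        expected.append(value)
        value += rng.randint(0, 5)  # non-negative steps keep the list sorted
    shuffled = expected[:]
    rng.shuffle(shuffled)
    assert list(sort(shuffled)) == expected


rng = random.Random(2017)
for _ in range(1000):
    check_by_property(rng)
    check_by_inverse(rng)
```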

Proposal 2: Add randomized tests here and there in the test suite. Each randomized test x 30 buildbots x 2 runs/day x 365 days/year is about 22000 random inputs a year. Since each buildbot would be running a slightly different test, we need to act on and not ignore sporadic failures. Victor Stinner's buildbot work is making this feasible.
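For illustration, here is the arithmetic together with one possible shape for such a test. This is a sketch only: RandomizedSortedTest is a hypothetical test case, not something in the current suite, and putting the seed into subTest is just one way to make a sporadic buildbot failure reproducible.

```python
import random
import unittest

BUILDBOTS, RUNS_PER_DAY, DAYS_PER_YEAR = 30, 2, 365
# 30 * 2 * 365 = 21900, i.e. about 22000 random inputs per test per year.
INPUTS_PER_YEAR = BUILDBOTS * RUNS_PER_DAY * DAYS_PER_YEAR


class RandomizedSortedTest(unittest.TestCase):
    """Hypothetical randomized test: a fresh input on every run."""

    def test_random_input(self):
        seed = random.randrange(2**32)
        rng = random.Random(seed)
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 100))]
        # subTest includes the seed in any failure report, so a sporadic
        # failure can be reproduced later with random.Random(seed).
        with self.subTest(seed=seed):
            result = sorted(data)
            self.assertEqual(len(result), len(data))
            self.assertTrue(all(a <= b for a, b in zip(result, result[1:])))
```

A failure report would then carry the seed, which is the piece of information needed to act on it rather than dismiss it as noise.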

-- Terry Jan Reedy
