Combining SQL Coverage Data unacceptably slow · Issue #761 · nedbat/coveragepy

Describe the bug
Generating a combined coverage report is about 50-60 times slower with SQL-based coverage data than with JSON.

To Reproduce
Run coverage from the master branch with SQL-based data storage. You need a sizable set of coverage data to see the slowdown.

Additional context
We are combining about 100-200 coverage files into one large one. The resulting SQL coverage file is about 390MB. Combining those files used to take about 1 minute. Now it takes 50-60 minutes.

The reason for the inefficiency is obvious: the combine step goes through the regular coverage measurement recording APIs, which are inappropriate here because they cause one transaction per arc. SQL should be used properly, with bulk reads and inserts.
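
To illustrate the difference between the two approaches, here is a minimal sketch using Python's standard `sqlite3` module. The `arc` table layout and the function names (`make_db`, `insert_one_by_one`, `insert_bulk`, `count_arcs`) are hypothetical and chosen for illustration only; this is not coverage.py's actual schema or code.

```python
import sqlite3


def make_db(path=":memory:"):
    # Hypothetical schema for illustration; not coverage.py's real schema.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS arc ("
        " file_id INTEGER, fromno INTEGER, tono INTEGER,"
        " UNIQUE (file_id, fromno, tono))"
    )
    conn.commit()
    return conn


def insert_one_by_one(conn, arcs):
    # Slow pattern: every arc is committed in its own transaction.
    for arc in arcs:
        conn.execute("INSERT OR IGNORE INTO arc VALUES (?, ?, ?)", arc)
        conn.commit()


def insert_bulk(conn, arcs):
    # Bulk pattern: one transaction around a single executemany call.
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.executemany("INSERT OR IGNORE INTO arc VALUES (?, ?, ?)", arcs)


def count_arcs(conn):
    return conn.execute("SELECT COUNT(*) FROM arc").fetchone()[0]
```

Both functions leave the same rows in the table. With an on-disk database, each commit typically forces a sync to disk, so I would expect the per-arc version to be far slower; an in-memory database will not show the gap as clearly.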