lstat() dominates in the case of small coverage samples · Issue #625 · nedbat/coveragepy

@nedbat

Originally reported by Buck Evan (Bitbucket: bukzor, GitHub: bukzor)


The hypothesis library recently added coverage-led fuzzing, in which it needs to run a very short test many times, while examining the coverage between each trial. This (currently) involves many calls to coverage.Collector.save_data, which in turn causes many calls to realpath (and thus lstat). In the extreme case, lstat() ends up taking about 40% of the run time.

Can you please help me design a remedy? Some alternatives that I can think of:

  1. add a cache to files.abs_file
  2. replace the call to abs_file with a call to files.canonical_path, since canonical_path already has a cache
  3. delegate the filename-normalization responsibility from Collector to CoverageData, so that we can specialize CoverageData and fix this within our dependent library.
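
To make alternative 1 concrete, here is a minimal sketch of the caching idea. It is not coverage.py's actual code: the `abs_file` below is a simplified stand-in that I am assuming only needs `os.path.realpath` and `os.path.abspath`, and `cached_abs_file` and the `calls` counter are hypothetical names used for illustration.

```python
import functools
import os

# Counter used only to show how often the expensive path is taken.
calls = {"realpath": 0}


def abs_file(filename):
    """Simplified stand-in for a filename normalizer (assumption, not the real one).

    os.path.realpath resolves symlinks, which is where the lstat() calls come from.
    """
    calls["realpath"] += 1
    return os.path.abspath(os.path.realpath(filename))


@functools.lru_cache(maxsize=None)
def cached_abs_file(filename):
    """Memoized wrapper: each distinct filename is normalized at most once."""
    return abs_file(filename)


# Simulate many short trials that keep reporting the same file.
for _ in range(1000):
    cached_abs_file("example.py")
```

With the cache in place, the stand-in normalizer runs once for the 1000 lookups instead of 1000 times. The trade-off is that cached results go stale if the working directory or a symlink changes during the run, so the cache would need to be cleared in those cases (`cached_abs_file.cache_clear()`).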