Add Submodules Fuzz Target by DaveLak · Pull Request #1919 · gitpython-developers/GitPython
Please don't apologize, and definitely do not hesitate to reject or push back on any of my PRs! (especially considering that my last few PRs came out of the blue without prior discussion about whether they're even wanted -- sorry about that 😅)
Part of me thinks that the submodule implementation is so riddled with inaccuracies and incorrectness that fuzzing it seems like a waste. The fuzzer can only try to find unexpected exceptions, and maybe that's a small win, but at what cost?
Part of that feeling also stems from the incredible sluggishness of Python in general, so any fuzzing feels wasteful.
I think your points are perfectly reasonable. Here is how I've been thinking of the value in fuzzing GitPython:
- Hopelessly broken or not, GitPython is widely used, so I tend to believe (perhaps too optimistically) that even small wins which result in improved stability can have an outsized impact for some *n* of users over time. Doing the wrong thing right is better than doing it wrong + unexpected behavior someone will eventually need to debug. That said, I fully respect and understand if you don't think the value justifies the CPU cycles, maintenance burden, or any other reason for that matter.
- Similar to the above, I believe continuous fuzzing is well-positioned to identify regressions that traditional unit tests may miss. Even without a CI action run on PRs, the accumulated corpus in ClusterFuzz should be able to identify unexpected exceptions reasonably quickly after they're introduced. But, of course, that is just my hypothesis, which has yet to be validated.
- Finally, perhaps from a somewhat naive perspective, I believe that incorporating well-documented and effective fuzzing into GitPython has the potential to benefit the wider Python community. Now, I fully recognize how presumptuous that sounds, but hear me out. Historically, fuzzing has been used by security experts on lower-level languages with great success, but a lack of easy-to-use tooling, terse documentation, and few quality examples to emulate have hindered wider adoption in higher-level languages like Python. So I think integrating fuzz testing into a widely used and complex Python project like GitPython, paired with some quality documentation, can lower the barrier to entry for folks who may be seeking to learn more or just happen to stumble across it (like I did lol.)
But that's beside the point I suppose, apologies for the ramblings.
I think everything you said is very much on-point regarding any of the fuzzing work in this repo. Moreover, I really appreciate hearing your thoughts, so thanks!
In case it isn't clear, I won't be offended if you feel the juice isn't worth the squeeze and would rather I hold off on any non-maintenance type fuzzing work. Frankly, if you decided you'd rather have it all removed ASAP, I'd help remove it. I've learned a lot about Git, Python, fuzzing, and more working on these, so I wouldn't consider it a wasted effort even if the changes never made it to a PR. So thanks, @Byron, for the support along the way! 🙂
And now, it's my turn to apologize for the ramblings 😅