The Software Sins of Bloat and Debt – Communications of the ACM
In my computing ethics class, I pose the question of what ethical issues we computer scientists face that other professions do not; an answer would meet our definition of professional ethics. Most professions pledge to uphold the public good, and most professionals expect to take more trouble with their projects than is visible to outsiders, assuming a burden of quality for its own sake. These general standards, and some more specific ones, are outlined in the ACM Code, of course, but we might seek more essential challenges, essentials that distinguish our ethical obligations from those of others. Here’s the question: What are the measures taken, for a client, by a conscientious computer scientist, even though they may go uncredited?
In striving to isolate the unique ethical issues of the computing trades, we see that many are shared with engineering, some with business, and some spill over into social issues of all kinds. Business issues associated with computing take the forms of business models such as websites that charge for posting mug shots, and also for removing mug shots (scraped from public data). But those are not due to the nature of computing. Social issues associated with computing include cyberabuse and other forms of harassment and malicious actions, but those are not unique to the computing milieu.
What about engineering? We can construe programming as a form of engineering, as it calls for the disciplined construction of an object under constraints of both the methodology and the specification. And the prominent ethical mandate of such projects is safety. The resulting object must be, above all other criteria, safe for the public.
Let’s find the ethical manifestations of programming that affect public safety. Other engineering disciplines face the same general safety mandate, of course; the products and activities of engineering affect real lives. So what are our special responsibilities? I offer three issues of program quality familiar to experienced coders.
Accuracy
First is accuracy, as noted in a previous piece [Hill2019]. Although other engineering activities require accuracy, we produce accuracy. We certainly should, anyway; they use our output. This seems obvious. We perform computations, so any incorrect result (which appears from time to time) is our greatest failure.
Technical Debt
Code built on expedient short-term techniques instead of the best long-term approach holds technical debt. This includes ad hoc conditionals that are badly nested, confusing variable identifiers, and tortuous execution paths. (An earlier piece described this problem [Hill2017] under the term “software neglect”.) Technical debt is unlike other ethical violations: it is fully detectable (no unpredictable materials or physics or weather), but unreasonably difficult for humans to fix. Entire conferences are devoted to this problem [ConfTechDebt].
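As a small illustration of what such debt can look like, here is a sketch in Python. Every name in it (chk, discounted_price, DISCOUNT_BY_TIER) and the discount rule itself are hypothetical, invented for this example rather than taken from any real system.

```python
# Expedient version: opaque identifiers and ad hoc nested conditionals.
def chk(a, b, f):
    if a:
        if b > 0:
            if f == 1:
                return b * 0.9
            else:
                if f == 2:
                    return b * 0.8
                else:
                    return b
        else:
            return 0
    else:
        return 0


# The same behavior with the debt paid down: descriptive names,
# one guard clause, and the tier rules kept in a single table.
DISCOUNT_BY_TIER = {1: 0.9, 2: 0.8}  # hypothetical tiers


def discounted_price(is_active: bool, price: float, tier: int) -> float:
    if not is_active or price <= 0:
        return 0
    # Unknown tiers get no discount (factor 1.0), as in chk above.
    return price * DISCOUNT_BY_TIER.get(tier, 1.0)
```

Both functions return the same results for the same inputs, so a test suite alone would not tell them apart; the difference shows only to someone who has to read or change the code.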
Materials and structures in other engineering products may be suboptimal, but that circumstance will violate known standards. Program coding deemed collaborative in industry is still solitary and modular, so suboptimal choices are left to the programmer and often escape review, while testing to known standards may clear those code defects for distribution. In other words, the system works in all conceivable ways, which means that for all practical purposes, it works. Many practitioners, while unsure about the prevalence of the problem, recognize its danger [Recupito2024].
Code Bloat
We use this term informally to refer to the voluminous application programs promoted and sold by large software firms, but here we mean the inclusion of pre-written libraries with hundreds of code modules that incorporate functionality unnecessary to the current purpose [Hubert2024]. A developer, for example, might load a large library of search functions and call seqsearch() instead of writing a five-line sequential search in her own source code. While Hubert’s worry in the cited paper focuses on security (such libraries might inject malware), those library modules might also violate other quality standards. When software fails, the victims may not care whether the failure was malicious, accidental, or committed with the intention of future rectification.
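For a sense of scale, a hand-written sequential search really is about five lines. The following is a minimal Python sketch; the name sequential_search is illustrative only and does not correspond to any particular library's seqsearch().

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def sequential_search(items: Sequence[T], target: T) -> Optional[int]:
    """Return the index of the first element equal to target, or None."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None
```

A call such as sequential_search(["ada", "grace", "edsger"], "grace") returns 1, and a search for an absent value returns None, with no dependency beyond the language's standard library.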
Other engineering disciplines might encounter this problem of extravagance due to expedience, but their materials are under the control of a budget, which keeps this type of waste from reaching the scale easily attained in coding.
What makes these last two transgressions so troubling is that they are easily concealed, deliberately or not; they are issues of quality that are routinely not known to anyone but the coder. That’s a reflection of professional ethics; professionals are autonomous and self-governing, and hence must hold themselves to standards rather than rely on external monitoring. Note that this instructor would like to include program documentation in this list, but views that sadly out-of-date expectation as a different topic.
Other engineering disciplines do not encounter these circumstances in the same way. Engineers meet publicly known standards for their constructions, as proxies for a complete guarantee of safety which is impossible to achieve. So no general tradition of normative defense against these hidden qualitative flaws has developed in the engineering profession. Reading the Engineer’s Creed and the Engineer’s Code of Ethics reveals no analogous concerns. (But engineers are invited to say differently!)
Hidden flaws in code may not manifest at all. It’s not an inevitability, but a risk that could emerge as production software executes, pushes results into the world, and requires regular maintenance. Both technical debt and code bloat (not to mention inaccuracy) harbor future problems at a level of danger that we cannot measure and, significantly, both are threats to evaluative standards and their effective application.
We might ask whether code bloat is an ethical violation in itself, as the violation seems to be waste, and we might doubt whether waste is a problem in modern computing. To critique this practice beyond the risks of unknown code quality and security is an interesting challenge: Should we expect some sort of frugality in coding, as a virtue in itself? Or as a means of saving the energy burned by CPU cycles and cloud storage?
The larger lesson is that software is fallible, in ways that are not always apparent in the development and marketing of new paradigms and products, as well as ways that are well understood but wearisome. The effort required to rectify code bloat and technical debt is substantial. That such effort will be neglected, or that failure of AI methods to detect them will be regarded as precluding decent evaluative standards, is a serious danger.
References
[ConfTechDebt] IEEE and ACM. 2021. Proceedings of the 2021 IEEE/ACM International Conference on Technical Debt (TechDebt). IEEE, Curran Associates. Table of Contents at DOI: 10.1109/TechDebt52882.2021
[Hill2017] Robin K. Hill. 2017. The Ethical Problem of Software Neglect. BLOG@CACM, May 31, 2017.
[Hill2019] Robin K. Hill. 2019. Voting, Coding, and the Code. BLOG@CACM, November 27, 2019.
[Hubert2024] Bert Hubert. 2024. Why Bloat Is Still Software’s Biggest Vulnerability. IEEE Spectrum (2024).
[Recupito2024] Gilberto Recupito et al. 2024. Technical debt in AI-enabled systems: On the prevalence, severity, impact, and management strategies for code and architecture. Journal of Systems and Software 216 (2024).
Robin K. Hill is a lecturer in the Department of Computer Science and an affiliate of both the Department of Philosophy and Religious Studies and the Wyoming Institute for Humanities Research at the University of Wyoming. She has been a member of ACM since 1978.