What is spaghetti code?
This article is a repost of Michael O’Church’s post. He points out some of the issues with the so-called “spaghetti code” and the solutions that have come into play to reduce it. As always, he presents a very well thought out argument and a more well rounded answer to what is a rather difficult subject in the world of programming.
One of the easiest ways for an epithet to lose its value is for it to become over-broad, which causes it to mean little more than “I don’t like this”. Case in point is the term,“spaghetti code”, which people often use interchangeably with “bad code”. The problem is that not all bad code is spaghetti code. Spaghetti code is an especially virulent but specific kind of bad code, and its particular badness is instructive in how we develop software. Why? Because individual people rarely write spaghetti code on their own. Rather, certain styles of development process make it increasingly common as time passes. In order to assess this, it’s important first to address the original context in which “spaghetti code” was defined: the dreaded (and mostly archaic) goto statement.
The goto statement is a simple and powerful control flow mechanism: jump to another point in the code. It’s what a compiled assembly program actually does in order to transfer control, even if the source code is written using more modern structures like loops and functions. Using goto, one can implement whatever control flows one needs. We also generally agree, in 2012, that goto is flat-out inappropriate for source code in most modern programs. Exceptions to this policy exist, but they’re extremely rare. Most modern languages don’t even have it.
Goto statements can make it difficult to reason about code, because if control can bounce about a program, one cannot make guarantees about what state a program is in when it executes a specific piece of code. Goto-based programs can’t easily be broken down into component pieces, because any point in the code can be wormholed to any other. Instead, they devolve into an “everything is everywhere” mess where to understand a piece of the program requires understanding all of it, and the latter becomes flat-out impossible for large programs. Hence the comparison to spaghetti, where following one thread (or noodle) often involves navigating through a large tangle of pasta. You can’t look at a bowl of noodles and see which end connects to which. You’d have to laboriously untangle it.
Spaghetti code is code where “everything is everywhere”, and in which answering simple questions such as (a) where a certain piece of functionality is implemented, (b) determining where an object is instantiated and how to create it, and (c) assessing a critical section for correctness, just to name a few examples of questions one might want to ask about code, require understanding the whole program, because of the relentless pinging about the source code that answer simple questions requires. It’s code that is incomprehensible unless one has the discipline to follow each noodle through from one side to the other. That is spaghetti code.
What makes spaghetti code dangerous is that it, unlike other species of bad code, seems to be a common byproduct of software entropy. If code is properly modular but some modules are of low quality, people will fix the bad components if those are important to them. Bad or failed or buggy or slow implementations can be replaced with correct ones while using the same interface. It’s also, frankly, just much easier to define correctness (which one must do in order to have a firm sense of what “a bug” is) over small, independent functions than over a giant codeball designed to do too much stuff. Spaghetti code is evil because (a) it’s a very common subcase of bad code, (b) it’s almost impossible to fix without causing changes in functionality, which will be treated as breakage if people depend on the old behavior (potentially by abusing “sleep” methods, thus letting a performance improvement cause seemingly unrelated bugs!) and (c) it seems, for reasons I’ll get to later, not to be preventable through typical review processes.
The reason I consider it important to differentiate spaghetti code from the superset, “bad code”, is that I think a lot of what makes “bad code” is subjective. A lot of the conflict and flat-out incivility in software collaboration (or the lack thereof) seems to result from the predominantly male tendency to lash out in the face of unskilled creativity (or a perception of such, and in code this is often an extremely biased perception): to beat the pretender to alpha status so badly that he stops pestering us with his incompetent displays. The problem with this behavior pattern is that, well, it’s not useful and it rarely makes people better at what they’re trying to do. It’s just being a prick. There are also a lot of anal-retentive wankbaskets out there who define good and bad programmers based on cosmetic traits so that their definition of “good code” is “code that looks like I wrote it”. I feel like the spaghetti code problem is better-defined in scope than the larger but more subjective problem of “bad code”. We’ll never agree on tabs-versus-spaces, but we all know that spaghetti code is incomprehensible and useless. Moreover, as spaghetti code is an especially common and damaging case of bad code, assessing causes and preventions for this subtype may be generalizable to other categories of bad code.
People usually use “bad code” to mean “ugly code”, but if it’s possible to determine why a piece of code is bad and ugly, and to figure out a plausible fix, it’s already better than most spaghetti code. Spaghetti code is incomprehensible and often unfixable. If you know why you hate a piece of code, it’s already above spaghetti code in quality, since the latter is just featureless gibberish.
What causes spaghetti code? Goto statements were the leading cause of spaghetti code at one time, but goto has fallen so far out of favor that it’s a non-concern. Now the culprit is something else entirely: the modern bastardization of object-oriented programming. Inheritance is an especially bad culprit, and so is premature abstraction: using a parameterized generic with only one use case in mind, or adding unnecessary parameters. I recognize that this claim– that OOP as practiced is spaghetti code– is not a viewpoint without controversy. Nor was it without controversy, at one time, that goto was considered harmful.
One of the biggest problems in comparative software (that is, the art of comparing approaches, techniques, languages, or platforms) is that most comparisons focus on simple examples. At 20 lines of code, almost nothing shows its evilness, unless it’s contrived to be dastardly. A 20-line program written with goto will usually be quite comprehensible, and might even be easier to reason about than the same program written without goto. At 20 lines, a step-by-step instruction list with some explicit control transfer is a very natural way to envision a program. For a static program (i.e. a platonic form that need never be changed and incurs no maintenance) that can be read in one sitting, that might be a fine way to structure it. At 20,000 lines, the goto-driven program becomes incomprehensible. At 20,000 lines, the goto-driven program has been hacked and expanded and tweaked so many times that the original vision holding the thing together has vanished, and the fact that a program can be in a piece of code “from anywhere” means that to safely modify the code requires confidence quantified by “from everywhere”. Everything is everywhere. Not only does this make the code difficult to comprehend, but it means that every modification to the code is likely to make it worse, due to unforeseeable chained consequences. Over time, the software becomes “biological”, by which I mean that it develops behaviors that no one intended but that other software components may depend on in hidden ways.
Goto failed, as a programming language construct, because of these problems imposed by the unrestricted pinging about a program that it created. Less powerful, but therefore more specifically targeted, structures such as procedures, functions, and well-defined data structures came into favor. For the one case where people needed global control flow transfer (error handling) exceptions were developed. This was a progress from the extreme universality and abstraction of a goto-driven program to the concretion and specificity of pieces (such as procedures) solving specific problems. In unstructured programming, you can write a Big Program that does all kinds of stuff, add features on a whim, and alter the flow of the thing as you wish. It doesn’t have to solve “a problem” (so pedestrian…) but it can be a meta-framework with an embedded interpreter! Structured programming encouraged people to factor their programs into specific pieces that solved single problems, and to make those solutions reusable when possible. It was a precursor of the Unix philosophy (do one thing and do it well) and functional programming (make it easy to define precise, mathematical semantics by eschewing global state).
Another thing I’ll say about goto is that it’s rarely needed as a language-level primitive. One could achieve the same effect using a while-loop, a “program counter” variable defined outside that loop that the loop either increments (step) or resets (goto) and a switch-case statement using it. This could, if one wished, be expanded into a giant program that runs as one such loop, but code like this is never written. What the fact that this is almost never done seems to indicate is that goto is rarely needed. Structured programming thereby points out the insanity of what one is doing when attempting severely non-local control flows.
Still, there was a time when abandoning goto was extremely controversial, and this structured programming idea seemed like faddish nonsense. The objection was: why use functions and procedures when goto is strictly more powerful?
Analogously, why use referentially transparent functions and immutable records when objects are strictly more powerful? An object, after all, can have a method called run or call or apply so it can be a function. It can also have static, constant fields only and be a record. But it can also do a lot more: it can have initializers and finalizers and open recursion and fifty methods if one so chooses. So what’s the fuss about this functional programming nonsense that expects people to build their programs out of things that are much less powerful, like records whose fields never change and whose classes contain no initialization magic?
The answer is that power is not always good. Power, in programming, often advantages the “writer” of code and not the reader, but maintenance (i.e. the need to read code) begins subjectively around 2000 lines or 6 weeks, and objectively once there is more than one developer on a project. On real systems, no one gets to be just a “writer” of code. We’re readers, of our own code and of that written by others. Unreadable code is just not acceptable, and only accepted because there is so much of it and because “best practices” object-oriented programming, as deployed at many software companies, seem to produce it. A more “powerful” abstraction is more general, and therefore less specific, and this means that it’s harder to determine exactly what it’s used for when one has to read the code using it. This is bad enough, but single-writer code usually remains fairly disciplined: the powerful abstraction might have 18 plausible uses, but only one of those is actually used. There’s a singular vision (although usually an undocumented one) that prevents the confusion. The danger sets in when others who are not aware of that vision have to modify the code. Often, their modifications are hacks that implicitly assume one of the other 17 use cases. This, naturally, leads to inconsistencies and those usually result in bugs. Unfortunately, people brought in to fix these bugs have even less clarity about the original vision behind the code, and their modifications are often equally hackish. Spot fixes may occur, but the overall quality of the code declines. This is the spaghettification process. No one ever sits down to write himself a bowl of spaghetti code. It happens through a gradual “stretching” process and there are almost always multiple developers responsible. In software, “slippery slopes” are real and the slippage can occur rapidly.
Object-oriented programming, originally designed to prevent spaghetti code, has become (through a “design pattern” ridden misunderstanding of it) one of the worst sources of it. An “object” can mix code and data freely and conform to any number of interfaces, while a class can be subclassed freely about the program. There’s a lot of power in object-oriented programming, and when used with discipline, it can be very effective. But most programmers don’t handle it well, and it seems to turn to spaghetti over time.
One of the problems with spaghetti code is that it forms incrementally, which makes it hard to catch in code review, because each change that leads to “spaghettification” seems, on balance, to be a net positive. The plus is that a change that a manager or customer “needs yesterday” gets in, and the drawback is what looks like a moderate amount of added complexity. Even in the Dark Ages of goto, no one ever sat down and said, “I’m going to write an incomprehensible program with 40 goto statements flowing into the same point.” The clutter accumulated gradually, while the program’s ownership transferred from one person to another. The same is true of object-oriented spaghetti. There’s no specific point of transition from an original clean design to incomprehensible spaghetti. It happens over time as people abuse the power of object-oriented programming to push through hacks that would make no sense to them if they understood the program they were modifying and if more specific (again, less powerful) abstractions were used. Of course, this also means that fault for spaghettification is everywhere and nowhere at the same time: any individual developer can make a convincing case that his changes weren’t the ones that caused the source code to go to hell. This is part of why large-program software shops (as opposed to small-program Unix philosophy environments) tend to have such vicious politics: no one knows who’s actually at fault for anything.
Incremental code review is great at catching the obvious bad practices, like mixing tabs and spaces, bad variable naming practices, and lines that are too long. That’s why the more cosmetic aspects of “bad code” are less interesting (using a definition of “interesting” synonymous with “worrisome”) than spaghetti code. We already know how to solve them in incremental code review. We can even configure our continuous-integration servers to reject such code. As for spaghetti code, where there is no clear definition, this is difficult if not impossible to do. Whole-program review is necessary to catch that, but I’ve seen very few companies willing to invest the time and political will necessary to have actionable whole-program reviews. Over the long term (10+ years) I think it’s next to impossible, except among teams writing life- or mission-critical software, to ensure this high level of discipline in perpetuity.
The answer, I think, is that Big Code just doesn’t work. Dynamic typing falls down in large programs, but static typing fails in a different way. The same is true of object-oriented programming, imperative programming, and to a lesser but still noticeable degree (manifest in the increasing number of threaded state parameters) in functional programming. The problem with “goto” wasn’t that goto was inherently evil, so much as that it allowed code to become Big Code very quickly (i.e. the threshold of incomprehensible “bigness” grew smaller). On the other hand, the frigid-earth reality of Big Code is that there’s “no silver bullet”. Large programs just become incomprehensible. Complexity and bigness aren’t “sometimes undesirable”. They’re always dangerous. Steve Yegge got this one right.
This is why I believe the Unix philosophy is inherently right: programs shouldn’t be vague, squishy things that grow in scope over time and are never really finished. A program should do one thing and do it well. If it becomes large and unwieldy, it’s refactored into pieces: libraries and scripts and compiled executables and data. Ambitious software projects shouldn’t be structured as all-or-nothing single programs, because every programming paradigm and toolset breaks down horribly on those. Instead, such projects should be structured as systems and given the respect typically given to such. This means that attention is paid to fault-tolerance, interchangeability of parts, and communication protocols. It requires more discipline than the haphazard sprawl of big-program development, but it’s worth it. In addition to the obvious advantages inherent in cleaner, more usable code, another benefit is that people actually read code, rather than hacking it as-needed and without understanding what they’re doing. This means that they get better as developers over time, and code quality gets better in the long run.
Ironically, object-oriented programming was originally intended to encourage something looking like small-program development. The original vision behind object-oriented programming was not that people should go and write enormous, complex objects, but that they should use object-oriented discipline when complexity is inevitable. An example of success in this arena is in databases. People demand so much of relational databases in terms of transactional integrity, durability, availability, concurrency and performance that complexity is outright necessary. Databases are complex beasts, and I’ll comment that it has taken the computing world literally decades to get them decent, even with enormous financial incentives to do so. But while a database can be (by necessity) complex, the interface to one (SQL) is much simpler. You don’t usually tell a database what search strategy to use; you write a declarative SELECT statement (describing what the user wants, not how to get it) and let the query optimizer take care of it.
Databases, I’ll note, are somewhat of an exception to my dislike of Big Code. Their complexity is well-understood as necessary, and there are people willing to devote their careers entirely to mastering it. But people should not have to devote their careers to understanding a typical business application. And they won’t. They’ll leave, accelerating the slide into spaghettification as the code changes hands.
Why Big Code? Why does it exist, in spite of its pitfalls? And why do programmers so quickly break out the object-oriented toolset without asking first if the power and complexity are needed? I think there are several reasons. One is laziness: people would rather learn one set of general-purpose abstractions than study the specific ones and when they are appropriate. Why should anyone learn about linked lists and arrays and all those weird tree structures when we already have ArrayList? Why learn how to program using referentially transparent functions when objects can do the trick (and so much more)? Why learn how to use the command line when modern IDEs can protect you from ever seeing the damn thing? Why learn more than one language when Java is already Turing-complete? Big Code comes from a similar attitude: why break a program down into small modules when modern compilers can easily handle hundreds of thousands of lines of code? Computers don’t care if they’re forced to contend with Big Code, so why should we?
However, more to the point of this, I think, is hubris with a smattering of greed. Big Code comes from a belief that a programming project will be so important and successful that people will just swallow the complexity– the idea that one’s own DSL is going to be as monumental as C or SQL. It also comes from a lack of willingness to declare a problem solved and a program finished even when the meaningful work is complete. It also comes from a misconception about what programming is. Rather than existing to solve well-defined problems and then get out of the way, as small-program methodology programs do, they become more than that. Big Code projects often have an overarching and usually impractical “vision” that involves generating software for software’s sake. This becomes a mess, because “vision” in a corporate environment is usually bike-shedding that quickly becomes political. Big Code programs always reflect the political environment that generated them (Conway’s Law) and this means that they invariably look more like collections of parochialisms and inside humor than the more universal languages of mathematics and computer science.
There is another problem in play. Managers love Big Code, because when the programmer-to-program relationship is many-to-one instead of one-to-many, efforts can be tracked and “headcount” can be allocated. Small-program methodology is superior, but it requires trusting the programmers to allocate their time appropriately to more than one problem, and most executive tyrannosaurs aren’t comfortable doing that. Big Code doesn’t actually work, but it gives managers a sense of control over the allocation of technical effort. It also plays into the conflation of bigness and success that managers often make (cf. the interview question for executives, “How many direct reports did you have?”) The long-term spaghettification that results from Big Code is rarely an issue for such managers. They can’t see it happen, and they’re typically promoted away from the project before this becomes an issue.
In sum, spaghetti code is bad code, but not all bad code is spaghetti. Spaghetti is a byproduct of industrial programming that is usually, but not always, an entropic result of too many hands passing over code, and an inevitable outcome of large-program methodologies and the bastardization of “object-oriented programming” that has emerged out of these defective, executive-friendly processes. The antidote to spaghetti is an aggressive and proactive refactoring effort focused on keeping programs small, effective, clean in source code, and most of all, coherent.