Black Boxes and Magic

01 Oct 2019

I recently read an article online titled “First, Do No Harm: A Hippocratic Oath for Software Developers.” I recommend reading this article; it’s fantastic. I’d like to focus on one aspect that really stood out to me, though.

In it, the author talks about the nature of abstractions as a synonym for reusable code. While not specifically dismissing them as useless, he raises an excellent point, one I have been rather vociferously accused of being an inadequate developer for supporting. With respect to reusable software’s increasing tendency to seem like “black boxes and magic”, the author writes, “I am not a Luddite, but my fear – based on observations of hundreds of practitioners – is that we adopt these aforementioned technologies without fully understanding what they do or how they do it.”

Now, I am a nobody. I’m a college drop-out, uncredentialed in just about anything except having a driver’s license, and just kind of float job to job. The author, however, is a Ph.D. at a prestigious university with books and articles out the wazoo to his name. Yet, we’ve both been saying the same thing, him from observing literally hundreds of other practitioners, and myself from personal, first-hand experience. So, maybe just this once, give what I have to say in this article a little credence before accusing me of inadequacy?

I go even further and suggest that this age of abstractions proactively discourages a practitioner from understanding them. How can you successfully apply an abstraction if you don’t have any intuition about its applicability in the first place? That is, after all, what understanding is all about.

One fallacious analogy people often make is, “You don’t need to understand how a car works to drive a car.” This is a fantastic metaphor for abstractions and black boxes, because, unbeknownst to those who frequently resort to it, it works both ways. Yes, the statement is entirely correct; however, if your car blows a tire or it starts leaking oil, you’ll be right screwed. This is the automotive equivalent of an on-call IT developer’s pager going off at 3AM, exactly 24 hours into their weekend. Do you know enough to limp the car to the nearest garage? Or, will you just sit in the middle of rush-hour traffic (if you’re lucky enough to have this happen in a city with decent infrastructure!), waiting for a response to your call to 2nd-level support/your nearest towing company? If you’re a programmer, in any capacity, I’d expect you to have some facility with how a computer actually works, how the software stack that sits on top of it works, and the resulting interactions between the two. Likewise, if you’re planning a road trip out into the desert, I expect you to carry enough provisions, not just for you but for your car, to limp yourself to a garage if trouble happens. It’s just common sense. 24-hour road service doesn’t mean they’ll get to you right away; it only means you can call 24 hours a day to get help.

This is one of a couple of reasons why I prefer Forth as it is practiced by Chuck Moore: everything is in your face, and there is a very close correspondence between your source code and what gets executed by the computer. Even when writing code that relies heavily upon abstractions and what looks like black magic, you can search through the source code to find the meaning of a definition, study it, and create your own if it’s not suitable for your needs. You can reliably study the existing code to learn why it doesn’t work, and inform your superiors not only why things are broken, but how long it’ll take to fix (assuming you have that experience, of course). In fact, writing your own implementation of everything you come to depend upon is encouraged by Chuck Moore, from a simple multiplication routine to your own development tools (possibly including Forth itself!), depending upon context of course. As a result, I think some people early in the Forth community’s history misunderstood Moore by taking his suggestions too literally, and let the pendulum swing too far the other way, away from black boxes and magic, to the point where writing your own Forth is seen as something of a rite of passage. I know, because I went through this phase myself. I now know that this is not at all what Moore was trying to get at. Years later, Moore lamented publicly that too many people are playing games with their Forth implementations and are not busy writing real-world applications.

The point Moore was trying to make is so frequently missed, it’s almost embarrassing. It’s simply this: you’re a technician; and, like most technicians, you need a set of tools on which you can rely. By rely, I mean intuitively know which solution is the right one for a given job. When is a hammer better than a mallet? When is a crescent wrench better than a hex driver? In most cases, you can reasonably substitute one for the other; but, not always! You don’t always have to make your own hammer from scratch; but, if you’ve ever used a block of wood to soften the blow of a hammer, you basically invented your own purpose-built, task-optimized mallet. That kind of resourcefulness, based on your understanding of first principles, is the point Moore was trying to get at. His goal was to encourage the development and understanding of first principles.

You can’t do that with software unless you have a profound understanding of that very same software, making this a grotesque chicken-and-egg situation. That means not just having ready access to the source code, but having it in an easy-to-understand, easy-to-adjust form. If necessary, how-to-hack and how-this-thing-works documentation would be required. The value proposition of open source is severely undercut if the source for a component you depend heavily upon, and need to change or better understand, consists of tens of thousands of lines of code strewn about hundreds or thousands of source files, with no statically determinable path to understanding how one module relates to another. If your editing experience requires more than 8 files open concurrently to understand an interface or to implement a new feature, you might want to ask yourself if there’s a better way of organizing the code you’re working on. I would even argue that four is a more reasonable limit to strive for.

In my opinion, informed by personal experience, this is why I tend to be more successful with code written in C and Forth than I am with code written in Python or Smalltalk; all the IDE black magic in the world falls on its face with a goopy splat! the moment you introduce heavy reliance upon polymorphism into the mix. Trying to figure out how programs which rely heavily upon polymorphism work is a nigh impossible task for me.
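
To make the difficulty concrete, here is a minimal sketch in C of the dispatch mechanism that polymorphism boils down to. Everything in it (the `shape` struct, `square_area`, `triangle_area`, `total_area`) is invented for illustration and comes from no real codebase. The point is only that the call inside `total_area` names no function at all; which code runs depends on data set up somewhere else, so searching the source for the callee gets you nowhere.

```c
#include <stddef.h>

/* Hypothetical "object": a function pointer plus one field of data. */
struct shape {
    int (*area)(const struct shape *self);  /* chosen at run time */
    int dim;
};

/* Two concrete implementations. A text search finds these easily... */
int square_area(const struct shape *s)   { return s->dim * s->dim; }
int triangle_area(const struct shape *s) { return s->dim * s->dim / 2; }

/* ...but nothing in this function says which of them gets called.
   The answer lives in whatever code filled in each shape's `area`. */
int total_area(const struct shape *const shapes[], size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += shapes[i]->area(shapes[i]);
    return sum;
}
```

With two implementations in one file this is easy to follow. Spread a few dozen of them across a large code base, with the function pointers assigned far from where they are called, and the reader is left reconstructing run-time state in their head.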

Moore is often proud of having written his own multiplication and square root programs in Forth. But, did he write this code for every single platform he developed on? Turns out, not exactly. He kept a personal library of all the reusable concepts and code he’d written himself over the years. With each new project, he’d contribute new code to it for future reference. If an earlier routine needed adjustment for his new target application and/or platform, then he would do so.
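
As an illustration of what such routines can look like, here is a minimal sketch in C rather than Forth. It is not Moore’s code; the names `mul_u32` and `isqrt_u32` are made up for this example, and the algorithms are the textbook shift-and-add multiply and bit-by-bit integer square root, assuming 32-bit unsigned arithmetic.

```c
#include <stdint.h>

/* Shift-and-add multiplication, the kind of routine you write for a
   CPU with no multiply instruction. The result wraps modulo 2^32. */
uint32_t mul_u32(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u)          /* low bit of b set: add the shifted a */
            product += a;
        a <<= 1;
        b >>= 1;
    }
    return product;
}

/* Integer square root, rounded down, computed one result bit at a time
   using only shifts, adds, and compares. */
uint32_t isqrt_u32(uint32_t n)
{
    uint32_t root = 0;
    uint32_t bit = 1u << 30;  /* highest power of four that fits in 32 bits */

    while (bit > n)           /* start at the largest power of four <= n */
        bit >>= 2;

    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}
```

Both routines are short enough to read in one sitting, which is what makes them practical to keep in a personal library and adjust for a new word size or target.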

Forth isn’t just a language; it’s an entire way of writing software. It’s how we approach the profession; the entire mindset of the developer. These things simply cannot be meaningfully standardized, no matter how much ANSI or Forth 2K folks want them to be. I mean, despite the existence of these standards, the aphorism “If you’ve seen one Forth, you’ve seen one Forth” still very much applies. Worse, it frequently applies across different versions of the same product. GForth 0.4.0 32-bit is a very different creature from 0.7.0 64-bit. Meanwhile, a good Forth programmer is at home in any Forth environment; not because he can depend on a standardized vocabulary (though that helps), but because he understands how the Forth environment works as a tool, and its relationship to the underlying machine. Yes, I might have to type in by hand code that might be reusable elsewhere instead of just linking against it. However, that takes much less time in practice than what’s spent literally shopping for the right solution on the Internet, reading its frequently incomplete and even wrong documentation on how to install it on your platform, writing some test code to familiarize yourself with its API, kicking off debuggers when things go horribly wrong (because what should have been a library is really a framework), etc. before you can successfully deploy it in staging, much less in production.