I’ve been thinking recently about what I want to get out of work; and it seems like, these days, I’ll get the most out of work if I focus on what I personally like about the process, the details of working with code. Which, for me, translates into: paying attention to the shape and feel of the code, trying to write clean code while paying attention to what the code is telling me about what it wants to look like. Not that I don’t enjoy other aspects of programming—for example, when I’m at work, I certainly want to write software that other people will find useful, and I want to produce code as quickly as I can while still feeling proud of it. But, really, it’s the artistry of producing software that makes me actively happy; to me, those other aspects are (productive!) constraints towards that end. And, fortunately, my current work is a good place for me to seek that happiness: we have a pretty good code base, but one with enough quirks that there’s always something to think about in terms of improving it; and we also constantly have new challenges that provide concrete suggestions for where to look next.

I think what I probably mean by “trying to write clean code while paying attention to what the code is telling me about what it wants to look like” is basically Kent Beck’s four rules. It had actually been a little while since I’d looked at them, but they do a good job of putting into words what I’ve been striving towards. The first rule is basic hygiene that I’ve had ingrained for over a decade, and the fourth rule simply isn’t something that I spend too much time worrying about, but the middle two are great: if there’s an idea that’s latent in your code, then make it explicit and put it in one place.

I’ve had several really enjoyable case studies in this recently, and they’ve all ended up feeding into each other. I spent a fair amount of the fall working with one of my colleagues on scaling out some of our map-reduce machinery; and, when I was looking at the computational guts of that, it never felt quite right. We had a dozen or so concrete pairs of map-reduce classes; the interface that those classes conformed to was designed to allow them to be serialized uniformly, which was important, but looking at everything through that lens obscured the underlying internal state that each pair was using.

And then there was the relationship between that internal state and the input: in the simplest case, the input could get transformed into something of the same type as the internal state and then combined with the internal state in a sort of monoidal operation. (E.g. if you’re counting, then the internal state is an integer, and you take the input, replace it by 1, and add it to the internal state; or if you’re summing, then it looks exactly the same except that you leave the input alone instead of replacing it by 1.) And on the reducing side of things, we wanted to think of reducing in terms of combining internal states (which was, again, that same monoid idea) plus some sort of transformation of the internal state to the output. (Where that transformation was usually simple and frequently an identity map.) In fact, that combining was actually why we were looking at the details of this: we knew we wanted to do that for performance reasons, so the question was really whether we could do that in a way that increased the clarity of the code.
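To make the counting and summing examples concrete, here is a minimal sketch in Python. The post doesn't show the real code or name its language, so everything here (`MonoidAggregator`, `prepare`, `map_chunk`, `merge`) is a hypothetical name chosen for illustration. The point is only that counting and summing share the same monoid (integers under addition) and differ in how an input gets turned into a state.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, Iterable, TypeVar

I = TypeVar("I")  # input type
S = TypeVar("S")  # internal state type


@dataclass(frozen=True)
class MonoidAggregator(Generic[I, S]):
    """Hypothetical aggregator whose internal state forms a monoid."""

    zero: S                       # identity element of the monoid
    combine: Callable[[S, S], S]  # associative operation on internal states
    prepare: Callable[[I], S]     # turns one input into a value of the state type

    def map_chunk(self, inputs: Iterable[I]) -> S:
        """Fold one chunk of inputs into a single internal state."""
        return reduce(
            lambda state, item: self.combine(state, self.prepare(item)),
            inputs,
            self.zero,
        )

    def merge(self, states: Iterable[S]) -> S:
        """Combine internal states produced by separate chunks."""
        return reduce(self.combine, states, self.zero)


# Counting replaces every input by 1; summing leaves the input alone.
counting = MonoidAggregator(zero=0, combine=lambda a, b: a + b, prepare=lambda _item: 1)
summing = MonoidAggregator(zero=0, combine=lambda a, b: a + b, prepare=lambda item: item)
```

Because `combine` is associative and `zero` is its identity, merging the per-chunk states gives the same answer as folding all of the inputs in one pass, which is what makes combining on the reduce side safe.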


What I ended up doing was replacing a map / reduce interface that was mediated via a serialization type with the following:

  • An explicit notion of the internal state.
  • Functions to transform the internal state to and from the serialized state.
  • A factoring of reduce into combine composed with a function from the internal state to the output type.
  • The above three were the general interface that I ended up with, but I also provided traits to help in cases where further patterns / special cases revealed themselves: ones where the structure arose from a monoid, ones where reduce and combine were identical.
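As a rough illustration of that shape, here is a sketch in Python. It is not the actual interface from work: the names (`Aggregation`, `to_serialized`, `from_serialized`, `finish`, `IdentityOutput`) are invented for this example, JSON strings stand in for whatever the real serialized type was, and a Python mixin stands in for a trait.

```python
import json
from abc import ABC, abstractmethod
from typing import Generic, Iterable, Tuple, TypeVar

I = TypeVar("I")  # input
S = TypeVar("S")  # explicit internal state
W = TypeVar("W")  # serialized ("wire") form of the state
O = TypeVar("O")  # output


class Aggregation(ABC, Generic[I, S, W, O]):
    """Hypothetical general interface: state, serialization, combine + finish."""

    @abstractmethod
    def empty(self) -> S: ...
    @abstractmethod
    def add_input(self, state: S, item: I) -> S: ...
    @abstractmethod
    def combine(self, left: S, right: S) -> S: ...
    @abstractmethod
    def finish(self, state: S) -> O: ...
    @abstractmethod
    def to_serialized(self, state: S) -> W: ...
    @abstractmethod
    def from_serialized(self, data: W) -> S: ...

    def map_chunk(self, items: Iterable[I]) -> W:
        state = self.empty()
        for item in items:
            state = self.add_input(state, item)
        return self.to_serialized(state)

    def reduce(self, serialized_states: Iterable[W]) -> O:
        # reduce = finish composed with combine over the deserialized states
        state = self.empty()
        for data in serialized_states:
            state = self.combine(state, self.from_serialized(data))
        return self.finish(state)


class IdentityOutput:
    """Mixin for the special case where the output is the internal state itself."""

    def finish(self, state):
        return state


class Mean(Aggregation[float, Tuple[float, int], str, float]):
    def empty(self): return (0.0, 0)
    def add_input(self, state, item): return (state[0] + item, state[1] + 1)
    def combine(self, left, right): return (left[0] + right[0], left[1] + right[1])
    def finish(self, state): return state[0] / state[1] if state[1] else 0.0
    def to_serialized(self, state): return json.dumps(state)
    def from_serialized(self, data): return tuple(json.loads(data))


class Count(IdentityOutput, Aggregation[object, int, str, int]):
    def empty(self): return 0
    def add_input(self, state, item): return state + 1
    def combine(self, left, right): return left + right
    def to_serialized(self, state): return str(state)
    def from_serialized(self, data): return int(data)
```

`Mean` shows why the internal state deserves to be explicit: the state is a (sum, count) pair, which is neither the input type nor the output type. `Count` shows the kind of special case a helper can absorb, since its output is just its state.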

And, I will say: doing that felt great. Some of that great feeling is probably the builder’s high that Rands recently wrote about. But I think that’s actually missing the mark about why this really mattered to me: I didn’t feel that I was building so much as that I was uncovering structure that was latent in the code. So, for me, it’s not so much the builder’s high as the scientist’s high, or the explorer’s high, or the mystic’s high. Building is good, but, for me, getting closer to underlying structure in the universe is better.


Happily, I can combine those two: that transformation I did was useful for the purpose of the scalability feature we were working on at the time, and it’s been useful in each of the two projects that I’ve worked on since, so I really did build something. In the next one, we were bringing that map-reduce functionality over to a chunk of functionality that hadn’t used it before; as part of that, I had to extend it to handle another use case, which led to teasing out superclasses in a few cases, which I wouldn’t have been able to do nearly as easily if I hadn’t teased out responsibilities in the prior project. And, happily, that project itself led to a similar (albeit smaller) improvement in understanding of the underlying structures: there was one assumption that the prior code had been able to make, and I got to understand where that assumption was important and where that assumption was coincidental.

In the next project I worked on, I wanted to understand the memory usage of the data structures in question. Which was made much easier by my understanding of the types in question; but also, the evolution of that code over two or three hours had this wonderful flow. At first, I took one example, and tried to figure out what memory it used; I plugged it into a test and wrote a ridiculous 15-line comment explaining that. Then I poked around a bit and, after going to the bathroom a couple of times (too much information? I should probably write a blog post at some point on the role that the consumption of water plays in my programming methodology…), I finally gave in to my brain’s suggestion that I should probably extract a class for doing that top-level memory estimation instead of sticking it in a method that belonged to the class that used the memory estimates.

And, once I’d done that, everything just flowed out in the most natural way possible. (Just as the water I’d been drinking flowed when, uh, never mind.) I TDD’d my way out from my original example, and in doing that, each line in that 15-line comment moved to the right place in the code: before I knew it, I had my overall memory estimator, a trait with memory-related comments, methods in the classes I was using that talked about how much memory they were using, and all of the information was where it belonged. (Plus, I fixed a mistake in my original memory estimate that became clear once I could look at all of the components of that estimate in the right context.)
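For a sense of where the pieces ended up, here is a sketch in Python of that general arrangement. It is an illustration only: the class names (`MemoryEstimated`, `CounterState`, `DistinctState`, `MemoryEstimator`) are hypothetical, and every byte count is a made-up placeholder rather than a measurement from the real code.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class MemoryEstimated(ABC):
    """Stand-in for the trait: anything that can report its own memory use."""

    @abstractmethod
    def estimated_bytes(self) -> int:
        """Approximate bytes used by one instance of this structure."""


class CounterState(MemoryEstimated):
    WORD_BYTES = 8  # assumption: one machine word for the counter

    def estimated_bytes(self) -> int:
        return self.WORD_BYTES


class DistinctState(MemoryEstimated):
    ENTRY_BYTES = 48  # assumption: placeholder per-entry cost of a hash set

    def __init__(self, expected_entries: int) -> None:
        self.expected_entries = expected_entries

    def estimated_bytes(self) -> int:
        return self.ENTRY_BYTES * self.expected_entries


class MemoryEstimator:
    """Top-level estimate, kept out of the class that consumes the estimates."""

    def __init__(self, per_group_overhead_bytes: int = 64) -> None:
        # assumption: fixed bookkeeping cost per group, also a placeholder
        self.per_group_overhead_bytes = per_group_overhead_bytes

    def estimate(self, states: Iterable[MemoryEstimated], groups: int) -> int:
        per_group = self.per_group_overhead_bytes + sum(
            state.estimated_bytes() for state in states
        )
        return groups * per_group
```

Each structure documents its own cost next to its own fields, and the estimator only knows how to add those costs up, so a mistake in one component is visible in the place where it belongs.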


Good times; I really appreciate being able to get this sort of experience out of my work. Or, really more than that: rewarding times, nourishing times.
