
The four best books for software engineers


This semester I’ve had the good fortune to be involved in the pilot of the Digital Problem Solving Initiative, a program being started out of the law school and the Berkman Center to allow Harvard students (from all schools) to get some mentored experience in dealing with various sorts of digital solutions. I’ve been leading a group of law students, FAS graduate students, and college students looking at some of the issues around the research data coming in from the HarvardX courses that are running on the edX platform. It’s a great group, and we have a nice combination of law questions (just what does FERPA require, anyway?), policy matters (how do we preserve privacy in these sorts of data sets?) and programming problems (they really think we can use *this* data?).

The group is a lot of fun, and we’ve had a stream of visitors as well to liven things up. A couple of weeks ago, Doc Searls and his wife joined us. Doc is one of the authors of the Cluetrain Manifesto, and I mentioned that it was one of the three or four books that all programmers should read. After the session, a couple of the participants asked what the others were.

Which got me to thinking. Most of the people who talk about the three or four books that everyone should read only tell you one of the books at any one time, and if you take the union of all those times the total is in the 20s or 30s. So trying to come up with all of them at once is difficult, and takes some thinking. It’s much easier to come up with one (the “desert island book”), or 20. But four?

But here we go…

First on the list would be Tracy Kidder’s The Soul of a New Machine. This is a non-fiction report of the development of a 32-bit minicomputer by Data General at the end of the 1970s. But the reason to read a book about a long-forgotten machine with what is now a laughable computing capability by a company that no longer exists is to see how little the technology industry changes. The personalities, problems, and solutions described in this book are as relevant and real now as they were then. There are the same personalities (the architect, the team lead, marketing, management), the same situations (despair, joy, post-shipping depression), and the same unvarying truths (the way to get the impossible done is to give the task to an intern, but make sure you don’t tell the intern it is impossible). I re-read this book every couple of years, and it still rings true.

Second on the list is Fred Brooks’s The Mythical Man-Month. Talking about a machine that predates the one described in Kidder’s book, this book contains most of what is true about software engineering. This is the book that first enunciated the principle that adding people to a late project makes it later, but there is much more here. This is where I learned how to estimate how long a project will take, how to size a team, and so much else. The problem with this book is that it is such a classic that everyone reads it early in their career (or education) and then forgets it. It should be re-read early and often. These truths don’t change, and the combination of this book and Kidder’s will remind you that high technology is still mostly about people (or, as I’m sometimes known to say, the technical problems are easy compared to the problems of primate behavior).

Third on the list is the aforementioned Cluetrain Manifesto. It packs a lot of truth about the new network world into a small number of pages, and is necessary reading for those who deal with the new internet world. A lot of what is said in this work is now common knowledge, so I sometimes worry that those reading it now won’t understand how radical it was when it was first published (and on the Web, no less). But other parts of the book are still not clearly understood, and are good to read.

My fourth book would be Kernighan and Ritchie, The C Programming Language, more generally known by its initials, K and R. This is the only overtly technical book on the list, and given how few people live their lives in the C language anymore, may seem an odd choice. Programmers shouldn’t read this book to learn the C language (although you can’t read it and not learn the C language). Programmers should read this book to understand how computers work, and to see an example of great writing about a technical subject.

Any computer language is a model for computation. COBOL models computation as a filing cabinet, allowing you to take something out of the cabinet, do some simple things with it, and put it back. Java models computation as a set of interacting objects (unless you use J2EE, in which case the model is, well, more like a filing cabinet). LISP models computation as a set of composable functions.

C models computation as a computer; more precisely C models computation as a PDP-11. But this isn’t a bad model to learn, as most of our current computers are, at heart, PDP-11s. Learning C lets you understand what the computer is doing; it is the most useful assembly language you can learn.
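To make the "C as a machine model" point concrete, here is a minimal sketch. It is not taken from K and R, and the function names sum_ints and copy_string are made up for illustration. It assumes nothing beyond standard C: an array is treated as a run of adjacent memory cells, and a pointer is an address that gets stepped from one cell to the next.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints by walking a pointer across contiguous memory.
   (Hypothetical helper, named here only for illustration.) */
int sum_ints(const int *p, size_t n)
{
    assert(p != NULL || n == 0);
    int total = 0;
    while (n-- > 0)
        total += *p++;        /* load the value, then advance the address */
    return total;
}

/* Copy a NUL-terminated string one byte at a time and return the
   number of characters copied, not counting the terminator.
   The caller must make sure dst has room for the string plus the
   terminator. (Also a hypothetical helper.) */
size_t copy_string(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = 0;
    while ((*dst++ = *src++) != '\0')   /* store, test, step both addresses */
        n++;
    return n;
}
```

Each loop body corresponds closely to what a simple processor would do: load from an address, add or store, and increment the address. Calling sum_ints on an int array of four elements, or copy_string into a buffer large enough for the source string, is all the usage there is.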

But the main reason for reading K and R is to be exposed to the most elegant exposition of a programming language (and programming) I know. The elegance of the writing is so pervasive that you don’t even notice it (which is true elegance), but everything just makes sense. As a model for how to explain the complex, there is no better example.

And those are my four. Unlike most books in the software section of your bookstore (or Amazon), I’m reasonably confident that they will be the four I would pick in ten years. They have all not only aged well, but become better with age. Something we can all aspire to do, both in our work and our selves.

