10 Technical Papers Every Programmer Should Read (At Least Twice)
Let me preface this post by saying that no programmer should feel compelled to read any of these papers. I list them because I think that they provide a breadth of information that is generally useful and interesting from a computer science perspective. What you do with that information is your prerogative, including ignoring it completely. Instead, learn whatever you think is important for your job, education, interests, etc.
Inspired by a fabulous post by Michael Feathers in a similar vein, I’ve composed this post as a sequel to the original. That is, while I agree almost wholly with Mr. Feathers’s¹ choices, I tend to think that his choices are design-oriented² and/or philosophical. In no way do I disparage that approach; instead, I think that there is room for another list that is more technical in nature. But the question remains: where to go next? In this post I will offer some guidance based on my own readings. The papers chosen herein are not intended to act as a C.S. hall of fame, but instead hope to accomplish the following:
- All papers are freely available online (i.e. not pay-walled)
- They are technical (at times highly so)
- They cover a wide range of topics
- They form the basis of knowledge that every great programmer should know, and may already
Because of these constraints I will have missed some great papers, but for the most part I think this list is solid. Please feel free to disagree and offer alternatives in the comments.
A Visionary Flood of Alcohol
Fundamental Concepts in Programming Languages (link to paper)
by Christopher Strachey
Quite possibly the most influential set of lecture notes in the history of computer science. L-values and R-values, along with parametric and ad hoc polymorphism, were all defined in this paper. Much of the content may already occupy your mind, but the sheer weight of the heady topics assembled in one place is stunning to observe.
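To make those terms concrete, here is a small sketch in Python (my choice of language for illustration; none of this code comes from Strachey's notes, and the function names `reverse` and `describe` are hypothetical). It contrasts one definition that works uniformly over every type with a function whose implementation is selected per type, and marks where an expression is used as a location versus as a value.

```python
from functools import singledispatch
from typing import List, TypeVar

T = TypeVar("T")


def reverse(items: List[T]) -> List[T]:
    """Parametric polymorphism: one definition, uniform over every element type."""
    return items[::-1]


@singledispatch
def describe(value) -> str:
    """Ad hoc polymorphism: a separate implementation is chosen per argument type."""
    return "something else"


@describe.register
def _(value: int) -> str:
    return "an integer"


@describe.register
def _(value: str) -> str:
    return "a string"


# L-value vs. R-value: the left side of an assignment denotes a location,
# while the right side is evaluated to the value stored in that location.
cells = [0, 0]
cells[0] = 1 + 2  # cells[0] is used as an L-value, 1 + 2 as an R-value
```

Note that `reverse` never inspects its elements, while `describe` behaves differently for each registered type; that difference is the heart of the distinction.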
Why Functional Programming Matters (link to paper)
by John Hughes
I found this paper extremely lucid on the advantages of functional programming with the added advantage of showing off examples of beautiful code. There are seemingly an infinite number of papers on the topic of laziness with streams and generators, but I’ve yet to find a better treatment. Finally, I’ve always been partial to Reginald Braithwaite’s “Why Why Functional Programming Matters Matters” as a complement to this paper.
An Axiomatic Basis for Computer Programming (link to paper)
by C. A. R. Hoare
I came to this paper late in my career, but when I finally found it I felt like I had been hit by a bus. At the core of the paper lies the following assertion:
P {Q} R
Taken to mean:
If the assertion P is true before initiation of a program Q, then the assertion R will be true on its completion
Where P is a precondition, Q is the execution of a program, and R is the result.
In other words, as long as a program/function/method/etc. receives a set of parameters conforming to its preconditions, its execution is guaranteed to produce a well-formed result. This paper inspired me to explore contracts programming in Clojure, but the proof implications reached in Hoare’s paper run much deeper.
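As a rough illustration of the triple, the sketch below checks P before a function runs and R after it returns. This is runtime contract checking, which is much weaker than the static proofs Hoare describes, and the names `contract`, `ContractError`, and `isqrt` are hypothetical ones invented for this example.

```python
from typing import Callable


class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre: Callable[..., bool], post: Callable[..., bool]):
    """Wrap a function Q so that P is checked before it runs and R afterwards."""
    def decorate(fn):
        def wrapper(*args):
            if not pre(*args):
                raise ContractError("precondition P violated")
            result = fn(*args)
            if not post(result, *args):
                raise ContractError("postcondition R violated")
            return result
        return wrapper
    return decorate


# P: n is non-negative.  R: r is the integer square root of n.
@contract(pre=lambda n: n >= 0,
          post=lambda r, n: r * r <= n < (r + 1) * (r + 1))
def isqrt(n: int) -> int:
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r
```

Calling `isqrt(10)` passes both checks, while `isqrt(-1)` fails the precondition before the body ever runs. The checks only test the inputs you happen to supply; the paper's point is that the same P and R can be proven to hold for all inputs.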
Time, Clocks, and the Ordering of Events in a Distributed System (link to paper)
by Leslie Lamport (1978)
Lamport has been highly influential in the field of distributed computation for a very long time and almost any of his papers on the subject should impress. However, this particular paper is likely his most influential, and it single-handedly defined two branches of study in distributed computing:
- The reasoning of event ordering in distributed systems and protocols
- The state machine approach to redundancy
The most amazing aspect of this paper is that after you read it you might think to yourself, “Well, of course that’s how it should work.” Jim Gray once said that this paper was both obvious and brilliant. I would say that there is no higher compliment.
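For a taste of the event-ordering idea, here is a minimal sketch of a logical clock in the spirit of the paper. The `Process` class and its method names are my own hypothetical framing, and the `max(...) + 1` receive rule is one common formulation that satisfies the paper's requirement that a receiver's clock end up later than the message's timestamp.

```python
class Process:
    """A process that keeps a logical clock (a counter, not wall time)."""

    def __init__(self, name: str):
        self.name = name
        self.clock = 0

    def local_event(self) -> int:
        # Tick the clock for every event that happens in this process.
        self.clock += 1
        return self.clock

    def send(self) -> int:
        # Sending is an event; the returned timestamp travels with the message.
        self.clock += 1
        return self.clock

    def receive(self, timestamp: int) -> int:
        # Advance past both our own clock and the sender's timestamp, so the
        # receive event is always ordered after the corresponding send.
        self.clock = max(self.clock, timestamp) + 1
        return self.clock
```

If process `a` sends a message stamped 2, then `b.receive(2)` leaves `b` at 3 or later no matter what `b`'s clock was before, so "sent before received" is always reflected in the timestamps. Breaking ties by process name would turn this partial order into a total one.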
On Understanding Types, Data Abstraction, and Polymorphism (link to paper)
by Luca Cardelli and Peter Wegner
I had originally thought to list Milner’s A Theory of Type Polymorphism in Programming, but thought that a survey paper would be better. I must admit that my own readings have not gone deep into the exploration of type systems, so any additional suggestions would be greatly appreciated.
Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I (link to paper)
by John McCarthy
It’s become a cliche to recommend McCarthy’s seminal paper introducing LISP. I will not count this toward the target of 10, but I would be remiss to exclude it, because it’s a great read that is nicely supplemented with the study of a simple implementation of McCarthy’s original specification.³
The Machinery for Change
Predicate Dispatch: A Unified Theory of Dispatch (link to paper)
by Michael Ernst, Craig Kaplan, and Craig Chambers
Describes a method for dispatching functions based not on a static set of rules, but instead as the traversal of a decision tree that could be built at compile-time and extended incrementally at runtime. What this means is that dispatch is controlled and adapted based on an open set of conditions describing the rules of dispatch. This stands opposed to the current popular trend of languages whose dispatch is hard-coded and not open for extension at all.
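The flavor of the idea can be sketched in a few lines, with heavy caveats. The `PredicateFunction` class below is a hypothetical toy: it simply tries the most recently registered predicate first, whereas the paper orders methods by logical implication between predicates and compiles the result into an efficient decision tree.

```python
class PredicateFunction:
    """A function whose methods are guarded by arbitrary predicates."""

    def __init__(self, name: str):
        self.name = name
        self._methods = []  # (predicate, implementation) pairs, newest last

    def when(self, predicate):
        """Register an implementation that applies when predicate(*args) holds."""
        def register(impl):
            self._methods.append((predicate, impl))
            return impl
        return register

    def __call__(self, *args):
        # Simplification: newest matching method wins (not implication order).
        for predicate, impl in reversed(self._methods):
            if predicate(*args):
                return impl(*args)
        raise LookupError(f"no method of {self.name} applies to {args!r}")


classify = PredicateFunction("classify")


@classify.when(lambda n: isinstance(n, int))
def _(n):
    return "integer"


@classify.when(lambda n: isinstance(n, int) and n < 0)
def _(n):
    return "negative integer"


# The set of rules stays open: new cases can be added long after definition.
@classify.when(lambda n: n == 0)
def _(n):
    return "zero"
```

Dispatch here depends on the value of the argument, not just its class, and the rule set can keep growing at runtime, which are the two properties the paragraph above emphasizes.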
Equal Rights for Functional Objects or, The More Things Change, The More They Are the Same (link to paper)
by Henry G. Baker
At the heart of Clojure and ClojureScript’s implementation is #equiv that is in turn based off of Henry Baker’s egal operator introduced in this paper. Briefly, equality in Clojure is defined by equality of value, which is facilitated by pervasive immutability. Equality in the presence of mutability has no meaning.
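A rough Python rendering of the idea follows. It is only a sketch under my own assumptions (it is not Clojure's implementation, and the `ATOMS` tuple and the choice of `tuple` as the sole immutable container are simplifications): immutable values are compared by content, while mutable objects are equal only to themselves.

```python
# Types treated as immutable atoms in this sketch (an assumption, not a full list).
ATOMS = (int, float, complex, str, bytes, type(None))


def egal(a, b) -> bool:
    """Baker-style equality: by value for immutable data, by identity otherwise."""
    if a is b:
        return True
    if type(a) is not type(b):
        return False
    if isinstance(a, ATOMS):
        return a == b
    if isinstance(a, tuple):
        # An immutable container is egal when all of its components are egal.
        return len(a) == len(b) and all(egal(x, y) for x, y in zip(a, b))
    # Mutable (or unknown) objects are egal only to themselves.
    return False
```

Two separate lists holding `[1]` are not egal, because appending to one would make them differ; an answer of "equal" could be invalidated a moment later. Two tuples holding `(1, "a")` are egal forever, since nothing can change them.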
Organizing Programs Without Classes (link to paper)
by David Ungar, Craig Chambers, Bay-wei Chang, and Urs Hölzle
The greatest crime perpetrated in the name of JavaScript is the propensity of every framework, library, and trifle to use the prototypal inheritance capabilities of the language to implement class-based inheritance. I propose that this behavior stunts the power of JavaScript. However, the class-based mentality is pervasive, and is only likely to grow stronger as JavaScript moves toward “modernized” data-modeling techniques. Having said that, I love the prototypal model. Its flexibility and simplicity are astounding, and this paper⁴ will show how it can be leveraged for practical purposes. While it is a design-oriented paper, I think that the knowledge is contrary enough to pop-programming to warrant inclusion. Self is a fascinating language on its own merit, but especially in that its influence⁵ on modern dynamic languages is growing ever more pervasive.
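To show what class-free organization can look like, here is a toy model of prototype delegation. The `Proto` class and its `clone` and `send` methods are hypothetical names for this sketch; it imitates the general shape of Self-style objects (slots plus a parent to delegate to) and is not taken from the paper.

```python
class Proto:
    """A prototype-style object: its own slots plus an optional parent."""

    def __init__(self, parent=None, **slots):
        self._parent = parent
        self.__dict__.update(slots)

    def __getattr__(self, name):
        # Called only when the slot is missing locally: delegate to the parent.
        parent = self.__dict__.get("_parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)

    def clone(self, **overrides):
        """Make a new object that delegates to this one for everything else."""
        return Proto(parent=self, **overrides)

    def send(self, name, *args):
        """Look up a slot; if it holds a function, run it with the receiver."""
        slot = getattr(self, name)
        return slot(self, *args) if callable(slot) else slot


point = Proto(x=0, y=0, describe=lambda self: f"({self.x}, {self.y})")
p = point.clone(x=3)  # overrides x, shares y and describe with point
```

Here `p.send("describe")` finds `describe` on `point` but runs it against `p`, so shared behavior lives in an ordinary object rather than in a class. Changing `point.y` later is immediately visible through `p`, because lookup is delegated rather than copied.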
I’ve Seen the Future, Brother: It is Murder
Dynamo: Amazon’s Highly Available Key-value Store⁶ (link to paper)
by Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels
It’s rare for a paper describing a system in active production to influence the state of research in any industry, and especially so in computing. Papers describing thought-stuff are pure and elegant while “real-world” systems tend to be ugly, hackish, and brutish, even if they are rock-solid otherwise. The case of Dynamo is quite different. That is, the system itself is based on simple principles and solves a hard problem, highly available and fault-tolerant online database storage, in an elegant way. Dynamo was not a new idea, but this paper is necessary reading as we move forward into the age of Big Data.
Out of the Tar Pit (link to paper)
by Ben Moseley and Peter Marks
Now we reach my favorite paper of the bunch, one that I try to read and absorb every 6 months (give or take). The gist is that the primary source of complexity in our programs is mutable state. With that as the premise, the authors build the idea of “functional relational programming”, which espouses minimizing mutable state, moving whatever remains into relations, and then manipulating said relations using a declarative programming language. Simple, right? Well, yes, it is simple; and that’s what makes it so difficult.
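A very loose sketch of that recipe follows. It uses immutable sets of tuples as relations and a comprehension as the declarative query; the `Employee` and `Dept` relations and the `floor_of` and `hire` functions are hypothetical examples of mine, not notation from the paper.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Employee(NamedTuple):
    name: str
    dept: str


class Dept(NamedTuple):
    dept: str
    floor: int


def floor_of(employees: FrozenSet[Employee],
             depts: FrozenSet[Dept]) -> FrozenSet[Tuple[str, int]]:
    """Derived relation: a join on dept, stated declaratively, never stored."""
    return frozenset(
        (e.name, d.floor) for e in employees for d in depts if e.dept == d.dept
    )


def hire(employees: FrozenSet[Employee], name: str, dept: str) -> FrozenSet[Employee]:
    """A state change produces a new relation value; the old one is untouched."""
    return employees | {Employee(name, dept)}
```

The only essential state is the two base relations. Everything else, such as who sits on which floor, is recomputed from them on demand, so there is no cached copy that can drift out of sync.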
This list should be a good start, but where to go next? My personal approach is summarized simply as: follow the bibliographies. If you like any of these papers then look at their bibliographies for other papers that sound interesting and read those too. Likewise, you can use services like Citeseer and the ACM Digital Library to backtrace citations.
Happy reading.
:F
1. Apart from a spectacular ear for music, Mr. Feathers is also a fount of wisdom, including this gem from the linked post:
When I first started writing, one of the pieces of advice that I heard was that you should always imagine that you are writing to a particular person.
2. Design is a vastly overloaded term, but it’s the best word that I can think of. Suggestions for something better? ↩
3. I have some ideas for a Lisp-centric essential papers list also, but have not yet formalized the content. ↩
4. It was difficult picking a paper from the treasure-trove that is the comprehensive list of Self publications. These papers represent the vanguard of performance in dynamic languages. ↩
5. Although Self does not hold a monopoly on dynamic performance revolutions. Smalltalk implementations have also driven innovation in said space, and a taste for this influence is found in Efficient Implementation of the Smalltalk-80 System by Peter Deutsch and The Design and Evaluation of a High Performance Smalltalk System by David Ungar. ↩
6. The Dynamo paper is probably the most controversial choice for this list, so if it bothers you then perhaps Software Transactional Memory by Nir Shavit and Dan Touitou would suffice as an alternative. I was bouncing back and forth between these choices and only settled on Dynamo in the spirit of controversy. ;-) ↩
38 Comments
Bootvis
The Cohen references make this list even better ;)
Sep 9th, 2011
Chris
Please please change your iPad theme. It’s horribly unusable. Slows the device to a crawl and breaks navigation.
Sep 9th, 2011
dml
Perhaps the first time I’ve ever seen Leonard Cohen invoked in a computer science context.
Sep 9th, 2011
Jon
As a programmer and devoted Leonard Cohen fan, I approve of this article!
Sep 9th, 2011
Joseph
Thanks for finding good free links for these!
Sep 9th, 2011
Karl
I believe that this occurred primarily because Object.prototype.clone in the Self/Io sense wasn’t in the language cloned by MS in IE. Object.create could have worked if it was usable in 2005 but as part of v5 it’s too little too late.
Sep 9th, 2011
Elijah Chancey
The Cohen references made me chuckle. If you’re a fan, check out San Francisco’s Conspiracy of Beards, an LC a cappella group.
Sep 9th, 2011
Luis
I can’t help but notice your Leonard Cohen citation. Good article, and I agree that every programmer should at least be aware of those topics.
Sep 9th, 2011
fogus
@Karl
Thanks for the info. I look forward to exploring that further.
Sep 9th, 2011
wunki
Thank you for a great list! For those interested, I have created an archive containing all the papers above as PDF. You can find it here:
http://c.wunki.org/A2s0
Sep 10th, 2011
Rahul Pai
Good …. keep posting such stuffs
Sep 10th, 2011
Mike Schinkel
Nice list. But 10? Me counts 11…
Sep 10th, 2011
Pencilpenpen
Wunki, thanks for putting them all together!
Sep 10th, 2011
Su-Shee
I suggest adding the original “Model View Controller” paper to your list (considering how often it gets thrown around in all kinds of web-related articles..)
http://st-www.cs.illinois.edu/users/smarch/st-docs/mvc.html (updated version)
Sep 11th, 2011
fogus
@Su-Shee
Thank you for the link. It’s been a very long time since I’ve read that paper, so I can’t wait to read the updated version.
Sep 12th, 2011
Den
Best compilation of non-practical papers that pragmatic programmers should avoid!
Sep 16th, 2011
fogus
@Den
That’s the spirit — you tell me!
Sep 16th, 2011
SI Hayakawa
Re: “It’s flexibility and simplicity is astounding, and this paper will show how it can be leveraged for practical purposes.”
Should be “Its flexibility and simplicity are astounding…” — “Its” (possessive) and “are” (plural).
Oct 20th, 2011
mtraven
Re Lisp-based papers, the Lambda the Ultimate series is pretty important and well worth reading.
Also Richard Gabriel’s Worse is Better paper, and Alan Kay’s History of Smalltalk (both maybe more historical than belongs on the list, but well worth reading anyway).
Dec 22nd, 2011
Darren Grant
The link to Time, Clocks, and the Ordering of Events in a Distributed System should be http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf
Dec 22nd, 2011
Bob Foster
Nonsense. A programmer should read everything Robert Tarjan ever published and disregard the rest.
Dec 23rd, 2011
Scott Hunter
Gries has articles, as well as an entire book, on the Hoare work. I’ve always found the idea of loop invariants to be quite powerful and useful (if only to make explicit the idea that SOMETHING about a loop has to be unchanging), and the way this technique can be used to tease out algorithmic efficiencies always makes for a good demo.
Dec 27th, 2011
cristian
I never would have thought Cohen references would find their way in a programming article. As an avid fan I am envious I didn’t think of this first! I shall try to make all of my code comments reference Cohen’s work, because code is poetry, right?
Aug 30th, 2012
John
Two other good papers from Google research:
The Chubby Lock Service for Loosely-Coupled Distributed Systems http://research.google.com/archive/chubby.html
Bigtable: A Distributed Storage System for Structured Data http://research.google.com/archive/bigtable.html
Sep 18th, 2012
Alia Henderson
Although one programmer has the necessary skills and knowledge to work competently on a problem or even create a program, he or she can only do so much. Creating the source code for an operating system, for example, will require thousands of manhours from a single programmer and most probably, he or she will only be halfway through. There just isn’t enough time for one or even two programmers to work effectively to produce a usable program.
May 3rd, 2013
model aircraft
Having read this I thought it was extremely enlightening.
I appreciate you finding the time and energy to put this information together. I once again find myself personally spending way too much time both reading and leaving comments. But so what, it was still worth it!
Jun 20th, 2013
matthias
I recommend you replace Cardelli-Wegner with Cardelli’s tech report on Typeful Programming. Nobody reads CW anymore in the types/PL community.
Jun 21st, 2013
Christopher D. Walborn
It looks like your link to Michael Feathers’s list needs to be updated. I found this: http://michaelfeathers.tumblr.com/post/81489281/10-papers-every-programmer-should-read-at-least-twice
Aug 30th, 2013
Don
Nice list. I have some reading to do now. A couple of surprises here!
I totally expected Goldberg’s “What Every Computer Scientist Should Know About Floating-Point Arithmetic”. Or is that too cliche? :-)
And the Dynamo one looks OK, though I was a little surprised it wasn’t the paper about HP’s Dynamo project, which was more innovative. But maybe that’s just because I think virtual machines are more interesting than key-value stores.
Dec 27th, 2013
Keith Fullerton
In my 35+ years of real time programming, I found that recursion was fun to work with but better relegated to students and hobbyists. Due to its complicated structure, it is not a good choice for production code. In fact, I found that in general, simpler code worked faster and better (fewer bugs) and was far easier to maintain.
Feb 19th, 2014
fogus
… and working Erlang programmers.
Feb 20th, 2014
fogus
@KeithFullerton
I’m not sure where recursion fits into this post, but that’s OK. I’ll just say that I disagree.
BTW, I love your ambient music albums!
Feb 20th, 2014
mcwumbly
Link to the 2nd paper is broken because it used to have a typo. New link is here:
https://github.com/papers-we-love/papers-we-love/blob/master/functional_programming/why-functional-programming-matters.pdf
Dec 21st, 2014
In-Ho
I would add Delegation is Inheritance
Dec 29th, 2014
Hesham Shokry
I think you missed all of David Parnas’ work. He is the one who proposed the Information Hiding principle!
Dec 26th, 2016
John
Thanks for this list.
“Organizing Programs Without Classes” is now a broken link. It can be found here instead http://bibliography.selflanguage.org/_static/organizing-programs.pdf
Jan 7th, 2018
fogus
@John
Thanks. Updated.
Jan 15th, 2018
Kirti Gupta
Most of the links provided above are broken. Can someone look into this and provide the updated links?
Thanks!
Jan 3rd, 2020