I’ve heard a few people say that the most productive programmers are about 100x more productive than the least, but that didn’t seem to ring true to me. The only times that I’ve seen someone taking five months to produce as much as someone else could do in one day, it was a management problem. The low-productivity person was either goofing off or (more commonly) they were working on the wrong thing.
On Joel on Software, Joel reports (using data from Professor Stanley Eisenstat at Yale) that he sees about a 3x or 4x difference in the time students put into an assignment (with no effect on grades). However, and this is a big “however,” Prof. Eisenstat depends upon self-reporting from the students, and self-reporting is notoriously inaccurate.
I have found it surprisingly hard to find good academic literature that measures productivity. I finally found a 1966 study by Sackman that showed a 28x difference. That number apparently got widely circulated, despite some sloppiness in its methods (as criticized by Dickey in 1981). A response to the concern by Curtis (1981) gave data that showed min:max ratios of about 8x and 13x for two problems, but only if you tossed out one outlier who got stuck and didn’t finish that problem, timing out instead. Keeping the outlier, the 13x changed to 22x.
The outlier made me realize that the difference between the “best” programmer and the “worst” programmer is the wrong measure. You can make that number arbitrarily large by finding the right person to compare against. For example, I will be infinitely more productive than an infant. I will be infinitely more productive than someone who sleeps at their desk for the entire workday, every day.
I would be far more interested in the standard deviation, or perhaps the productivity difference between the median and the top performer. What is the performance hit if I hire people within two standard deviations of the mean instead of trying for the ones who are better than two sigma above the mean? (This should be looked at in conjunction with the cost hit of hiring the absolute best.)
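To see why the choice of measure matters, here is a minimal sketch in Python. The completion times are made up for illustration (they are not Curtis’ or Sackman’s data), and spread_ratios is a hypothetical helper name. It compares the slowest:fastest ratio with the median:fastest ratio, with and without one person who got stuck.

```python
from statistics import median

def spread_ratios(minutes):
    """Return (slowest/fastest, median/fastest) for a list of completion times."""
    fastest = min(minutes)
    return max(minutes) / fastest, median(minutes) / fastest

# Hypothetical completion times in minutes; NOT taken from any study.
times = [10, 15, 20, 25, 30, 40, 50]
with_outlier = times + [300]  # add one person who got stuck

print(spread_ratios(times))         # (5.0, 2.5)
print(spread_ratios(with_outlier))  # (30.0, 2.75)
```

In this toy data set, one stuck person multiplies the slowest:fastest ratio by six, while the median:fastest ratio barely moves.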
This graph shows the time that it took to complete some tasks, and the number of people who took that long. (One person took between 65 and 70 minutes to complete Task 2, for example.)
Using Curtis’ data, I get that the median coder takes 2x or 3x the time that the top coder does for each of Curtis’ two tasks. (It’s not exact because the data was bucketed into five-minute intervals.) Interestingly, this is the same order of magnitude that Prof. Eisenstat found.
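The bucketing is why I can only give a range. A rough sketch of that calculation, assuming a histogram keyed by the start of each five-minute bucket: the counts below are invented for illustration and are not Curtis’ numbers, and the function names are hypothetical.

```python
def bucket_of(buckets, rank):
    """Start of the bucket holding the person at 0-based rank (fastest first)."""
    seen = 0
    for start in sorted(buckets):
        seen += buckets[start]
        if rank < seen:
            return start
    raise IndexError("rank out of range")

def median_to_best_bounds(buckets, width=5):
    """Lower and upper bounds on median time / best time for bucketed data."""
    n = sum(buckets.values())
    best = bucket_of(buckets, 0)
    lower_middle = bucket_of(buckets, (n - 1) // 2)
    upper_middle = bucket_of(buckets, n // 2)
    # Smallest ratio: median at the bottom of its bucket, best at the top of its bucket.
    # Largest ratio: median at the top of its bucket, best at the bottom of its bucket.
    return lower_middle / (best + width), (upper_middle + width) / best

# Hypothetical histogram: {bucket start in minutes: number of people}.
histogram = {10: 2, 15: 3, 20: 4, 25: 5, 30: 3, 35: 2, 65: 1}
low, high = median_to_best_bounds(histogram)
print(f"median coder takes between {low:.2f}x and {high:.2f}x the best time")
```

With these made-up counts the bounds come out to roughly 1.67x and 3.00x, which shows how much slack five-minute buckets leave in the estimate.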
Note, however, that each of Curtis’ measurements covers a single task. Because of regression to the mean, the variation is likely to decrease if you look at a lot of tasks. (Curtis even mentions that the outlier on Task #2 did fine on other tasks.)
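A small simulation can illustrate the effect. This is only a sketch under assumptions of my own choosing: each person has a stable skill factor, each task adds independent luck, and both are lognormal with arbitrary spreads. None of this comes from Curtis’ paper.

```python
import random

def slowest_to_fastest(num_tasks, num_people=30, seed=0):
    """Ratio of the slowest to the fastest person's average time over num_tasks."""
    rng = random.Random(seed)
    # Assumed model: stable per-person skill factor times task-to-task luck.
    skills = [rng.lognormvariate(0, 0.2) for _ in range(num_people)]
    averages = []
    for skill in skills:
        times = [skill * rng.lognormvariate(0, 0.6) for _ in range(num_tasks)]
        averages.append(sum(times) / num_tasks)
    return max(averages) / min(averages)

single = slowest_to_fastest(num_tasks=1)
many = slowest_to_fastest(num_tasks=50)
print(f"1 task: {single:.1f}x   50 tasks: {many:.1f}x")
```

Under this model, the spread measured on one task mixes skill with luck. Averaging over many tasks washes out most of the luck, so what remains is closer to the underlying difference in skill.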
The shapes of the curves are interesting: they are bunched up on the left with a long tail to the right (right-skewed, in statistical terms), which makes sense: there are limits to how much faster you can get, but not to how much slower you can get. They are also bimodal. This is more intriguing and mysterious, especially since Dehnadi and Bornat found that grades in intro CS classes are bimodal.
While this data comes from 1981, it’s the best I could find so far. (Please tell me if you have better data!) It’s distressing that we don’t have lots of good data on this — it seems like such fundamental data! However, it is really hard to do good studies like this. Ideally, you’d want a large number of programmers to all work on the same large number of complex problems. Unfortunately, it’s very difficult to get a large number of programmers to spend a large number of hours on non-useful problems.
The big message that I take from this data for my personal use is: don’t get stuck. That’s easier said than done, I know. But look at it this way: if I learn new tricks for squeezing out incremental gains (like memorizing all the keyboard shortcuts), that isn’t going to help me nearly as much as not getting stuck. Keyboard shortcuts might take me from 10 minutes to 9 minutes, but getting stuck can easily take me from 10 minutes to 100 minutes.
Now all I have to do is figure out how to not get stuck.