02.15.07
Review: Robillard's How Effective Developers Investigate Source Code: An Exploratory Study
(See my previous programmer productivity article for some context.)
Martin Robillard did a study in conjunction with my advisor. In it, he had five programmers work on a relatively complex task for two hours. Two of the programmers finished in a little over an hour, one finished in 114 minutes, and two did not finish in two hours.
Robillard looked closely at five subtasks that made up the main task, and there was a very sharp distinction between the three who finished and the two who did not: the two who didn’t finish got only one of the subtasks “right”. In the table below, S for “Success” means everything worked; I for “Inelegant” means it worked but was kind of kludgy; B for “Buggy” means that there were cases where it didn’t work; U for “Unworkable” means that it usually didn’t work; NA for “Not Attempted” means they didn’t even try to do that subtask.
Coder ID | Time to finish | Check box | Disabling | Deletion | Recovery | State reset | Years of experience |
---|---|---|---|---|---|---|---|
#3 | 72 min | S! | S! | S! | S! | S! | 5 |
#2 | 62 min | S! | S! | S! | S! | B | 3 |
#4 | 114 min | I | S! | B | S! | B | 5 |
#1 | 125 min (timed out) | S! | U | U | U | NA | 1 |
#5 | 120 min (timed out) | S! | U | U | NA | NA | 1 |
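Because coders #1 and #5 timed out, their true completion times are unknown, so I can’t conclude much from this data about the range of time taken. Even with this small sample, it does look like experience matters: both coders who timed out had one year of experience, while the three who finished had three to five.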
This study did have some interesting observations:
- Everyone had to spend an hour looking at the code before they started making changes. Some spent this exploration period writing down what they were going to change, then followed that script during the coding phase. The ones who did were more successful than the ones who didn’t.
- The more successful coders (#2 and #3) spread their changes around as appropriate. The others tried to make all of the changes in one place.
- The more successful coders looked at more methods, and they were more directed about which ones they looked at. The ratio column in the table below is the number of methods that they reached via cross-references and keyword searching, divided by the total number of methods that they looked at. The less-successful coders found their methods more frequently by scrolling, browsing, and returning to an already-open window.
Coder ID | Number of methods examined | intent-driven:total ratio | Time to finish |
---|---|---|---|
#3 | 34 | 31.7% | 72 min |
#2 | 27.5 | 23.3% | 62 min |
#4 | 27.5 | 30.8% | 114 min |
#1 | 8.5 | 2.0% | 125 min (timed out) |
#5 | 17.5 | 10.7% | 120 min (timed out) |
- From limited data, they conclude that skimming the source isn’t very useful: if you don’t know what you are looking for, you won’t notice it when it passes your eyeballs.
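
For anyone who wants to compute a similar intent-driven:total ratio from their own navigation logs, here is a minimal Python sketch. It is only an illustration and not Robillard's tooling: the function name, the category labels, and the sample log are all hypothetical, and the study's actual counting rules may well differ.

```python
# Hypothetical labels for how a method was reached. The first two count as
# "intent-driven" in the sense used above (cross-references and keyword search).
INTENT_DRIVEN = {"cross_reference", "keyword_search"}


def intent_driven_ratio(events):
    """Return the fraction of examined methods reached by intent-driven navigation.

    `events` is a list with one label per method examined, e.g.
    "cross_reference", "keyword_search", "scroll", "browse", "open_window".
    """
    if not events:
        return 0.0  # no methods examined, so no meaningful ratio
    intent = sum(1 for kind in events if kind in INTENT_DRIVEN)
    return intent / len(events)


# Made-up sample log, not data from the study.
sample_log = ["cross_reference", "scroll", "keyword_search", "browse"]
print(f"{intent_driven_ratio(sample_log):.1%}")  # prints 50.0%
```

A low number from a log like this would correspond to the scrolling-and-browsing pattern described above for coders #1 and #5.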