07.21.08
Posted in programmer productivity, Technology trends at 5:09 pm by ducky
I did a very quick, informal survey on how people use tabs when looking at Web search results. Some people immediately open all the search results that look interesting in new tabs, then explore them one by one (“open-parallel). Others open one result in a new tab, explore it, go back to the search page, then open the second result in another tab, etc. (“open-sequentially”). Note that the “open-sequential” people can have lots of tabs open at a time, they just open them one by one.
To clarify, open-parallel means control-clicking on the URL for result #1, then on result #2, then on #3, then on #4, and only THEN and switching to the tab for #1, examining it, switching to the tab for #2, etc. Open-sequential means control-clicking on the URL for result #1, switching to the tab for #1, examining #1, switching to the search results page, control-clicking on #2, switching to the tab for #2, examining #2, switching to the search results page, etc.
I was surprised to find that the people who had been in the US in the early 2000’s were far more likely to use the open-parallel strategy. There was an even stronger correlation with geekdom: all of the geeks used the open-parallel, and only two of the non-geeks did.
Open-parallel |
Open-sequential |
Citizenship |
Where in early 00’s? |
Geek? |
Citizenship |
Where in early 00’s? |
Geek? |
US |
Working/studying in US |
Y |
Canada |
Working/studying in Canada |
N |
US |
Working in US |
Y |
US |
Studying in US |
N |
US/Canada |
Studying in US |
Y |
Canada |
Studying in Canada |
N |
Canada |
Studying/working in US |
Y |
Canada |
Studying in US |
N |
Australia |
Working in Australia(?) |
Y |
Netherlands |
Working in Europe(?) |
N |
US/Canada |
Working in US |
Y |
Canada |
University in Canada |
N |
Canada |
Working in Canada |
sort-of |
Canada |
University in Canada |
N |
US |
University in US |
N |
India |
University in US |
N |
Notes on the survey:
- The subject pool is not representative of the general propulation: everyone who answered lives or lived at my former dorm at UBC, has a bachelor’s degree, and all but one have an advanced degree or are working on one.
- I classified people as geeks if they had had Linux on at least one of their computers and/or had worked in the IT industry. The person with the “sort-of” in the geek column doesn’t qualify on any of those counts, but was a minor Internet celebrity in the mid 90s.
What does this mean?
What does this mean? I’m not sure, but I have a few ideas:
- I suspect that the geeks are more likely to have used a browser with modern tabbing behaviour much earlier, so have had more years to adapt their strategies. (Internet Explorer got modern tabbing behaviour in 2006; Mozilla/Firefox got it in 2001.)
- One of the benefits of the open-parallel strategy is that pages can load in the background. Maybe in 2001, Web access was slower enough that this was important and relevant. Maybe it’s not that the geeks have been using tabs longer, but that they started using tabs when the Internet was slow.
- It might be that the geeks do more Web surfing than the non-geeks, so have spent more time refining their Internet behaviour.
Permalink
05.14.08
Posted in Eclipse, programmer productivity at 12:51 pm by ducky
Andrew Ko, whose papers I have mentioned before, observed in An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks that
When investigating [navigations back to previously-seen code fragments], nearly all seemed to be for the purpose of juxtaposing a set of code fragments while editing many dependent fragments. … Although Eclipse supports viewing multiple files side by side, placing any more than two files side by side would have incurred the interactive overhead of horizontal scrolling within each of the views.
Meanwhile, from my user study, I believe that the Package Explorer is usually not worth its screen real estate and the Outline View is probably harmful (which I will explain below). If I close the Package Explorer and the Outline View, on my 24″ monitor, I have oodles of room for two source files side-by-side. I think that is probably the way to go.
Package Explorer
Developers used the Package Explorer to find classes that had interesting strings in them. For example, in the Arrow task, they would do a “visual grep” to find classes that had the word “Arrow” or “Line” in them. Generally, however, they would explore the first class they found with “Arrow” or “Line” in it. While that happened to be okay in the Arrow task, it is easy to imagine situations where the first class could be uninteresting or even deceptive. I have come to the opinion that it is better in most cases to use Control-Shift-T to open the “Open Type” dialog, where you can enter in the string you are interested in. (Note that you can use wildcards, e.g. “*arrow” to find all Types that have the text “arrow” anywhere in their names.)
If you use the Open Type dialog, you will immediately (because it immediately displays matching type names as you type) have a very good idea of which / how many classes have that text in them. You are not going to miss one, and you are less likely to get distracted by the first one you see.
Now, I saw a very small number of cases where the Package Explorer was useful. For example, in the Output task, one guy looked at package names and guessed (successfully) that “proguard.io” might have something to do with IO. So I don’t think you should do away with the Package Explorer, but minimize it most of the time.
Outline View
As for the Outline View, you can search for strings in the method/field names using Control-O. Just like with the Open Type dialog, you can search for strings (including wildcards). Even better, with Control-O,Control-O, you can see what methods are inherited.
I don’t think I ever saw anyone wandering aimlessly (i.e. not doing a visual grep for a specific string) around the Outline View and having any good come of it. So I don’t think the Outline View is worth the screen real estate.
Permalink
05.06.08
Posted in Eclipse, programmer productivity at 4:06 pm by ducky
In previous posts, I talked about wanting a code-coverage tool to use while debugging, and how EclEmma did most of what I wanted. After my user study, where I dissected in detail how seven professional programmers did four twenty-minute programming tasks, I am even more convinced that a wonder tool would be highly, highly useful to people. I want the wonder tool to colour and present source based on whether code was executed between a user-defined start and stop point.
- As I mentioned in the first post, I’d like to be able to start and stop collecting code coverage information. It would be nice if I could set START LOGGING and STOP LOGGING breakpoints, so that lines of source would be coloured based on whether they were executed inside the “active” regions. Being able to do this would mean that I could greatly narrow down what code was involved in a particular feature, and not have all the code touched by initialization routines touched.
- I’d like the number of times that a method gets executed to influence the Mylyn degree of interest. I’d like Mylyn to restrict what classes/methods Eclipse showed me (initially) to only those which were executed between the START LOGGING and STOP LOGGING breakpoints.
For example, I could bring up a GUI application, hit the Suspend button, set a START LOGGING breakpoint, hit resume, and resize a window, and hit the STOP LOGGING button. The PackageExplorer would then only show me the methods involved in event handling and resizing a window. Most of the event-handling code would probably be done by .class files, so it probably would only be the code related to event handling.
I believe that this would have helped my user study subjects enormously. They overwhelmingly used static tracing (i.e. jumping to references or declarations of selected methods) to find where things happened in the code, but they frequently got caught by differences between what static tracing told them and what actually happened. Examples:
- All seven subjects searched for an unusual string related to the given task in a GUI, which seemed perfectly reasonable. However, that string appeared in a superclass of the object that was actually instantiated. Jumping to declarations (via F4 or hyperlinks) took them preferentially to methods in the superclass. Four of the seven subjects never noticed that there were interesting subclasses. If, however, they had been using my wonder tool, the first time they jumped to a method that the subclass overrode, they would have seen from the source code colouring that that method never got hit.
- In one task, important code was not broken, but flat-out missing. Subjects usually found the right method pretty quickly, but then assumed that — because that method didn’t have the needed code — that it didn’t get executed. “Oh, here’s where the start-of-line decoration code is, now I just have to figure out where the end-of-line decoration code is.” Had they seen that that method got executed, they might have realized faster that the end-of-line decoration code was simply missing.
- In another task related to incorrect output, to statically trace through to bug’s location meant going through six methods whose names and comments made it look like they were related only to input. Setting a START LOGGING breakpoint where it wrote “Writing output…” to stdout and a STOP LOGGING breakpoint at the next stdout status message, then the subjects would have been able to see that the “input” methods were involved in writing the output, so they might have been willing to push through them.
- One task that asked the subjects to do something when a pane got resized was so difficult that nobody solved it. In addition to problems with language (Is that pane a Window, a View, a Drawing, a Pane, or a Panel?), it can be tricky to get from the place where a GUI widget gets set up to where the event gets handled. (In this case, they needed to find what methods were executed when a pane was resized.)
With my wonder tool, subjects would be able to get a list of all the classes and methods that were involved in the resize. By browsing that list, they would have a better shot at finding a method that they could use.
I think it wouldn’t be too difficult to implement this — EclEmma has already done the lion’s share of the work — but I doubt it’s compatible with finishing my thesis. 🙁
Permalink
03.28.08
Posted in programmer productivity at 9:35 am by ducky
I finally got my hands on the dead-trees (i.e. uncorrupted) version of the Klahr/Dunbar article that I posted about earlier, and it didn’t say anywhere how long people in the hypothesizing group spent on coming up with hypotheses. However, I was able to track down David Klahr, and emailed to ask him how long they hypothesizing group spent hypothesizing. He graciously and quickly replied that it was only a few minutes. So if we make a wild guess that “a few” works out to an average of about four minutes, then the hypothesizing group took an average of about 10.2 minutes, while the non-hypothesizing group took an average of 19.4 minutes — so the hypothesizing group is still twice as fast as the non-hypothesizing group.
Note that the experimental conditions between the two groups were not exactly the same: the non-hypothesizing (slow) group all had at least one programming course, and pushed the buttons on the toy’s controller themselves. In the hypothesizing (fast) group, only half of them had prior programming experience. If those make a difference, I would have guessed that both of those differences would make the hypothesizing group slower than the non-hypothesizing group, but instead that group was twice as fast.
Wow.
Permalink
03.12.08
Posted in Eclipse, programmer productivity, robobait at 10:38 pm by ducky
Cool!
I am doing a observational study of people using Eclipse, and I saw one of the users doing something that looked really useful.
If you hover over a Java element (type, method, or field), then you see the Javadocs for that element. Yeah, old news. You do that seventy times per day already.
If you shift-hover over a Java element, then it shows you the method’s source intead of the Javadocs!
This is a great way to get a peek down a navigation path before commiting to it.
Permalink
02.11.08
Posted in programmer productivity at 11:46 pm by ducky
In this post, I introduced the idea of writing down three hypotheses for what a bug might be. In this recent post, I theorized that writing them down might reduce confirmation bias. This was all just conjecture, however.
I just discovered some confirmation in Klahr, D. and K. Dunbar (1988). Dual space search during scientific reasoning. Cognitive Science 12, 1-55. They did one experiment where they gave people a programmable toy and asked them to figure out the semantics of one of the programming keys. They then did a second experiment where they asked a new set of subjects to come up with as many hypotheses as possible for what the key did, and then gave them the toy and asked them to figure out the semantics of that key. The subjects were much better/faster at figuring it out after verbalizing a bunch of hypotheses than without verbalizing. It took the first set more than three times as long to figure it out than the second set.
Now, I don’t think this paper says how long people spent on coming up with hypotheses — all the PDFs I found of this paper were missing pages — but it would seem odd if it was more than the time saved and they wouldn’t mention it.
Permalink
02.08.08
Posted in programmer productivity, robobait at 12:34 pm by ducky
Frequently, I use the robobait tag to say, “this is something that I know right now that I want to be able to find again in the future”. This robobait tag, by contrast means, “This is something that I think more people should know about. Yo, search engines! Give this page a higher ranking!”
Evan Robinson wrote an article about output as a function of hours per week that draws upon historical research. It says that working a lot is a bad idea.
Permalink
02.06.08
Posted in programmer productivity at 9:27 pm by ducky
This is a somewhat expanded version the talk I’m giving to the Vancouver Software Developers’ Network tomorrow. I don’t think there’s is much that I haven’t said in previous blog postings, but I wanted to gather it all in one place.
“What is the variability between programmers?” is question I was curious about when I started my MS CS at UBC. In Silicon Valley, I’d heard a rule of thumb that there was a 100-to-1 difference in programmer productivity. My husband had heard ten-to-one; Joel on Software quotes one of his old profs to suggest that it is ten-to-one or larger.
Here is a drawing (deliberately crude, so nobody would think it represents actual data) of what I thought the histogram would look like of the number of programmers that finish a task in a certain time on the y-axis, and the time it takes on the x-axis.

The first problem here is that this just measures time, not quality. It is presumably faster to write lousy code than to write code that is clean, easy to read, easy to maintain, etc. That is a legitimate issue, but unfortunately my reading of peer-reviewed papers has not convinced me that anybody really knows how to measure quality.
There are some people who have measured coding speed, however. I have reported previously on experiments by Demarco and Lister, Dickey, Sachman, Curtis, and Ko which measure the time for a number of programmers to do a task. What I found is that the time histogram curve actually looks like this:

Observations:
- The worst programmer isn’t 100 or 10 times slower than the best, the worst programmer — found at the end of a very long tail — is infinitely worse. If you think about it, I am going to be infinitely faser than a walrus. It is hard to program with those flippers. (I actually worked on a project with a contractor who, after a year, was let go because despite repeated requests, he had not checked in a single line of his code. I think that counts as infinitely slow.)
- The median programmer is about two to four times slower than the fastest on single tasks. (Because of regression to the mean, this advantage should get smaller with many tasks.)
- The curve is wickedly shifted to the left. This makes sense: there isn’t much you can do to get faster, but a LOT of things you can do to get slower. (Not ever check in your code, for example.)
What implications does this curve have?
- Don’t spend a lot of effort to hiring the absolute best; spend lots of effort to avoid hiring losers.
- Don’t spend a lot of effort to learning how to type faster; spend lots of effort to figure out how to avoid getting stuck.
“Don’t get stuck” is easier said than done, of course, but there are things you can do.
- When you have a question — e.g. “Why is Foo set to 3 instead of 5?” — write down three hypotheses for what the answer might be. This can help you avoid confirmation bias. I came up with this idea after reading that breadth-first-ish approaches to problems are more successful than more depth-first-ish searches. I don’t have any research on it, but writing three hypotheses helps me a lot.
- Explain your question/problem to someone. It doesn’t even need to be someone who knows anything about coding. While there is little academic research on verbalizing, there are lots of anecdotes on it being helpful to verbalize. Anecdotes say that “rubber ducking” is useful, and that has been my experience as well. Verbalizing might also be part of why I find writing down three hypotheses so useful.
- Ask for help! Someone familiar with the particular area that you are investigating might be harder to find than a rubber duck, but sometimes can be more useful.
- Note to managers: your new hires will probably feel uncomfortable interrupting you to ask for help. Instead of making them come interrupt you, go to them. Once or twice per day, stop by their office and spend some time with them. Ask to see what they are doing; pair program with them for a little while. When they are new, you are more likely to catch them being stuck than not being stuck, so you can proactively un-stick them. Even if they are not stuck, you can still probably give good pointers on tools and techniques.
- Use tools to help you find the answers to your questions! There are all kinds of great tools available now that can help you answer questions.
- Omniscient debuggers: Debuggers like odb and undoDB keep track of every variable’s state change and then let you trace backwards to where that variable changed state. (Note: Cisco also made an Eclipse plugin for omniscient debugging in C++, but for internal use only.)
- Many code coverage tool will also color lines based on whether they were executed or not. This is a cheap way to see which execution paths were taken! Examples include Visual Studio, the Intel C++ Code Coverage Tool, and the Eclipse plug-in EclEmma.
- One of the questions that I frequently ask is, “How do I get information from class Foo over to class Bar?” Prospector and Strathcona can help with that. Strathcona looks for examples of existing code in your code base that gets you from Foo to Bar; Prospector looks for existing code, and also traverses the tree of classes that can be reached from a given class to answer that question.
- Use tools to keep you from having to get stuck in the first place.
- Findbugs looks for code that “looks funny” and which is likely to have errors in it.
- JML allows the user to specify all kinds of “contracts” about how a method will work — preconditions, post-conditions, invariants, etc — in a very rich way. If anybody breaks those contracts (e.g. by passing illegal arguments), it gets flagged. It sounds like it would be really tedious to generate all those promises, but the tool Daikon can help. Daikon can generate promises based on actual run data; if something changes to violate the promises, it will flag it. (The contracts also work as extra documentation.)
Permalink
02.01.08
Posted in programmer productivity at 11:31 am by ducky
Whoa dude!
I was looking for how to set a breakpoint on a field in Eclipse, i.e. any time that field gets set, break into the debugger. I stumbled into something else.
In the Variables pane of the debugger, if you right-click on a variable, one of the (many) options is All Instances; another is All References. They appear to show a list of all instances of a particular class and all the objects that refer to that instance. Oh wow! That would have been so useful!
And for my record, the way to set a breakpoint on a field modifications is to
- find the field declaration,
- set a breakpoint on that line,
- go to the Breakpoints pane,
- right-click on that breakpoint,
- select Breakpoint Properties, and
- uncheck the Field Access box
Permalink
08.28.07
Posted in Hacking, programmer productivity, Technology trends at 10:36 am by ducky
As I work, I find myself asking a question over and over “how do I get information from point A to point B?”. For example, “How do I get the position of the active editor from EditorSashContainer into TabBehaviourAutoPin?”
It seems like there is probably room for a tool that figures out a path from point A to point B.
This might also make the code more maintainable — instead of me going and making new methods in the chain from A to B through six intermediate methods, maybe the tool can find a path that goes through only two classes.
I can imagine the tool giving me the shortest path through public methods, and (if it exists) the shortest path that requires private variables or methods. Maybe it would find that there is a private variable that, if I made a public getter, I could use.
UPDATE: my supervisor (“advisor” in the US) pointed me at a paper describing a system (Prospector) to do just that. Alas, I haven’t been able to find any code for it. The code is at here, and there is also a web interface (which knows about J2SE 1.4, Eclipse 3.0, and Eclipse GEF source). The plugin also unfortunately seems to be too downrev for me. 🙁
There is also a tool called Strathcona which does something sort of similar — it finds examples of existing code that goes from A to B. I don’t think that would have helped me with the specific things I was looking for, because I don’t think there was any existing code anywhere that did all of what I wanted to do. It might have helped me get from A to C and then from C to A, however.
Permalink
« Previous Page — « Previous entries « Previous Page · Next Page » Next entries » — Next Page »