I read yet another paper by Andrew Ko, this one titled Information Needs in Collocated Software Development Teams and co-authored by Gina Venolia and Robert DeLine.
They collected a bunch of data by shadowing developers of various sorts and got about 25 hrs of data out of it. Mostly they were interested in what kinds of information people need and use, but as a side effect they also logged how much time was spent on what type of activity.
I was curious about how much time people spent on what activity, but they didn’t publish the breakdowns. I can completely understand that — the classification was probably subjective, the sample might have been skewed, blah blah blah. It wouldn’t have had much academic validity.
Still, it is interesting from a non-academic standpoint. It’s another piece of information that helps me create my model of the world. Thus I eyeballed times from a chart in the paper, and this is my very imprecise view of how that group of non-randomly-selected developers spent their time:
- 19% – understanding execution behavior (reading code, using the debugger, asking co-workers)
- 18% – writing code
- 15% – reproducing a failure (reading bug reports, setting up test machines, running code)
- 13% – triaging bugs (thinking, talking with other developers)
- 11% – reasoning about design e.g. what is this code supposed to do? (thinking, asking others)
- 10% – non-work
- 8% – maintaining awareness (reading bug reports, reading submission reports, reading email)
- 6% – submitting a change (making sure submission was correct, diffing, running unit tests, using debuggers)
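To make those percentages a bit more concrete, here is a rough sketch that converts them into hours. It assumes the observation total was exactly 25 hours (the paper only gives "about 25 hrs") and uses the eyeballed percentages from the list above, so treat the output as approximate at best.

```python
# Eyeballed percentages from the list above (imprecise by construction).
shares = {
    "understanding execution behavior": 19,
    "writing code": 18,
    "reproducing a failure": 15,
    "triaging bugs": 13,
    "reasoning about design": 11,
    "non-work": 10,
    "maintaining awareness": 8,
    "submitting a change": 6,
}

# Assumption: treat "about 25 hrs" of observation as exactly 25.
OBSERVED_HOURS = 25

# Sanity check that the eyeballed shares cover the whole observation period.
total = sum(shares.values())

# Convert each percentage into approximate hours of observed time.
hours = {activity: OBSERVED_HOURS * pct / 100 for activity, pct in shares.items()}

for activity, h in hours.items():
    print(f"{h:5.2f} h  {activity}")
```

Under that assumption, triaging bugs comes to a bit over three hours of the observed time and submitting changes to about an hour and a half.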
Note that they didn’t define what “non-work” meant. Did writing docs count as work? Did helping marketing out count as work? Did helping a teammate count?
While the numbers seem reasonable if I think about it, if you had asked me how much time was spent on submissions and on triaging bugs, I would have given much smaller numbers. That was surprising to me.
These numbers also show that what one learns in school — how to write a piece of code from scratch — is only a very small portion of what one spends time on in The Real World. While I hear a lot of angst from educators about “communication skills”, I have never seen a class on how to write a good bug report, or how to write a good submission report. I wouldn’t have ever heard of classes on how to write good email messages if I didn’t happen to be a recognized authority on that.
I also haven’t seen much training on how to use a debugger, how to reproduce bugs, or how to triage bugs. Maybe there isn’t much you can teach people on reproducing bugs or triaging them, but there certainly is a lot you can teach people about how to use a debugger.
NB: I was surprised to see that there were nine phone calls in the 25 hours of observation, three of which were work-related. I didn’t know anybody still made phone calls in 2006! I probably got about two work-related phone calls per year in the past four years, and my cell phone log shows that I only get about ten phone calls total per month. The bulk of my communication is email, IM, and SMS.
I recently watched some people debugging, and it seemed like a much harder task than it should have been. It seems like inside an IDE, you ought to be able to click START LOGGING, fiddle some with the program of interest, click STOP LOGGING, and have it capture information about how many times each method was hit.
Then, to communicate that information to the programmer, change the presentation of the source code. If a routine was never executed, don’t show it in the source (or color it white). If none of a class’ methods or fields were executed, don’t show that class/file. If a method was called over and over again, make it black. If it was hit once, make it grey.
It doesn’t need to be color. You could make frequent classes/methods bigger and infrequent ones smaller. If the classes/methods that were never executed were just not displayed (not visible in the Package Explorer in Eclipse, for example), that would be a big help.
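The idea can be sketched outside an IDE. The following is a minimal illustration in Python, not anything Eclipse provides: `HitLogger`, `shade`, and the toy functions are hypothetical names made up for this example. It uses the standard library's `sys.setprofile` hook to count function calls between a start and a stop, then maps each count to the white/grey/black scheme described above.

```python
import sys
from collections import Counter


class HitLogger:
    """Counts calls to Python functions between start() and stop()."""

    def __init__(self):
        self.hits = Counter()

    def _profile(self, frame, event, arg):
        # "call" fires once per Python-level function call.
        # Skip the call to stop() itself so it does not pollute the counts.
        if event == "call" and frame.f_code is not HitLogger.stop.__code__:
            # Keyed by bare function name for brevity; a real tool would
            # also need the module or class to avoid name collisions.
            self.hits[frame.f_code.co_name] += 1

    def start(self):
        sys.setprofile(self._profile)

    def stop(self):
        sys.setprofile(None)


def shade(count):
    """Never hit -> white, hit once -> grey, hit repeatedly -> black."""
    if count == 0:
        return "white"
    if count == 1:
        return "grey"
    return "black"


# Toy "program of interest" (hypothetical functions).
def setup():
    pass

def parse():
    pass

def render():
    parse()

def unused():
    pass


logger = HitLogger()
logger.start()          # click START LOGGING
setup()
for _ in range(3):      # fiddle with the program
    render()
logger.stop()           # click STOP LOGGING

report = {name: shade(logger.hits[name])
          for name in ("setup", "parse", "render", "unused")}
print(report)
```

An IDE would apply the shades to the source view rather than printing them, and could simply hide anything that comes back white.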
Maps are quirky things. The Web is amazing.
I was looking at Canadian province boundaries for a hobby project of mine, and found a strange divot in the border between Nunavut and Northwest Territories. Google Maps shows the border running happily due east along the 70th parallel, then suddenly dropping down about seven miles, going over about five miles, back up to the 70th parallel, and continuing east as if nothing had happened.
I was stumped as to what it might be. I looked at the area at the highest zoom, and I could see absolutely nothing interesting there: no roads, no lakes, no nothing. I speculated that maybe there was a Member of Parliament who had a summer cottage there, because nothing else made sense.
I asked the geography types here, and one asked a friend, who posted it on MetaFilter, and I got an answer! The short answer is that there was a prior land claim that the government just didn’t want to open up again.
It turns out that the reason that I couldn’t see anything interesting on the sat images was because Google had the divot in the wrong place. It’s supposed to be over to the right a bit. I drew a line on the map over where the Nunavut Act says it is supposed to be. There is in fact a lake there.
Now think about this. Fifteen years ago, I would have never had access to a map that showed that level of detail. (Or maybe I could get access, but it would take enough effort that it wouldn’t be something I did casually.) Fifteen years ago, if I wondered about the divot, I wouldn’t have even known how to start finding out what that divot was. And, if I found an error in the map, I wouldn’t have known who to contact about fixing the map in the next printing. Now, thanks to the Web, I can do all those things.
It’s been thirteen years since I discovered the Web, and I still think it is pretty amazing.
I spent all day yesterday working on configuration, and I’m probably going to do the same again today.
I am learning PHP with a hobby project, and my research group is heavily involved in Eclipse. (Mylar came out of my group, for example.) I thus wanted to try out the PHP plug-in for Eclipse. Simple, right?
That meant that I needed PHP4 and MySQL4. (I wanted PHP4 because that’s what my ISP has and what I’d started using. At one point, I upgraded on webfoot.com to PHP5, and everything broke. I didn’t want to try to troubleshoot that while I’m still learning PHP, so I just reverted to PHP4 on webfoot.com, and wanted to match that locally. I wanted MySQL4 because that’s what happened to be installed on webfoot.com for WordPress.)
Addendum: I have since learned of xampp, which has Apache/MySQL/PHP5 already configured together, and MAMP, which is similar (but only for MacOS). I can see why that would be useful. I’m tempted to blow away my entire installation and just download xampp. *Sigh*
Downloading PHP4 was simple, since I use Ubuntu. I just used Synaptic, a nice GUI front-end to apt-get, to grab both php4 and php4-cli. The documentation seemed to say that there were only two lines that I really needed to add:
AddType application/x-httpd-php .php .php3 .php4 .phtml
AddType application/x-httpd-php-source .phps
plus I needed to verify that index.php was already in my DirectoryIndex. No problem. I edited /etc/apache2 and off I went. Uh, no go. I searched and searched and searched… and finally noticed that I was editing apache2, but that I also had apache1.3 installed, and apache1.3 was running, not apache2. Editing the config file for apache2 did me no good at getting PHP4 working with apache1.3.
I thought about it for a bit, and decided that it really was time to fully upgrade to apache2. I had stalled on it because twiki used to have some issues with apache2, but those seem resolved. So now I had to configure apache2.
Fortunately, I’ve been configuring httpd and its children since 1994, so that went pretty quickly.
I then turned my focus to MySQL. I used Synaptic to get the mysql-client-4.1 and mysql-server-4.1 packages. I was a good little girl and tried to follow along in the official docs. Unfortunately, I didn’t notice that the first doc I started reading was the MySQL5 manual, not the MySQL4 manual. (That didn’t take too long to figure out.) Next, I got very confused and worried because the manual talked about what the file layout looked like, and mine looked very different. I finally decided that I shouldn’t worry too much, because the Ubuntu way of doing things is sometimes different. However, when it started saying that I could check the installation by running the scripts in run-bench, then I got really perplexed.
My slocate db didn’t have all the files that I just downloaded, so I decided to take a diversion and figure out how to update it. It took a little while to figure it out and do the update, but it was straightforward.
The magic incantation is:
Back to MySQL
Unfortunately, after updating the db, run-bench was still nowhere to be found. Somewhat troubled by this, I decided to continue.
I was pretty much dead in the water, however. When I had run mysql_install_db, it had printed out some instructions about setting a password, which I, like a good little girl, dutifully followed. However, the official docs assumed that I had NOT set a password, and so I was unable to follow. They also didn’t tell me how to specify a password (since I didn’t need one since they hadn’t told me to set one), and so what I had been trying was wrong. (Hint: use the -p option and it will prompt you for a password.)
I finally blew away the data directory, then ran mysql_install_db again, and everything worked much better. I could now run mysql and manipulate the databases directly. Yay! However, PHP was still not able to connect to MySQL. Boo!
I did some sleuthing and found some very rude but helpful forum postings that led me to believe that I needed to put mysql.so in the PHP extensions directory. (To find the PHP extensions directory, run phpinfo() in a PHP script, then examine the output for the extensions directory.) Fortunately, slocate found two versions of mysql.so. Unfortunately, they were in a Perl subdirectory and a Python subdirectory — probably not what I wanted. On a whim, I tried making a symbolic link from one of those to the PHP extensions directory. I was just as successful as I expected I’d be — namely not at all.
I took a break, searched the Web, took another break, had some chocolate, and then thought to look through the packages on Synaptic again. There was one called php4-mysql. Bingo! I downloaded it, and MySQL and PHP were now talking to each other.
Off to download the PHP Eclipse plugin. Fortunately, all I had to do was go to Help->Software Updates->Find and Install->new->New Remote Site and enter the update URL, http://phpeclipse.sourceforge.net/update/releases. Unfortunately, it wanted Eclipse 3.1. I was hoping that it meant 3.1 or higher, but no, I found out that they really meant Eclipse 3.1, so I had to download 3.1 and try again.
The 3.1 version of Eclipse is slightly hidden because it is so old, but I was able to find it pretty quickly and download it. I have been downloading and configuring Eclipse a lot lately because of my research lab’s involvement with it, so it was painless for me.
There is a nice step-by-step guide to installing PHPEclipse that informed me that I would need DBG, the PHP Debugger.
I downloaded the tarball for the dbg modules, then copied the right version (4.4.2) over to the PHP extensions directory.
I did a bunch of configuring in PHPEclipse, but got stuck on a configuration page that insisted on knowing my PHP interpreter. I didn’t seem to have one.
Fortunately, I do sometimes learn from my tribulations, and thought to look at Synaptic again. Lo and behold, there was a package php4-cli. I downloaded that, and PHPEclipse was running PHP scripts! Hallelujah!
Unfortunately, I couldn’t figure out how to pass the QUERY_STRING to the scripts I was running under Eclipse. I couldn’t figure out how to get them into $_GET, $_POST, or $_REQUEST, try as I might. Eventually, I did a hacky workaround: I wrote a function that first tried $_REQUEST, and if that didn’t work, pulled the variables out of $_ENV. To get them into $_ENV, I set them (one variable and value per line) in Run->Run…->(configuration)->Environment. Note that I had to have the radio button set to “Append environment to native environment”.
So now PHP, MySQL, Apache, PHPEclipse, and DBG are all working together. They might not be in perfect harmony, but they are all at least singing the same song. Yay!
Addendum: not done, back to PHP4
Hah. I thought I was done, but it turns out that the PHP4 that I downloaded via Synaptic didn’t have curl compiled in, which I needed for one of my scripts. Downloading php4-curl made everything happy.
Why is this so hard?
There is a lot of documentation out on the Web on how to configure stuff, but a lot of it is for Windows and the Linux docs tend to assume you are downloading tarballs or RPMs. For example, the alert reader might still be wondering where run-bench got to. The docs finally told me (about three pages later, arg!) that you’d only see run-bench if you downloaded the source, that run-bench wasn’t in the binaries. Arg!
Part of the reason that this was hard is that PHP, MySQL, and Apache are each very complex packages that can be used in several different modes. All three of them can be run entirely independently. PHP can be run with or without curl as well, and probably sixteen other modes that I don’t know about.
Apache, MySQL, and Eclipse are complex enough that they have a fair amount of configuration that they need even if they are being run stand-alone.
More fundamentally, documentation isn’t sexy. 🙁 People don’t write a whole lot of documentation; a lot of my troubleshooting was aided not by formal docs but by blog postings and forum discussions. (That’s why I’m posting this — so the next poor schmoe doesn’t have to spend as long as I did.)
Finally, it’s hard enough to get one project documented well, and I was trying to make FIVE different open source software packages play together. It was actually surprising that I didn’t get caught by holes around the edges more often.