01.02.11

Prediction: cellphone cameras

Posted in Technology trends at 11:54 am by ducky

It is common to do retrospectives at the end of the calendar year, but I’m more interested in looking forward.  Here’s a prediction: ten years from now, it will be common, ordinary, and routine for people to use their cellphones’ cameras to help them see.  I expect that people will use them as magnifying glasses (though probably only up to about 10x or 20x zoom), telescopes, and night vision enhancers.

01.05.10

Predictions for 2020

Posted in Technology trends at 1:46 am by ducky

Oh what the heck, since everybody else is doing it, here are my predictions for 2020:

  1. Essentially all cell phones will have built-in video cameras, GPS, and voice controls.
  2. At least one country will nationalize music in some way, e.g. paying the music companies a per capita fee for their citizens every year.  Some countries will strike down copyright laws for music.  Most just won’t bother enforcing copyright laws for music.
  3. Improved search + improved geo-location of social media streams will mean that it will be far easier to get information about your micro-neighbourhood.  Think Google Trends or Google Flu or Twitter Trends, but for the five mile radius of where you are right now.  (And, because of #1, you can get video.)
  4. Know where newspapers are right now, on the brink of death?  That’s where TV will be in 2020 — squeezed between on-demand entertainment and crowd-generated news.
  5. The five-year cancer survival rate will be 90% for most cancers, and 40% for the most difficult ones (bone, brain, pancreas, and liver).  Treatment will, unfortunately, still majorly suck for most patients.
  6. Mapping will extend to reconstruction of scenes based on user photos (like what Microsoft demonstrated at TED) in a big way.  By 2020, 100% of San Francisco’s publicly accessible spaces (yes, including alleys) will be mapped, and about 35% of interior spaces.  People at first will be quite upset that the world can “see into” their living room, but they will end up getting used to it.
  7. Marriage for same-sex couples will be recognized by the U.S. government.
  8. Know where newspapers were five years ago, sort of moseying down the path of death?  That’s where universities will be in 2020.  They will face pressure as superb educational content will become a commodity.  Third-party organizations will jump into the mix to provide tutoring and certification, leaving non-research universities with little to offer aside from post-teen socialization and sports.
  9. 30% of the world’s electricity production will be solar in 2020.  (It’s going to be one hell of a race between climate change and solar energy production, but I think solar energy will win.  All the climate-change deniers will say, “See!  Toldja so!”)
  10. Data format description languages will overthrow XML: data will get passed around in compact formats instead of in XML.  (Yes, the DFDL might be in XML, but the data wouldn’t be.)

Okay, I admit it, #10 might just be wishful thinking.

Update: At the time I wrote this, I had not read up on the Google Nexus One phone, which I now find out has voice commands for just about everything.  I guess prediction #1 about voice was under-optimistic!

10.02.09

Terribly sorry…

Posted in Technology trends at 11:23 am by ducky

I’m really sorry, but I moved from http://webfoot.com/blog to blog.webfoot.com, and the users are (hopefully only temporarily) lost.  I’ll work on it, but it might be a little while.

Okay, I think users are back up.  Let me know.

05.20.09

Gender and programming

Posted in Hacking, programmer productivity, robobait, Technology trends at 11:44 am by ducky

I had a very brief but very interesting talk with Prof. Margaret Burnett.  She does research on gender and programming at Oregon State University, but was in town for the International Conference on Software Engineering.  She said that many studies have shown that women are — in general — more risk averse than men are.  (I’ve also commented on this.)  She said that her research found that risk-averse people (most women and some men) are less likely to tinker, to explore, to try out novel features in both tools and languages when programming.

I extrapolate that this means that risk-seeking people (most men and some women) are more likely to have better command of tools, and this ties into something that I’ve been voicing frustration with for some time — there is no instruction on how to use tools in the CS curriculum — but I had never seen it as a gender-bias issue before.  I can see how a male universe would think there was no need to explain how to use tools because they figured that the guys would just figure it out on their own.  And most guys might — but most of the women and some of the men might not figure out how to use tools on their own.

In particular, there is no instruction on how to use the debugger: not on what features are available, not on when you should use a debugger vs. not, and none on good debugging strategy.  (I’ve commented on that here.)  Some of using the debugger is art, true, but there are teachable strategies — practically algorithms — for how to use the debugger to achieve specific ends.  (For example, I wrote up how to use the debugger to localize the causes of hangs.)

Full of excitement from Prof. Burnett’s revelations, I went to dinner with a bunch of people connected to the research lab I did my MS research in.  All men, of course.  I related how Prof. Burnett said that women didn’t tinker, and how this obviously implied to me that CS departments should give some instruction on how to use tools.  The guys had a different response: “The departments should teach the women how to tinker.”

That was an unsatisfying response to me, but it took me a while to figure out why.  It suggests that the risk-averse pool doesn’t know how to tinker, while in my risk-averse model, it is not appropriate to tinker: one shouldn’t goof off fiddling with stuff that has a risk of not being useful when there is work to do!

(As a concrete example, it has been emotionally very difficult for me to write this blog post today.  I think it is important and worthwhile, but I have a little risk-averse agent in my head screaming, screaming at me that I shouldn’t be wasting my time on this: I should be applying for jobs, looking for an immigration lawyer, doing laundry, or working on improving the performance of my maps code.  In other words, writing this post is risky behaviour: it takes time for no immediate payoff, and only a low chance of a future payoff.  It might also be controversial enough that it upsets people.  Doing laundry, however, is a low-risk behaviour: I am guaranteed that it will make my life fractionally better.)

To change the risk-averse population’s behaviour, you would have to change their entire model of risk-reward.  I’m not sure that’s possible, but I also think that you shouldn’t want to change the attitude.  You want some people to be risk-seeking, as they are the ones who will get you the big wins.  However, they will also get you the big losses.  The risk-averse people are the ones who provide stability.

Also note that because there is such asymmetry in task completion time between above-median and below-median programmers, you might expect that a group of median programmers would, in the aggregate, be more productive than a group drawn from both extremes.  (There are limits to how much faster you can get at completing a task, but there are no limits to how much slower you can get.)  It might be that risk aversion is a good thing!

There was a study I heard of second-hand (I wish I had a citation — anybody know?) that found that startups with a lot of women (I’m remembering 40%) had much MUCH higher survival rates than ones with lower proportions of women.  This makes perfect sense to me; a risk-averse population would rein in the potentially destructive tendencies of a risk-seeking population.

Thus I think it does make sense to provide academic training in how to use tools.  This should perhaps be coupled with some propaganda about how it is important to set aside some time in the future to get comfortable with tools.  (Perhaps it should be presented as risky to not spend time tinkering with tools!)

UPDATE: There’s an interesting (though all-too-brief!) article that mentions differences in the biochemical responses that men and women produce under risk.  It says that men produce adrenaline, which is fun.  Women produce acetylcholine, which the article says pretty much makes them want to vomit.  That could certainly change one’s reaction to risk…

05.04.09

Locally omniscient debugging

Posted in programmer productivity, Technology trends at 4:22 pm by ducky

Update: it turns out that lots of people have done exactly what I asked for: see Instruction-level Tracing: Framework & Applications and the OCaml debugger.  Cooool! (Thanks DanE!)

In my user studies, programmers used the debugger far less than I had expected.  Part of that could perhaps be due to poor training: it is rare to get good instruction in how to use a debugger.

However, I think the answer is simpler than that: it is just plain boring and tedious to use a debugger.  One guy did solve a thorny problem by stepping through the code in the debugger, but he had to press “step over” or “step into” ninety times.

And when you are stepping, you must pay attention.  You can’t let your mind wander, or you will miss the event you are watching for.  I can’t be the only person who has done step, step, step, step, step, step, step, boom, “oh crap, where was I in the previous step?”

Omniscient debuggers are one way to make it less tedious.  Run the code until it goes boom, then back up.  Unfortunately, omniscient debuggers capture so much information that it becomes technically difficult to store/manage it all.

I suggest a compromise: store the last N contexts — enough to examine the state of variables back N levels, and to replay if desired.

I can imagine two different ways of doing this.  In the first, the user still has to press step step step; the debugger saves only the state changes between the lines that the user lands on.  In other words, if you step over the foo() method, the debugger only notes any state differences between entering and exiting the foo() method, not any state that is local to foo().  If the user steps into foo(), then it logs state changes inside foo().

In the other method, the user executes the program, and the debugger logs ALL the state changes (including in foo(), including calls to HashTable.add(), etc.).  This is probably easier on the user, but probably slower to execute and requires more storage.
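
Here’s a rough sketch in Python of the bounded-history idea (a toy model, not a real debugger back-end; the class name, the “state is a dict of variable values” simplification, and the numbers are all just illustrative):

    from collections import deque

    class BoundedHistory:
        """Keep only the last N state snapshots, discarding older ones."""

        def __init__(self, max_steps):
            # a deque with maxlen silently drops the oldest entry when full
            self.history = deque(maxlen=max_steps)

        def record_step(self, line_number, state):
            # copy the state so later mutation doesn't rewrite history
            self.history.append((line_number, dict(state)))

        def rewind(self, steps_back):
            # 0 = the most recent step, 1 = one step before that, etc.
            return self.history[-1 - steps_back]

    # After the program "goes boom", look back a few steps:
    h = BoundedHistory(max_steps=5)
    for line, x in [(10, 1), (11, 2), (12, 3), (13, 0)]:
        h.record_step(line, {"x": x})
    print(h.rewind(2))   # (11, {'x': 2}) -- two steps before the crash site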

You could also checkpoint the state every M steps.  Then, if you get to the boom-spot and want to know where the variable someVariable was set, but it didn’t change in the past N steps, you can (a rough sketch follows the list):

  • look at all your old checkpoints
  • see which two checkpoints someVariable changed between
  • rewind to the earlier of the two checkpoints
  • set a watchpoint on someVariable
  • run until the watchpoint fires.
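
Here’s a rough Python sketch of that checkpoint-and-rewind procedure (again a toy model: the trace is just a list of (line, state) pairs standing in for real execution, and M, someVariable, and the helper names are all illustrative):

    M = 100  # checkpoint interval, in steps

    def run_with_checkpoints(trace):
        """Record a (step, line, state) checkpoint every M steps."""
        checkpoints = []
        for step, (line, state) in enumerate(trace):
            if step % M == 0:
                checkpoints.append((step, line, dict(state)))
        return checkpoints

    def find_changing_interval(checkpoints, variable):
        """Find the two checkpoints that 'variable' changed between."""
        for earlier, later in zip(checkpoints, checkpoints[1:]):
            if earlier[2].get(variable) != later[2].get(variable):
                return earlier, later
        return None

    # Toy trace: someVariable flips from 0 to 7 at step 250.
    trace = [(i, {"someVariable": 0 if i < 250 else 7}) for i in range(400)]
    cps = run_with_checkpoints(trace)
    print(find_changing_interval(cps, "someVariable"))
    # -> the checkpoints at steps 200 and 300 bracket the change; a real
    #    debugger would rewind to step 200, set the watchpoint, and run.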

04.01.09

Volatility of tech market

Posted in Technology trends at 3:13 pm by ducky

A while back, I hypothesized that women don’t go into computer science because it is a high-risk field.  Today I want to share some anecdotal evidence from my own experience about how risky high-tech is.  Since I turned 18, I have worked at many places, as a contractor, a summer intern, and a regular full-time employee.  Of the seven companies where I held a regular job, five ran out of money, one got bought, and one no longer makes the product I worked on.  None of the companies I worked for were profitable when I left.

Regular jobs:

Contracts:

  • NexGen (2 years).  Bought by AMD.  (This counts as an eventual success, but it took them eight years, three product cycles, and I-don’t-know-how-many-rounds-of-funding before they released their first product.)
  • Cray Research (3 weeks).  Bought by SGI, then sold to Tera Computer, which took over the Cray name.
  • Data General (2 weeks).  Bought by EMC.
  • SGI (6 months).  Filed for bankruptcy today.
  • Apple (6 months).  While the company still exists, they no longer make printers, which is the product  I worked on.
  • Sun Microsystems (6 months).  In talks with IBM for IBM to buy Sun.  Update: IBM didn’t buy Sun, but Oracle did.
  • Triquest (1 month).  Gone.
  • Chicago Tribune (1 week).  Still in business, but I wouldn’t bet on it being in business next year.

Summer jobs:

  • EIT.  Bought by VeriFone, which was then bought by HP.  (This counts as a success!)
  • Google.  Still in business!

I realize that I’m probably on one long tail of the distribution, while my husband is on the other end.  (Jim worked for Adobe for 16 years, and Adobe is still in business!)  However, it shows that working in high-tech does have risks.

11.02.08

rebuttal to "Diseconomies of Scale"

Posted in review, Technology trends at 8:44 pm by ducky

Ken Church and James Hamilton have a blog posting, Diseconomies of Scale, which doesn’t seem right to me.

They suggest that it is more expensive to run a data center than to buy a bunch of condos, put 48 servers in each, wall up the servers, rent out the condos, and use the servers as a distributed data farm.

Capital costs

Church and Hamilton figure that the capital cost of a set of condos is about half of the cost of a data center.  However, they use a sales price of $100K for the condos, but according to a news release issued by the US National Association of Realtors, the median metro-area condo price in the US was $226,800 in 2007 Q2.

Perhaps you could get cheaper condos by going outside the major metro areas, but you might not be able to get as good connectivity outside the major metro areas.  I don’t have a figure for the median condo price in cities of population between 50,000 and 100,000, but that would be an interesting number.

A rack of 48 servers draws 12kW (at 250W/server, their number).  For a house wired for 110V, this means that the current draw would be 109A.  That is approximately five times what a normal household outlet can carry, so you would need to do significant rewiring.  I don’t have an estimate on that cost.
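
The arithmetic, using Church and Hamilton’s own figures (the variable names are just mine):

    # current draw for one condo full of servers, per Church and Hamilton's numbers
    servers = 48
    watts_per_server = 250
    volts = 110                                 # typical North American household circuit

    total_watts = servers * watts_per_server    # 12,000 W, i.e. 12 kW
    amps = total_watts / volts                  # ~109 A
    print(f"{total_watts / 1000:.0f} kW -> {amps:.0f} A at {volts} V")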

Furthermore, my construction-industry source says that most houses can’t handle that much extra load.  He says that older houses generally have 75A fuseboxes; newer houses have 100A service.  He says that it is becoming more common for houses to get 200A fuseboxes, but only for the larger houses.

While it is possible to upgrade, “the cost of [upgrading from a 100A fusebox to a 200A fusebox] could be as high as $2000.”  I suspect that an upgrade of this size would also need approval from the condo board (and in Canada, where residential marijuana farms are somewhat common, suspicion from the local police department).

If, as Church and Hamilton suggest, you “wall in” the servers, then you will need to do some planning and construction to build the enclosure for the servers.  I can’t imagine that would cost less than $1000, and probably more like $2000.

They also don’t figure in the cost to pay someone (salary+plane+rental car+hotel) to evaluate the properties, bid on them, and buy them.  I wouldn’t be surprised if this would add $10K to each property.  While there are also costs associated with acquiring property for one large data center, those costs should be lower than for acquiring a thousand condos.

Bottom line: I believe the capital costs would be higher than they expect.

Operating costs

For their comparison, Church and Hamilton put 54,000 servers either in one data center or in 1125 condos.  They calculate that the power cost for the data center option is $3.5M/year.  By contrast, they figure that the condo power consumption is $10.6M/year, but is offset by $800/month in condo rental income at 80% occupancy, for $8.1M in income.  I calculate that 800*1125*12*.8 is $8.6M, and will use that number, giving a net operating cost of $2M/year, or $1.5M/year less than a data center.

Given a model with 1125 condos, this works out to $111 per condo per month.  Put another way, if their estimates are off by more than $111 per condo per month, a data center is cheaper.
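
For anyone who wants to check the arithmetic, here it is spelled out (rounded the same way as in the paragraph above):

    # per-condo break-even, using the figures above
    condos = 1125
    rental_income = 800 * condos * 12 * 0.8   # $8,640,000/year at 80% occupancy
    condo_net_cost = 10.6e6 - rental_income   # ~$2.0M/year after rental income
    savings = 3.5e6 - 2.0e6                   # ~$1.5M/year in the condos' favour (rounded)
    print(savings / (condos * 12))            # ~111.1 -> the $111/condo/month figure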

Implicit in their model is that nobody will ever have to lay hands on a server rack.  If you have to send someone out to someone’s residence to reboot the rack, that will cost you.  That will depend upon how far the condo is from the technician’s normal location, how easy it is to find the condo, how agreeable the tenant is to having someone enter their condo, etc.  I believe this would be a number significantly larger than zero.  You might be able to put something in the lease about how the tenant agrees to reboot the servers on request, but this inconvenience will come at some price.  Either it will come in the form of reduced monthly rent or a per-incident payment to the tenant.

I believe that the $800/month rental income ($1000/month rent less $200/month condo dues) is overly optimistic.

The median rent for a metro-area apartment was $1324 in 2007, but that’s for all housing types, not just condos.  The median rent nationally in 2007 was $665.  If you want the metro income, you need to pay the metro purchase price of $227K; if you want to save money by going outside the metro areas, you won’t get the high rent.

Eyeballing a few cities on Craigslist, it looked to me like the metro rent for two-bedroom condos was about $800/month.

Church and Hamilton also didn’t account for

  • management fees (~5-7% of gross rent per month, or $50-$70 on income of $1000)
  • property taxes (~2%/year of purchase price, so $160/month if they find condos for $100K)
  • maintenance that isn’t covered by the condo association like paint, new carpeting, and the hole the tenant punched in the wall ($40/month)
  • reduction in rent for part of the condo being walled off and/or inconvenience of rebooting the servers (~$50/month)
  • Liability insurance. If a short in the servers burns the condo complex down, that’s bad.
  • Internet connectivity

While they didn’t account for Internet connectivity in either the datacenter scenario or the condos scenario, this is an area where there seem to be large economies of scale.  For example, a T3 (44.736 Mbit/s) seems to cost between $7,500 and $14K/month, or between $167 and $312/Mbit/s/month.  A T1 (1.536 Mbit/s) seems to cost between $550 and $1200/month, or between $358 and $781/Mbit/s/month.  The T1 is thus about twice as expensive per megabit as the T3.  I don’t know how much connectivity 54,000 servers would need, but I expect it would be significant, and expect that it would be significantly more expensive in 1125 smaller lots.
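
The per-megabit arithmetic behind those ranges (prices as quoted above; nothing here beyond division):

    # dollars per Mbit/s per month for the quoted T3 and T1 price ranges
    t3_mbps, t3_low, t3_high = 44.736, 7_500, 14_000
    t1_mbps, t1_low, t1_high = 1.536, 550, 1_200

    print(t3_low / t3_mbps, t3_high / t3_mbps)   # ~167 to ~313
    print(t1_low / t1_mbps, t1_high / t1_mbps)   # ~358 to ~781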

Non-quantifiable

I can imagine there being some significant additional costs in time and/or money.

  • Obstreperous condo boards.  If the condo board passes a rule against having server farms in your unit, you’re screwed.
  • Obstreperous zoning boards.  If the city decides that the server farm is part of a business, they might get unhappy about it being in a building zoned residential.
  • Criminal tenants.  What’s to stop your tenants from busting into the server closet and stealing all the servers?

Church and Hamilton close their article by saying, “whenever we see a crazy idea even within a factor of two of what we are doing today, something is wrong”.  I think they are correct, and that their analysis is overly simplistic and optimistic.

07.29.08

geek cool alert: Triage

Posted in Hacking, programmer productivity, Technology trends, Uncategorized at 11:01 am by ducky

There’s a cool paper on a tool to do semi-automatic debugging: Triage: diagnosing production run failures at the user’s site. While Triage was designed to diagnose bugs at a customer site (where the software developers don’t have access to either the configuration or the data), I think a similar tool would be very valuable even for debugging in-house.

They use a number of different techniques to debug C++ code.

  • Checkpoint the code at a number of steps.
  • Attempt to reproduce the bug.  This tells whether it is deterministic or not.
  • Analyze the memory by walking the heap and stack to find possible corruptions.
  • Roll back to previous checkpoints and rerun, looking for buffer overflows, dangling pointers, double frees, data races, semantic bugs, etc.
  • Fuzz the inputs: intentionally vary the inputs, thread scheduling, memory layouts, signal delivery, and even control flows and memory states to narrow the conditions that trigger the failure for easy reproduction.
  • Compare the code paths from failing replays and non-failing replays to determine what code was involved in that failure.
  • Generate a report.  This gives information on the failure and a suggestion of which lines to look at to fix it.

They did a user study and found that programmers took 45% less time to debug “real” bugs when they used Triage than when they didn’t, and 18% less for “toy” bugs.  (“…although Triage still helped, the effect was not as large since the toy bugs are very simple and straightforward to diagnose even without Triage.”)

It looks like the subjects were given the Triage bug reports before they started work, so the time it takes to run Triage wasn’t factored into the debugging time.  The time it took Triage to run was significant (up to 64 min for one of the bugs), but presumably the Triage run would be done in the background.  I could set up Triage to run while I went to lunch, for example.

This looks cool.

07.21.08

tabbing behaviour

Posted in programmer productivity, Technology trends at 5:09 pm by ducky

I did a very quick, informal survey on how people use tabs when looking at Web search results. Some people immediately open all the search results that look interesting in new tabs, then explore them one by one (“open-parallel”). Others open one result in a new tab, explore it, go back to the search page, then open the second result in another tab, etc. (“open-sequential”). Note that the “open-sequential” people can have lots of tabs open at a time, they just open them one by one.

To clarify, open-parallel means control-clicking on the URL for result #1, then on result #2, then on #3, then on #4, and only THEN switching to the tab for #1, examining it, switching to the tab for #2, etc.  Open-sequential means control-clicking on the URL for result #1, switching to the tab for #1, examining #1, switching to the search results page, control-clicking on #2, switching to the tab for #2, examining #2, switching to the search results page, etc.

I was surprised to find that the people who had been in the US in the early 2000’s were far more likely to use the open-parallel strategy. There was an even stronger correlation with geekdom: all of the geeks used the open-parallel strategy, and only two of the non-geeks did.

Open-parallel:

  Citizenship  Where in early 00’s?     Geek?
  US           Working/studying in US   Y
  US           Working in US            Y
  US/Canada    Studying in US           Y
  Canada       Studying/working in US   Y
  Australia    Working in Australia(?)  Y
  US/Canada    Working in US            Y
  Canada       Working in Canada        sort-of

Open-sequential:

  Citizenship  Where in early 00’s?        Geek?
  Canada       Working/studying in Canada  N
  US           Studying in US              N
  Canada       Studying in Canada          N
  Canada       Studying in US              N
  Netherlands  Working in Europe(?)        N
  Canada       University in Canada        N
  Canada       University in Canada        N
  US           University in US            N
  India        University in US            N

Notes on the survey:

  1. The subject pool is not representative of the general population: everyone who answered lives or lived at my former dorm at UBC, has a bachelor’s degree, and all but one have an advanced degree or are working on one.
  2. I classified people as geeks if they had had Linux on at least one of their computers and/or had worked in the IT industry. The person with the “sort-of” in the geek column doesn’t qualify on any of those counts, but was a minor Internet celebrity in the mid 90s.

What does this mean?

What does this mean? I’m not sure, but I have a few ideas:

  • I suspect that the geeks are more likely to have used a browser with modern tabbing behaviour much earlier, so have had more years to adapt their strategies. (Internet Explorer got modern tabbing behaviour in 2006; Mozilla/Firefox got it in 2001.)
  • One of the benefits of the open-parallel strategy is that pages can load in the background. Maybe in 2001, Web access was slow enough that this was important and relevant. Maybe it’s not that the geeks have been using tabs longer, but that they started using tabs when the Internet was slow.
  • It might be that the geeks do more Web surfing than the non-geeks, so have spent more time refining their Internet behaviour.

04.16.08

help? tab spam in different IDEs?

Posted in Eclipse, Hacking, Technology trends at 5:42 pm by ducky

There are a zillion graphical IDEs out there, and I really don’t want to download and try each one. I don’t even want to try twenty of them. So, dear readers, care to help me?

All the IDEs that I’ve seen have a main center panel with the source of the file you’re looking at. Above that, they seem to all have a row of tabs, one per file that you’ve opened. (Does anybody have anything different?)

Here is a link to a screenshot of Eclipse. (Sorry, it was too wide for my blog.) Eclipse puts a little “>>” to show that there are more files (as pointed to by the ugly black arrow), with the number of hidden tabs next to it (“4” in this case). Clicking on the “>>4” gives a drop-down menu (shown here in yellow).

What happens in other IDEs when you have more tabs than will fit across the horizontal width of the source file? How does your IDE handle it? Or does your IDE have a different tabbing model altogether, e.g. it opens one window per file?  I would greatly appreciate hearing from you.

You can either email me ducky at webfoot dot com or post in the comments; I’ll wait a few days and then summarize in another blog posting.  Remember to tell me the name of your IDE.
