03.20.11
Posted in Email, Technology trends at 12:09 am by admin
Note: I first wrote this in 2002 (revised 2004, 2006, and 2007) on the Web site for my books, but have since taken down that site. I was thinking about it today, so decided to repost it:
The Perfect Email Program
People occasionally ask me what I’d like to see in the perfect email program. Some email programs have some of the elements of a perfect email program, but none has all of them. Here’s my wish list:
- Virus resistance. While virus resistance is a broad and general topic, I would like, at a minimum, a filter condition that can examine the names of attachments,
e.g.
if .exe is in attachment name
- Easy way to see all “to-do” messages nicely grouped and prioritized.
The Conventional Wisdom is that you group messages by moving related messages into a folder. For example, move all messages from your manager into your “Boss” folder. Unfortunately, many (if not most) people have a hard time keeping track of their “to-do” messages (to-read, to-reply-to, to-act-on) when they are spread across multiple folders.
It’s better if you can sort them in place, in the inbox. Ideally, you’d like the inbox to show e.g. all the messages from your spouse at the top of the inbox, followed by all messages from your boss, followed by all messages from your coworkers, etc.
I don’t care what the mechanism is for grouping, as long as the “to-do” messages are visible in one place. For example, if I can set up a view that shows all the “to-do” messages in all folders at once (sometimes called Virtual Folders), grouped by what folder they’re in, fine. I do want to be able to expand/collapse the folders, however, so that I only see what is relevant to my tasks RIGHT NOW.
Another way the Perfect Email Program could do it is to let me use filters to change a field in the message that I can sort my inbox by. So for example, if the filters can change the “category” of a message, and then I can sort the inbox by “category”, I’m happy.
NB: The filters in Eudora and Thunderbird can change the Label of a message; Outlook’s rules can
change the Category of a message. However, it’s a bit awkward to deal with them.
- Eudora has a very limited number of labels, 15 under Mac OS and 7 under Windows. Eudora doesn’t allow grouping (i.e. being able to collapse messages in a group), but it does allow sorting first by label, second by date.
- To sort an Outlook mailbox by Categories, you have to set up a View that Groups by Categories. Furthermore, if you reply to a message that you’ve assigned a Category to, when you reply, the receiver will see your Category…. and there is no way to strip Categories from incoming or outgoing messages (unless you set up a macro).
- Thunderbird 1.5 has a very limited number of labels, although Thunderbird 2.0 is supposed to allow an
arbitrary number of labels. Thunderbird 1.5 has grouping in various ways, though it doesn’t seem possible to group by address book. It does allow sorting first by label, second by date.
- Grouping by social network. I could have put this in the group-and-prioritize-in-place item above, because grouping-by-social-network works well with the above,
but you don’t have to have grouping-by-social-network for group-and-prioritize-in-place to be useful. I want my email client to be able to group messages by which social network the sender is in. I want to see messages from my co-workers in one bucket, messages from my family in another, etc, as noted above.
While yes, there are some cases where someone will be in two social networks (like if you work with your spouse), those are rare and can be handled by showing messages from people in two social networks twice, once for each social network.
It has been my experience that it is very difficult even for humans to figure out how to categorize email messages by anything else but sender; I don’t think a computer will ever be good enough at it.
However, there is one and only one sender for a message, and social groups are reasonably stable (in the sense that Rosario generally doesn’t leave your church group on Monday, join your skydiving group on Tuesday, leave your skydiving group on Wednesday, join your company on Friday, etc.).
I think computers probably can make good guesses at who is in which social group by looking at
your email history: who did you correspond with and who did your correspondent correspond with? (I do still want to be able to correct the email program’s choices.)
NB: IBM and Microsoft have both done some research on merging social networks with email. I don’t think they are quite to the “group by social network” feature yet, but they are getting close.
Even if the email program can’t guess at your social networks, you can still do the grouping by social network by hand. These two features make it much easier to do so:
- A filter condition that will check if someone is in a certain address book. This allows filters along the lines of “If the sender is in my ‘Friends’ address book, change the category to ‘Friends’”. (This is much easier than generating a different rule for each friend!)
Thunderbird 1.5, Eudora 6, and Outlook 2000 all have the filter condition “is in address book X”. Thunderbird 1.5 doesn’t seem to have a way to filter for “is in any of my address books”.
- One-click/one-keystroke addition of the sender of a message to a particular address book. For example, when I get a message from my cousin for the first time, I should be able to easily add her to my “Family” address book, and from then on, her messages should show up with the “Family” category.
NB: Gmail has a click->dropdown-select to add to the address book. Thunderbird 1.5 takes click->dropdown-select->move-mouse-a-long-way->press-OK to put the person into the default address book.
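As a rough sketch of how those two features could combine with sorting in place, here is some Python. The `Message` class, the address-book dictionary, and the category priorities are all hypothetical stand-ins for illustration, not any real email client's API.

```python
from dataclasses import dataclass

# Hypothetical address books: book name -> set of lower-cased sender addresses.
ADDRESS_BOOKS = {
    "Family": {"cousin@example.com"},
    "Work": {"boss@example.com", "coworker@example.com"},
}

# Hypothetical display priority for the inbox (lower numbers sort first).
CATEGORY_ORDER = {"Family": 0, "Work": 1, "Uncategorized": 2}


@dataclass
class Message:
    sender: str
    subject: str
    received: int  # e.g. a Unix timestamp
    category: str = "Uncategorized"


def categorize(msg, books=ADDRESS_BOOKS):
    """Filter rule: if the sender is in address book X, set the category to X."""
    for book, addresses in books.items():
        if msg.sender.lower() in addresses:
            msg.category = book
            break
    return msg


def add_sender_to_book(msg, book, books=ADDRESS_BOOKS):
    """The one-click action: remember the sender and recategorize the message."""
    books.setdefault(book, set()).add(msg.sender.lower())
    msg.category = book
    return msg


def sorted_inbox(messages):
    """Group in place: sort by category priority, then newest first."""
    unknown = len(CATEGORY_ORDER)
    return sorted(
        messages,
        key=lambda m: (CATEGORY_ORDER.get(m.category, unknown), -m.received),
    )
```

With something like this, one rule per address book replaces one rule per friend, and the inbox itself stays the single place where the to-do messages live.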
- An easy way to mark messages “done”. To be able to see at a glance all the messages which I need to read, reply to, or act upon, I need to be able to get messages out of my sight (to mark them “done”) when I no longer need to read, reply to, or act upon them.
My favorite way is to have a button in the toolbar that transfers finished messages out of the inbox and into a mailbox that has the same name as the message’s category. This should also be a one-keystroke operation.
NB: Google’s Gmail does this with their “Archive” button.
Thunderbird has a bug for a keyboard shortcut for filing a message to a folder, and one for specifying a default folder to file messages into. If these are implemented, it will probably be adequate. Unfortunately, there doesn’t seem to be any action on those two.
- An easy way to hide a message until some time in the future.
Sometimes you know that you can’t deal with a message right now. For example, if Chantelle is the only one who knows what the status of the patent application is, and she won’t be back from her vacation in Bhutan until next Wednesday, you’d like to make Bob’s message about the patent disappear for now, then reappear next Wednesday.
NB: I like to call this the “hide-until” feature instead of the “defer” feature because I think “hide-until” makes it more obvious and explicit that the message is going to come back.
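A minimal sketch of hide-until, assuming a toy in-memory inbox (the `Inbox` class and its method names are made up for illustration): each message carries an optional reappearance time, and the visible list simply skips anything whose time has not arrived yet.

```python
from datetime import datetime


class Inbox:
    """Toy inbox where a message may be hidden until a given moment."""

    def __init__(self):
        self._messages = []  # list of (subject, hide_until or None)

    def add(self, subject, hide_until=None):
        self._messages.append((subject, hide_until))

    def visible(self, now):
        """Subjects that should be shown at time `now`, in arrival order."""
        return [
            subject
            for subject, hide_until in self._messages
            if hide_until is None or hide_until <= now
        ]
```

The hidden message is never moved or deleted, which is the point of the name: it comes back on its own once the date passes.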
- A button in the toolbar for “move to next message” and “move to previous message”. Many programs let you use keyboard shortcuts (frequently arrow keys), but most of the people I’ve observed use the mouse for navigating, not the keyboard.
NB: Eudora for Windows has toolbar buttons for next/previous message. You can set it up with Eudora for Macintosh but it’s a little clunky. Outlook has this for messages open in their own windows but not in the main list-of-messages window.
Eudora has a keyboard shortcut for next/previous message. Outlook has a shortcut, but it’s different depending upon whether the index (list-of-messages) pane or the message pane is active. Thunderbird lets you use the up/down arrows, but in Threaded mode, you have to switch between left/right to view previous-/next-in-thread or up/down for previous/next thread.
- An easy way to visually indicate who the message was addressed to:
- TO me and only me
- TO me and other people
- CC me only
- CC me and other people
- BCC me (me not mentioned)
Ideally, I’d like to be able to set different colors for messages depending upon how they
are addressed.
NB: Outlook lets you color code pretty easily.
Google’s GMail shows different icons based on how you were addressed.
If Thunderbird 2.0 allows you to group based on sender’s address book, then you could use labels to color code. (This does seem like a waste of the labelling capabilities, though.)
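Here is a hedged sketch of how a client might sort a message into one of those five addressing buckets. The function name, the bucket labels, and the colors are hypothetical; I am also assuming that “only me” means no other To or CC recipients at all, and that “CC me only” means I am the sole address on the CC line.

```python
# Hypothetical color scheme; labels and colors are illustrative only.
COLORS = {
    "to-me-only": "red",
    "to-me-and-others": "orange",
    "cc-me-only": "blue",
    "cc-me-and-others": "gray",
    "bcc-me": "purple",
}


def addressing(me, to, cc):
    """Classify how a message was addressed, given its To and CC lists."""
    me = me.lower()
    to = {address.lower() for address in to}
    cc = {address.lower() for address in cc}
    if me in to:
        # Assumption: "only me" means nobody else appears in To or CC.
        return "to-me-only" if (to | cc) == {me} else "to-me-and-others"
    if me in cc:
        # Assumption: "CC me only" means I am the sole CC recipient.
        return "cc-me-only" if cc == {me} else "cc-me-and-others"
    # Not named in To or CC, so I was BCC'd (or reached through a list).
    return "bcc-me"
```

The color for a message would then be `COLORS[addressing(my_address, to_list, cc_list)]`.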
- Auto-suggest. If you are working in anything resembling tech support, you might have lots and lots of canned responses to common questions. Finding the right response might be tricky if you have lots to choose from. It would be nice to have filters able to suggest (with
checkboxes or some such) probable responses, with the option to either send-as-is or edit. For example, a college webmaster could have a filter
“If the word ‘admissions’ is in the subject line, suggest the ‘graduate admissions’ response and the ‘undergraduate admissions’ response.”
I got to use a client with autosuggest (and had just such a filter) when I was webmaster at a major university. It was amazing. Auto-suggest can get you through email about ten times faster.
NB: This is Thunderbird bug 151925; the somewhat less useful but still valuable reply-with-template is bug 21210.
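To make the auto-suggest filter concrete, here is a small sketch. The rule table and the response names are hypothetical examples built from the “admissions” filter above; a real client would attach actual response templates and present them as checkboxes.

```python
import re

# Hypothetical rule table: trigger word in the subject -> canned responses.
SUGGESTIONS = {
    "admissions": ["graduate admissions", "undergraduate admissions"],
    "parking": ["visitor parking"],
}


def suggest_responses(subject, rules=SUGGESTIONS):
    """Return the canned responses whose trigger word appears in the subject."""
    words = set(re.findall(r"[a-z0-9']+", subject.lower()))
    suggestions = []
    for trigger, responses in rules.items():
        if trigger in words:
            suggestions.extend(responses)
    return suggestions
```

The filter only proposes candidates; the person answering still picks one and decides whether to send it as-is or edit it first.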
- A way to concatenate message conversations with the redundant quoted material stripped out. I think the way that Zest does it is very interesting.
NB: Eudora 6 does message concatenation and strips quotes if and only if you use preview mode. Gmail hides quoted material in a thread. Apple’s Mail.app and Thunderbird both pull messages in a thread to be next to each other, but don’t concatenate the messages.
- Automatic whitelisting. I want my email program to be able to recognize people I know: who are in my address book, who I have sent messages to, and who I’ve gotten mail from that I didn’t mark as junk. While these people should not get a “free pass”, since viruses now frequently forge addresses from people I know, I do want my spam filter to be more lenient for people I know.
To make the automatic whitelisting useful, I’d like a filter condition
sender is someone I know
NB: Thunderbird 1.5 has a filter option “is in address book X”, where one of the options is “Collected Addresses”, but Thunderbird doesn’t actually seem to collect addresses for me.
- Filter actions that operate on attachments. I’d like to be able to move all attachments from people I don’t know into a “probable junk” folder.
- Filters that can score. With pass/fail filter conditions, it’s difficult to write good spam filters. Usually, messages with embedded images are spam, but not always. Usually, messages that don’t have me in the To or CC lines are spam, but not always.
I want a filter action that will let me add/subtract points from a spam score, e.g.:
- add 100 points if the sender is someone I know
- subtract 50 points if the subject line contains Viagra
- subtract 80 points if the subject line starts with ADV
- subtract 40 points if the body contains 1-800
- subtract 1000 points if the body contains iframe src=cid:
and so on.
I want a filter condition that will check to see if the spam score is greater than/less than a value, so that I can do things like:
- if score<-100, delete message
- if score<15, assign to z-PossibleSpam category
Note that this is even more powerful if there is filter-by-filter import/ export: people could share good spam rules, they could be posted on Web sites.
(I am leery of having spam rules hardwired into popular email programs; doing so gives the spammers a homogeneous victim population that’s easy to target.)
I did some fiddling with a Visual Basic macro that does scoring (for Microsoft Outlook), and it was pretty deadly. Spambayes, which came along later, is also quite good, but is not particularly good at using information about who you know. SpamAssassin also does scoring and works pretty well.
NB: For Thunderbird, this is bug 151622. Note that if the built-in spam filters work well enough, this won’t be necessary.
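The scoring rules and thresholds above can be sketched in a few lines of Python. The point values and thresholds are the ones from my examples; the message layout (a dict with `sender`, `subject`, and `body` keys) and the function names are assumptions made for illustration.

```python
def spam_score(msg, known_senders):
    """Add or subtract points for each rule that matches the message."""
    subject = msg["subject"]
    body = msg["body"]
    rules = [
        (+100, msg["sender"].lower() in known_senders),
        (-50, "viagra" in subject.lower()),
        (-80, subject.startswith("ADV")),
        (-40, "1-800" in body),
        (-1000, "iframe src=cid:" in body.lower()),
    ]
    return sum(points for points, matched in rules if matched)


def disposition(score):
    """Apply the two threshold conditions, most severe first."""
    if score < -100:
        return "delete"
    if score < 15:
        return "z-PossibleSpam"
    return "inbox"
```

Note that with these thresholds a message from a stranger that trips no rule scores 0 and still lands in z-PossibleSpam; only knowing the sender (or some other positive rule) lifts it into the plain inbox.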
- Easy importing of filters on a filter-by-filter basis. This would let people share the most effective filters.
This sounds simple, but I think it is critically important. If spam filters are centrally distributed in some way (like if Microsoft builds them into Outlook, say) then the spammers will learn how to work around them. If everyone’s email filters are different, it will be much harder for the spammers to figure out how to work around them.
NB: This is Thunderbird bug 151612.
- Connection to a collaborative URL filtering service. (This one is a little tougher, as I haven’t heard of a collaborative URL filter service yet.) At the 2004 spam conference, somebody made a casual comment that 95% of spam has a URL in it. This is not surprising, as the spammers have to have their customers contact them somehow.
While you can’t just penalize all messages that have URLs in them, you could build up a database of URLs seen in spam. This wouldn’t help the first person who saw a particular URL, but it would help the second, third, and thirty-millionth.
Presumably, the spammers would start to use unique URLs for each person, but that’s a bit more expensive. (Expensive is good. If it gets too expensive, the spammers can’t make money any more.) Furthermore, the scoring system could penalize URLs from spammy domains, even if it isn’t an exact match.
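Since I don’t know of a real service like this, here is only a toy, in-memory sketch of the idea: a shared database that counts URLs reported in spam, charges a full penalty for an exact URL match, and a smaller one when only the domain matches. The class name, the penalty sizes, and the simple URL pattern are all assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Deliberately simple URL pattern; real mail would need more careful parsing.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def _host(url):
    """Host name of a URL, or None if it cannot be parsed."""
    try:
        return urlsplit(url).hostname
    except ValueError:
        return None


class SpamUrlDatabase:
    """Toy stand-in for a collaborative database of URLs seen in spam."""

    def __init__(self):
        self.urls = Counter()
        self.domains = Counter()

    def report_spam(self, body):
        """Record every URL (and its domain) found in a reported spam message."""
        for url in URL_RE.findall(body):
            self.urls[url] += 1
            host = _host(url)
            if host:
                self.domains[host] += 1

    def penalty(self, body, exact=50, domain=20):
        """Points to subtract: `exact` per known URL, `domain` per known domain."""
        total = 0
        for url in URL_RE.findall(body):
            if self.urls[url]:
                total += exact
            elif self.domains[_host(url)]:
                total += domain
        return total
```

The domain fallback is what blunts the unique-URL-per-victim trick: a fresh query string no longer matches exactly, but the message still pays the domain penalty.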
I want many other things, but these are the biggies. If you want to hear about all the other things I want in an email program, contact me.
03.19.11
Posted in Email, Random thoughts, Technology trends at 11:47 pm by admin
Today the AP decided to change its style guide to drop the use of a hyphen in “e-mail”. I feel vindicated.
When I was writing my books, lo those many years ago, I bucked the prevailing style guides and left the hyphen out. The hyphen in “e-mail” just looked wrong to me. “Besides”, I said, “there aren’t any other words that use the pattern ‘<letter>-hyphen-<word>’”.
Well, I proved myself wrong shortly after that:
A is the A-list of who’s the “in crowd”,
B is for B-school to make Mamma proud.
C is for C-note (the gangster’s small change),
While D’s for D-day which cut Adolf’s range.
E is for E-mail, an electronic note,
F is for F-word (that daren’t be spoke).
G is for G-string that dancers must wear,
and H’s for H-bomb to fight the Red Scare.
I is for I-beam to make a strong fort,
and J’s for J-school to learn to report.
K is for K-9, the cop that goes woof,
while L’s for L-bracket (to hold on your roof).
M is for M-dash (the one that is long),
with N for N-dash (all over this song).
O is for O-ring of Space Shuttle tears,
Q is the Q-tips you stick in your ears.
R is for R-value home insulations,
S is for S-set used in German nations.
T is for T-shirt that Americans wear,
and U’s for the U-joint of auto repair.
V is for V-neck which looks rather dressy,
X is for X-ray which acts to undress ye.
Y is for none else but Y-chromosome,
and if I knew Z I could maybe go home.
But you probably noticed I slipped past a few
I left out the P and W.
M-dash and N-dash are sort of a cheat,
But say what you will, they do keep the beat.
But if you know how to make this song better,
Send me a rhyme for your favorite letter!
Other people pointed out F-4, K-12, K-car, K-mart, N-word, O-levels, P-Funk, P-Furs, P-channel and n-channel, T-ball, T-square, U-boat, V-day, W-2, X- and Y-chromosome, and Z-buffering.
01.02.11
Posted in Technology trends at 11:54 am by admin
It is common to do retrospectives at the end of the calendar year, but I’m more interested in looking forward. Here’s a prediction: ten years from now, it will be common, ordinary, and routine for people to use their cellphones’ cameras to help them see. I expect that people will use them as magnifying glasses (though probably only to about 10x or 20x zoom), telescopes, and night vision enhancers.
01.05.10
Posted in Technology trends at 1:46 am by admin
Oh what the heck, since everybody else is doing it, here are my predictions for 2020:
- Essentially all cell phones will have built-in video cameras, GPS, and voice controls.
- At least one country will nationalize music in some way, e.g. paying the music companies a per capita fee for their citizens every year. Some countries will strike copyright laws for music. Most just won’t bother enforcing copyright laws for music.
- Improved search + improved geo-location of social media streams will mean that it will be far easier to get information about your micro-neighbourhood. Think Google Trends or Google Flu or Twitter Trends, but for the five mile radius of where you are right now. (And, because of #1, you can get video.)
- Know where newspapers are right now, on the brink of death? That’s where TV will be in 2020 — squeezed between on-demand entertainment and crowd-generated news.
- The five-year cancer survival rate will be 90% for most cancers, and 40% for the most difficult ones (bone, brain, pancreas, and liver). Treatment will, unfortunately, still majorly suck for most patients.
- Mapping will extend to reconstruction of scenes based on user photos (like what Microsoft demonstrated at TED) in a big way. By 2020, 100% of San Francisco’s publicly accessible spaces (yes, including alleys) will be mapped, and about 35% of interior spaces. People at first will be quite upset that the world can “see into” their living room, but they will end up getting used to it.
- Marriage for same-sex couples will be recognized by the U.S. government.
- Know where newspapers were five years ago, sort of moseying down the path of death? That’s where universities will be in 2020. They will face pressure as superb educational content will become a commodity. Third-party organizations will jump into the mix to provide tutoring and certification, leaving non-research universities with little to offer aside from post-teen socialization and sports.
- 30% of the world’s electricity production will be solar in 2020. (It’s going to be one hell of a race between climate change and solar energy production, but I think solar energy will win. All the climate-change deniers will say, “See! Toldja so!”)
- Data format description languages will overthrow XML, meaning that data will get passed around in compact formats instead of in XML. (Yes, the DFDL might be in XML, but the data wouldn’t be.)
Okay, I admit it, #10 might just be wishful thinking.
Update: At the time I wrote this, I had not read up on the Google Nexus One phone, which I now find out has voice commands for just about everything. I guess prediction #1 about voice was under-optimistic!
10.02.09
Posted in Technology trends at 11:23 am by admin
I’m really sorry, but I moved from http://webfoot.com/blog to blog.webfoot.com, and the users are (hopefully only temporarily) lost. I’ll work on it, but it might be a little while.
Okay, I think users are back up. Let me know.
05.20.09
Posted in Hacking, programmer productivity, robobait, Technology trends at 11:44 am by admin
I had a very brief but very interesting talk with Prof. Margaret Burnett. She does research on gender and programming at Oregon State University, but was in town for the International Conference on Software Engineering. She said that many studies have shown that women are, in general, more risk averse than men are. (I’ve also commented on this.) She said that her research found that risk-averse people (most women and some men) are less likely to tinker, to explore, to try out novel features in both tools and languages when programming.
I extrapolate that this means that risk-seeking people (most men and some women) are more likely to have better command of tools, and this ties into something that I’ve been voicing frustration with for some time (there is no instruction on how to use tools in the CS curriculum), but I had never seen it as a gender-bias issue before. I can see how a male universe would think there was no need to explain how to use tools because they figured that the guys would just figure it out on their own. And most guys might, but most of the women and some of the men might not figure out how to use tools on their own.
In particular, there is no instruction on how to use the debugger: not on what features are available, not on when you should use a debugger vs. not, and not on good debugging strategy. (I’ve commented on that here.) Some of using the debugger is art, true, but there are teachable strategies (practically algorithms) for how to use the debugger to achieve specific ends. (For example, I wrote up how to use the debugger to localize the causes of hangs.)
Full of excitement from Prof. Burnett’s revelations, I went to dinner with a bunch of people connected to the research lab I did my MS research in. All men, of course. I related how Prof. Burnett said that women didn’t tinker, and how this obviously implied to me that CS departments should give some instruction on how to use tools. The guys had a different response: “The departments should teach the women how to tinker.”
That was an unsatisfying response to me, but it took me a while to figure out why. It suggests that the risk-averse pool doesn’t know how to tinker, while in my risk-averse model, it is not appropriate to tinker: one shouldn’t goof off fiddling with stuff that has a risk of not being useful when there is work to do!
(It has been emotionally very difficult for me to write this blog post today. I think it is important and worthwhile, but I have a little risk-averse agent in my head screaming, screaming at me that I shouldn’t be wasting my time on this: I should be applying for jobs, looking for an immigration lawyer, doing laundry, or working on improving the performance of my maps code. In other words, writing this post is risky behaviour: it takes time for no immediate payoff, and only a low chance of a future payoff. It might also be controversial enough that it upsets people. Doing laundry, however, is a low-risk behaviour: I am guaranteed that it will make my life fractionally better.)
To change the risk-averse population’s behaviour, you would have to change their entire model of risk-reward. I’m not sure that’s possible, but I also think that you shouldn’t want to change the attitude. You want some people to be risk-seeking, as they are the ones who will get you the big wins. However, they will also get you the big losses. The risk-averse people are the ones who provide stability.
Also note that because there is such asymmetry in task completion time between above-median and below-median, you might expect that a bunch of median programmers are, in the aggregate, more productive than a group at both extremes. (There are limits to how much faster you can get at completing a task, but there are no limits to how much slower you can get.) It might be that risk aversion is a good thing!
There was a study I heard of second-hand (I wish I had a citation — anybody know?) that found that startups with a lot of women (I’m remembering 40%) had much MUCH higher survival rates than ones with lower proportions of women. This makes perfect sense to me; a risk-averse population would rein in the potentially destructive tendencies of a risk-seeking population.
Thus I think it does make sense to provide academic training in how to use tools. This should perhaps be coupled with some propaganda about how it is important to set aside some time in the future to get comfortable with tools. (Perhaps it should be presented as risky to not spend time tinkering with tools!)
UPDATE: There’s an interesting (though all-too-brief!) article that mentions differences in the biochemical responses to risk that men and women produce. It says that men produce adrenaline, which is fun. Women produce acetylcholine, which the article says pretty much makes them want to vomit. That could certainly change one’s reaction to risk.
05.04.09
Posted in programmer productivity, Technology trends at 4:22 pm by admin
Update: it turns out that lots of people have done exactly what I asked for: see Instruction-level Tracing: Framework & Applications and the OCaml debugger. Cooool! (Thanks DanE!)
In my user studies, programmers used the debugger far less than I had expected. Part of that could perhaps be due to poor training in how to use a debugger — it is rare to get good training in how to use a debugger.
However, I think the answer is simpler than that: it is just plain boring and tedious to use a debugger. One guy did solve a thorny problem by stepping through the debugger, but he had to press “step over” or “step into” ninety times.
And when you are stepping, you must pay attention. You can’t let your mind wander, or you will miss the event you are watching for. I can’t be the only person who has done step, step, step, step, step, step, step, boom, “oh crap, where was I in the previous step?”
Omniscient debuggers are one way to make it less tedious. Run the code until it goes boom, then back up. Unfortunately, omniscient debuggers capture so much information that it becomes technically difficult to store/manage it all.
I suggest a compromise: store the last N contexts — enough to examine the state of variables back N levels, and to replay if desired.
I can imagine two different ways of doing this. In the first, the user still has to press step step step; the debugger saves only the state changes between the lines that the user lands on. In other words, if you step over the foo() method, the debugger only notes any state differences between entering and exiting the foo() method, not any state that is local to foo(). If the user steps into foo(), then it logs state changes inside foo().
In the other method, the user executes the program, and the debugger logs ALL the state changes (including in foo(), including calls to HashTable.add(), etc.). This is probably easier on the user, but probably slower to execute and requires more storage.
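As a rough illustration of the second approach combined with the “last N contexts” compromise, here is a Python sketch. It uses Python’s `sys.settrace` hook to snapshot local variables before each line runs and keeps only the most recent `max_contexts` snapshots in a bounded `collections.deque`. The function name `run_with_history` is hypothetical, and this is a toy (it copies every local on every line), not how a production debugger would do it.

```python
import sys
from collections import deque


def run_with_history(func, *args, max_contexts=5):
    """Run func(*args), keeping the last `max_contexts` line-level contexts.

    Each context is (function name, line number, copy of the locals) taken
    just before that line executes. Returns (result, error, history).
    """
    history = deque(maxlen=max_contexts)  # old contexts fall off the front

    def tracer(frame, event, arg):
        if event == "line":
            history.append(
                (frame.f_code.co_name, frame.f_lineno, dict(frame.f_locals))
            )
        return tracer  # keep tracing inside this frame and nested calls

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result, error = func(*args), None
    except Exception as exc:
        result, error = None, exc
    finally:
        sys.settrace(previous)
    return result, error, list(history)
```

After the program goes boom, `history[-1]` is the context of the line that failed, and the earlier entries let you back up a few steps without having pressed “step” yourself.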
You could also do something where you checkpoint the state every M steps. Thus, if you get to the boom-spot and want to know where variable someVariable was set, but it didn’t change in the past N steps, you can
- look at all your old checkpoints
- see which two checkpoints someVariable changed between
- rewind to the earlier of the two checkpoints
- set a watchpoint on someVariable
- run until the watchpoint.
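The first two steps of that list (look at the old checkpoints and find the pair between which the variable changed) can be sketched as below. The checkpoint layout, a list of `(step, variables)` pairs ordered oldest first, and the function name are assumptions; rewinding, setting the watchpoint, and re-running are debugger operations that the sketch leaves out.

```python
_MISSING = object()  # marks "variable did not exist at this checkpoint"


def find_change_window(checkpoints, name):
    """Find the most recent pair of checkpoints between which `name` changed.

    `checkpoints` is a list of (step, variables_dict), oldest first.
    Returns (earlier_step, later_step), or None if the value never changed.
    """
    earlier = reversed(checkpoints[:-1])
    later = reversed(checkpoints[1:])
    # Walk adjacent pairs from newest to oldest.
    for (step_a, vars_a), (step_b, vars_b) in zip(earlier, later):
        if vars_a.get(name, _MISSING) != vars_b.get(name, _MISSING):
            return step_a, step_b
    return None
```

Given the window, the debugger would rewind to `earlier_step`, set the watchpoint, and run forward; at most M steps need to be replayed.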
04.01.09
Posted in Technology trends at 3:13 pm by admin
A while back, I hypothesized that women don’t go into computer science because it is a high-risk field. Today I want to share some anecdotal evidence from my own experience about how risky high-tech is. Since I turned 18, I have worked at many places, as a contractor, summer intern, and regular full-time employee. Of the seven companies where I had a regular job, five ran out of money, one got bought, and one no longer makes the product I worked on. None of the companies I worked for were profitable when I left.
Regular jobs:
Contracts:
- NexGen (2 years). Bought by AMD. (This counts as an eventual success, but it took them eight years, three product cycles, and I-don’t-know-how-many-rounds-of-funding before they released their first product.)
- Cray Research (3 weeks). Bought by SGI, then sold to Tera Computer, which took over the Cray name.
- Data General (2 weeks). Bought by EMC.
- SGI (6 months). Filed for bankruptcy today.
- Apple (6 months). While the company still exists, they no longer make printers, which is the product I worked on.
- Sun Microsystems (6 months). In talks with IBM for IBM to buy Sun. Update: IBM didn’t buy Sun, but Oracle did.
- Triquest (1 month). Gone.
- Chicago Tribune (1 week). Still in business, but I wouldn’t bet on it being in business next year.
Summer jobs:
- EIT. Bought by VeriFone, which was then bought by HP. (This counts as a success!)
- Google. Still in business!
I realize that I’m probably on one long tail of the distribution, while my husband is on the other end. (Jim worked for Adobe for 16 years, and Adobe is still in business!) However, it shows that working in high-tech does have risks.
11.02.08
Posted in review, Technology trends at 8:44 pm by admin
Ken Church and James Hamilton have a blog posting on Diseconomies of Scale which doesn’t seem right to me.
They suggest that it is more expensive to run a data center than to buy a bunch of condos, put 48 servers in each, wall up the servers, rent out the condo, and use the servers as a distributed data farm.
Capital costs
Church and Hamilton figure that the capital cost of a set of condos is about half of the cost of a data center. However, they use a sales price of $100K for the condos, but according to a news release issued by the US National Association of Realtors, the median metro-area condo price in the US was $226,800 in 2007 Q2.
Perhaps you could get cheaper condos by going outside the major metro areas, but you might not be able to get as good connectivity outside the major metro areas. I don’t have a figure for the median condo price in cities of population between 50,000 and 100,000, but that would be an interesting number.
A rack of 48 servers draws 12kW (at 250W/server, their number). For a house wired for 110V, this means that the current draw would be 109A. That is approximately five times what a normal household outlet can carry, so you would need to do significant rewiring. I don’t have an estimate on that cost.
Furthermore, my construction-industry source says that most houses can’t stand that much extra load. He says that older houses generally have 75A fuseboxes; newer houses have 100A to them. He says that it is becoming more common for houses to get 200A fuseboxes, but only for the larger houses.
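The arithmetic behind those numbers is short enough to write out. The server count, per-server wattage, voltage, and panel ratings come from the text above; the variable names are mine, and the check ignores everything else in the condo that draws power.

```python
SERVERS_PER_CONDO = 48
WATTS_PER_SERVER = 250      # Church and Hamilton's figure
VOLTS = 110                 # typical household wiring assumed in the text

rack_watts = SERVERS_PER_CONDO * WATTS_PER_SERVER   # 12,000 W = 12 kW
rack_amps = rack_watts / VOLTS                      # about 109 A

# Panel ratings mentioned by my construction-industry source.
panel_amps = {"older house": 75, "newer house": 100, "larger house": 200}

# True where the panel could carry the rack alone, with no other load.
fits = {kind: rating >= rack_amps for kind, rating in panel_amps.items()}

print(f"{rack_watts / 1000:.0f} kW rack draws {rack_amps:.0f} A at {VOLTS} V")
for kind, ok in fits.items():
    print(f"{kind}: {'enough' if ok else 'not enough'} capacity for the rack")
```

Only the 200A panel clears the rack’s draw, and that is before the tenant turns on a stove or a dryer.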
While it is possible to upgrade, “the cost of [upgrading from a 100A fusebox to a 200A fusebox] could be as high as $2000.” I suspect that an upgrade of this size would also need approval from the condo board (and in Canada, where residential marijuana farms are somewhat common, might draw suspicion from the local police department).
If, as Church and Hamilton suggest, you “wall in” the servers, then you will need to do some planning and construction to build the enclosure for the servers. I can’t imagine that would cost less than $1000, and probably more like $2000.
They also don’t figure in the cost to pay someone (salary+plane+rental car+hotel) to evaluate the properties, bid on them, and buy them. I wouldn’t be surprised if this would add $10K to each property. While there are also costs associated with acquiring property for one large data center, those costs should be lower than for acquiring a thousand condos.
Bottom line: I believe the capital costs would be higher than they expect.
Operating costs
For their comparison, Church and Hamilton put 54,000 servers either in one data center or in 1125 condos. They calculate that the power cost for the data center option is $3.5M/year. By contrast, they figure that the condo power consumption is $10.6M/year, but is offset by $800/month in condo rental income at 80% occupancy, for $8.1M in income. I calculate that 800*1125*12*.8 is $8.6M, and will use that number, giving a net operating cost of $2M/year, or $1.5M/year less than a data center.
Given a model with 1125 condos, this works out to $111 per condo per month. Put another way, if they are off by $111/month, a data center is cheaper.
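Here is that operating-cost arithmetic spelled out, using only the figures quoted above (the variable names are mine). The $111 figure comes from rounding the condo advantage to $1.5M/year first; carrying the unrounded numbers through gives roughly $114 per condo per month, which does not change the argument.

```python
CONDOS = 1125
RENT_PER_MONTH = 800            # dollars of rental income per condo
OCCUPANCY_PERCENT = 80

DATA_CENTER_POWER = 3_500_000   # dollars/year, Church and Hamilton's figure
CONDO_POWER = 10_600_000        # dollars/year, Church and Hamilton's figure

rental_income = RENT_PER_MONTH * CONDOS * 12 * OCCUPANCY_PERCENT // 100
net_condo_cost = CONDO_POWER - rental_income
condo_advantage = DATA_CENTER_POWER - net_condo_cost

# Break-even slack per condo per month, unrounded and as rounded in the post.
slack_unrounded = condo_advantage / (CONDOS * 12)
slack_rounded = 1_500_000 / (CONDOS * 12)

print(f"rental income:   ${rental_income:,}/year")
print(f"net condo cost:  ${net_condo_cost:,}/year")
print(f"condo advantage: ${condo_advantage:,}/year")
print(f"slack per condo: ${slack_unrounded:.0f}/month "
      f"(${slack_rounded:.0f} using the rounded $1.5M)")
```

Either way, the condo scheme’s whole advantage is on the order of a hundred-odd dollars per condo per month.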
Implicit in their model is that nobody will ever have to lay hands on a server rack. If you have to send someone out to someone’s residence to reboot the rack, that will cost you. That will depend upon how far the condo is from the technician’s normal location, how easy it is to find the condo, how agreeable the tenant is to having someone enter their condo, etc. I believe this would be a number significantly larger than zero. You might be able to put something in the lease about how the tenant agrees to reboot the servers on request, but this inconvenience will come at some price. Either it will come in the form of reduced monthly rent or a per-incident payment to the tenant.
I believe that the $800/month rental income ($1000/month rent less $200/month condo dues) is overly optimistic.
The median rent for a metro-area apartment was $1324 in 2007, but that’s for all housing types, not just condos. The median rent nationally in 2007 was $665. If you want the metro income, you need to pay the metro purchase price of $227K; if you want to save money by going outside the metro areas, you won’t get the high rent.
Eyeballing a few cities on Craigslist, it looked to me like the metro rent for two-bedroom condos was about $800/month.
Church and Hamilton also didn’t account for
- management fees (~5-7% of gross rent per month, or $50-$70 on income of $1000)
- property taxes (~2%/year of purchase price, so about $167/month if they find condos for $100K)
- maintenance that isn’t covered by the condo association like paint, new carpeting, and the hole the tenant punched in the wall ($40/month)
- reduction in rent for part of the condo being walled off and/or inconvenience of rebooting the servers (~$50/month)
- Liability insurance. If a short in the servers burns the condo complex down, that’s bad.
- Internet connectivity
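To get a feel for how these add up, here is a rough Python tally of just the items in the list that have a dollar figure attached. It is a sketch, not a full model: the names are mine, property tax is computed directly from the 2% rate and the $100K price, and liability insurance and connectivity are left out because no number is given for them.

```python
# Per-condo monthly costs taken from the list above (USD).
PURCHASE_PRICE = 100_000
PROPERTY_TAX_RATE = 0.02      # per year
MAINTENANCE = 40
RENT_REDUCTION = 50
BREAK_EVEN = 111              # per condo per month, from the operating-cost estimate

property_tax = PURCHASE_PRICE * PROPERTY_TAX_RATE / 12

def monthly_overhead(management_fee):
    """Sum the quantified monthly costs for one condo."""
    return management_fee + property_tax + MAINTENANCE + RENT_REDUCTION

low = monthly_overhead(50)    # management fee at 5% of $1000 rent
high = monthly_overhead(70)   # management fee at 7% of $1000 rent

print(f"unaccounted costs: ${low:.0f} to ${high:.0f} per condo per month")
print(f"break-even margin: ${BREAK_EVEN} per condo per month")
print("condos still cheaper?", high < BREAK_EVEN)
```

Even the low end of that range is well over twice the break-even margin, before insurance or connectivity are counted.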
While they didn’t account for Internet connectivity in either the data center scenario or the condo scenario, this is an area where there seem to be large economies of scale. For example, a T3 (44.736 Mbit/s) seems to cost between $7,500 and $14,000/month, or between $167 and $312 per Mbit/s per month. A T1 (1.536 Mbit/s) seems to cost between $550 and $1,200/month, or between $358 and $781 per Mbit/s per month. The T1 is thus about twice as expensive per unit of bandwidth as the T3. I don’t know how much connectivity 54,000 servers would need, but I expect it would be significant, and that it would cost significantly more when bought in 1125 smaller lots.
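The per-Mbit/s comparison is easy to reproduce. This Python sketch uses only the line rates and price ranges quoted above (the dictionary layout and function name are mine); the results agree with the figures in the text to within a dollar of rounding.

```python
# Line rate in Mbit/s and quoted monthly price range in USD.
LINES = {
    "T3": {"mbits": 44.736, "low": 7_500, "high": 14_000},
    "T1": {"mbits": 1.536, "low": 550, "high": 1_200},
}

def cost_per_mbit(line):
    """Return (low, high) monthly cost per Mbit/s for one line type."""
    spec = LINES[line]
    return spec["low"] / spec["mbits"], spec["high"] / spec["mbits"]

t3_low, t3_high = cost_per_mbit("T3")
t1_low, t1_high = cost_per_mbit("T1")

print(f"T3: ${t3_low:.2f} to ${t3_high:.2f} per Mbit/s per month")
print(f"T1: ${t1_low:.2f} to ${t1_high:.2f} per Mbit/s per month")
print(f"T1/T3 ratio: {t1_low / t3_low:.1f}x (low end), {t1_high / t3_high:.1f}x (high end)")
```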
Non-quantifiable
I can imagine there being some significant additional costs in time and/or money.
- Obstreperous condo boards. If the condo board passes a rule against having server farms in your unit, you’re screwed.
- Obstreperous zoning boards. If the city decides that the server farm is part of a business, they might get unhappy about it being in a building zoned residential.
- Criminal tenants. What’s to stop your tenants from busting into the server closet and stealing all the servers?
Church and Hamilton close their article by saying, “whenever we see a crazy idea even within a factor of two of what we are doing today, something is wrong”. I think they are correct, and that their analysis is overly simplistic and optimistic.
07.29.08
Posted in Hacking, programmer productivity, Technology trends, Uncategorized at 11:01 am by admin
There’s a cool paper on a tool to do semi-automatic debugging: Triage: diagnosing production run failures at the user’s site. While Triage was designed to diagnose bugs at a customer site (where the software developers don’t have access to either the configuration or the data), I think a similar tool would be very valuable even for debugging in-house.
They use a number of different techniques to debug C++ code.
- Checkpoint the code at a number of steps.
- Attempt to reproduce the bug. This tells whether it is deterministic or not.
- Analyze the memory by walking the heap and stack to find possible corruptions.
- Roll back to previous checkpoints and rerun, looking for buffer overflows, dangling pointers, double frees, data races, semantic bugs, etc.
- Fuzz the inputs: intentionally vary the inputs, thread scheduling, memory layouts, signal delivery, and even control flows and memory states to narrow the conditions that trigger the failure, so that it is easy to reproduce.
- Compare the code paths from failing replays and non-failing replays to determine what code was involved in that failure.
- Generate a report. This gives information on the failure and a suggestion of which lines to look at to fix it.
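To make two of those steps concrete (checking whether a failure is deterministic, and comparing failing and non-failing code paths), here is a toy Python analogue. It is emphatically not how Triage works: Triage operates on checkpointed native programs, while this sketch just reruns a small function under `sys.settrace`. The `scale` function and its bug are hypothetical, invented for the example.

```python
import sys

def record_path(func, *args):
    """Run func(*args); return (failed, line offsets executed inside func)."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                      # ignore every other function
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    failed = False
    sys.settrace(tracer)
    try:
        func(*args)
    except Exception:
        failed = True
    finally:
        sys.settrace(None)
    return failed, executed

def is_deterministic(func, args, runs=5):
    """Rerun the same input several times and see whether the outcome changes."""
    return len({record_path(func, *args)[0] for _ in range(runs)}) == 1

def suspicious_lines(func, inputs):
    """Line offsets executed by some failing run but by no passing run."""
    failing, passing = set(), set()
    for args in inputs:
        failed, path = record_path(func, *args)
        (failing if failed else passing).update(path)
    return failing - passing

def scale(value, lo, hi):       # offset 0
    span = hi - lo              # offset 1
    if span == 0:               # offset 2
        span = hi               # offset 3: toy bug, hi may also be 0
    return (value - lo) / span  # offset 4

inputs = [(1, 0, 0), (1, 0, 2), (3, 1, 5)]
print("deterministic:", is_deterministic(scale, (1, 0, 0)))
print("lines to look at (offsets from def):", suspicious_lines(scale, inputs))
```

The offsets are relative to the `def` line, so the report points at the fallback assignment, which is the only line that runs in the failing replay and in none of the passing ones.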
They did a user study and found that, for “real” bugs, programmers took 45% less time to debug when they used Triage than when they didn’t; for “toy” bugs, they took 18% less time. (“…although Triage still helped, the effect was not as large since the toy bugs are very simple and straightforward to diagnose even without Triage.”)
It looks like the subjects were given the Triage bug reports before they started work, so the time that it takes to run Triage wasn’t factored into the debugging time. The time it took Triage to run was significant (up to 64 min for one of the bugs), but presumably the Triage run would be done in the background. I could set up Triage to run while I went to lunch, for example.
This looks cool.