census maps mapplets

Posted in Hacking, Maps at 10:53 pm by ducky

James Macgill prodded me to turn my census maps into a mapplet, and so I finally made a census mapplet.

Most of you are probably wondering what a mapplet is. A mapplet is a Google map that has been encapsulated in a way that makes it easy to combine with other mashups. To see them, go to maps.google.com and select the My Maps tab. You’ll see a list of mapplets next to checkboxes.

I’ve been enjoying playing with combining my demographics maps with other mapplets, like

  • population density + sea level rise
  • various demographics + real estate listings
  • % black + Chcago Transit Authority lines


Open-sourcing code

Posted in Hacking, Maps at 5:49 pm by ducky

I just open-sourced the code for Mapeteria. If any of you are PHP4 gods, I have a few questions


Mapeteria: user-generated thematic maps

Posted in Hacking, Maps at 8:08 pm by ducky

A year ago, while I was in the midst of working on my Census Maps mashup, my Green College colleague Jana came up to me with a question. “I have a table of data about heat pump emissions savings for each province, and I want to make a map that colors each province based on the savings for that province. What program should I use to do that?”

I thought about all the work that I’d done for the Census Maps mashup — learning the Google Maps API, digging up the shape files for census tract boundaries, downloading and learning how to use the shapelib libraries to process the shapefiles, downloading and learning how to use gd, reacquainting myself with C++, reacquainting myself with gdb, debugging, trying to figure out why certain census tracts looked strange, etc, and rendered her an authoritative response: “Use Photoshop”, I said.

I was really dismayed that I had to tell her to use a paint program. Why should she — a geographer — have to learn about vertices and alpha channels and statically loaded libraries? Why wasn’t there some service where she could feed in a spreadsheet file and get back a map?

Well, I finally got tired of waiting for Google to do it, so developed Mapeteria — my own service for users to generate their own thematic maps.

If you give Mapeteria a CSV file (which is one of the formats that any spreadsheet program will be delighted to save as) plus a little more information about how it should be displayed, it will give you back a map. You can either get a KML file (which you can look at in Google Earth) or a Google Maps mashup that shows the map directly in your web browser.

So Jana, here’s your map!

Emissions savings of heat pumps vs. natural gas


Google Maps China

Posted in Maps at 9:02 pm by ducky

One of the valuable services that blogs do is to help publicize things. Well, it always takes me a while to remember/figure out where Google hides their China street maps, so I might as well help the rest of the world remember as well: it is at


Don’t ask me why you have to go there to find the maps, why you can’t get to them via http://maps.google.com. I don’t know.

(You can see street maps of Hong Kong and satellite imagry of everywhere on http://maps.google.com.)


Single Operation Multiple Data

Posted in Hacking, Maps at 5:40 pm by ducky

One of the most venerable types of parallel processing is called SIMD, for Single Instruction Multiple Data. In those types of computers, you would do the exact same thing on many different pieces of data (like add two, multiply by five, etc) at the same time. There are some problems that lend themselves to SIMD processing very well. Unfortunately, there are a huge number of problems that do not lend themselves well to SIMD. It’s rare that you want to process every piece of data exactly the same.

Google has done a really neat thing with their architecture and software tools. They have abstracted things such that it looks to the developer like they have a single operation multiple data machine, where an operation can be something relatively complicated.

For example, to create one of my map tiles, I determine the coordinates of the tile, retrieve information about the geometry, retrieve information about the demographics, and draw the tile. With Google tools, once I have a list of tile coordinates, I could send one group of worker-computers (A group) off to retrieve the geometry information and a second (B group) off to retrieve the demographic information. Another group (the C group) could then draw the tiles. (Each worker in the C group would use data from exactly one A worker and one B worker.)

The A and B tasks are pretty simple, and maybe could be done by an old-style SIMD computer, but C’s job is much too complex to do in a SIMD computer. What steps are performed depends entirely on what is in the data. For a tile out at sea, the C worker doesn’t need to draw anything. For a tile in the heart of Los Angeles, it has to draw lots and lots of little polygons. But at this level of abstraction, I can think of “draw the tile” as one operation.

Under the covers, Google is does a lot of work to make it look like everything is beautifully parallel. In reality, there probably aren’t as many workers as tiles, but the Google tools take care of dispatching jobs to workers until all the jobs are finished. To the developer, it all looks really clean and tidy.

There are way more problems that lend themselves to SOMD than to SIMD, so I think this approach has enormous potential.


Who are the maps for?

Posted in Maps, Random thoughts at 11:28 pm by ducky

As my maps approach something reasonable for public distribution, I’ve been talking to more people about them. People are starting to ask me, “Who do you think will use them? What do you think they will use them for?”

I’m not quite sure how to answer that. I imagine marketing people will be interested, though I have to believe that they already have this information.

Would researchers use it? Maybe for preliminary investigation, but I would hope they’d use ArcGIS for anything they want to publish. While the maps “look right” to me for most places I know about, there are a few places that don’t look right to me. ArcGIS is fundamentally better — they have many many more resources than I do to get things right.

The “value add” for my maps is not “better”, but “cheaper” and “more accessible”. Twelve-year old Katie isn’t going to buy a copy of ArcGIS for her social studies class, but maybe she could use my maps for a report on the racial demographics of Texas. The Southern Poverty Law Center probably isn’t going to buy ArcGIS, but might go create a list of links to prisons to help people understand how African-Americans are hugely overrepresented in U.S. jails. Maybe Frieda and Joe will look at it to figure out what neighborhoods in Chicago they’d like to live in.

But my hunch is that most of the “use” won’t be obviously useful. I have certainly spent an awful lot of time just wandering around in the maps, exploring the demographics of my native country. Was this productive?

My maps aren’t very good for giving me answers, but they have given me lots of questions. Why are there so few rural blacks in Florida, when there are so many just across the border in Georgia? Why are there so few Latinos in East Texas compared to West Texas? Why is the median age so low on so many Native reservations? Why are there so many vacant housing units in northern Michigan and Minnesota?

However, I feel like these are good questions to have. Maybe I can’t articulate why I feel like a richer person for having explored U.S. demographics, but I absolutely do.

And if Katie, and Frieda, and Joe, and the Southern Poverty Law Center also feel enriched, then I will feel like I have succeeded.


More advice to Google about maps

Posted in Maps, Technology trends at 11:06 pm by ducky

Because all the data associated with Google Maps goes through Google, they can keep track of that information. If they wanted to, they could store enough information to tell you what the most map markers within two miles of 1212 W. Springfield, Urbana, Illinois were. Maybe one would be from Joe’s Favorite Bars mashup and maybe one would be from the Museums of the World mashup. Maybe fifty would show buildings on the university of Illinois campus from the official UIUC mashup, and maybe two would be from Josie’s History of Computing mashup.

Google could of course then use that mashup data in their location-sensitive queries, so if I asked for “history computing urbana il”, they would give me Josie’s links instead of returning the Urbana Free Library. (They would need to be careful in how they did this in a way that didn’t tromp on Josie, if they want to stick to their “Don’t be evil” motto.)

This is another argument for why they should recognize a vested interest in making it easy for developers to add their own area-based data. If Google allows people to easily put up information about specific polygons, then Google can search those polygons. Right now, because I had to do my maps as overlays, Google can’t pull any information out of them.

If Google makes polygons and their corresponding data easy to name, identify, and access, they will be able to do very powerful things in the future.

Addendum: I haven’t reverse-engineered the Google Maps javascript — I realized that it’s quite possible that the marker overlays are all done on the client side.  (Desirable and likely, in fact.)  In that case, they wouldn’t have the data.  However, it would be trivial to insert some code to send information about the markers up to the server.  Would that be evil?  I’m not sure.


Disaster maps

Posted in Hacking, Maps, Technology trends at 2:27 pm by ducky

I was in San Jose when the 1989 Loma Prieta earthquake hit, and I remember that nobody knew what was going on for several days. I have an idea for how to disseminate information better in a disaster, leveraging the power of the Internet and the masses.

I envision a set of maps associated with a disaster: ones for the status of phone, water, natural gas, electricity, sewer, current safety risks, etc. For example, where the phones are working just fine, the phone map shows green. Where the phone system is up, but the lines are overloaded, the phone map shows yellow. Where the phones are completely dead, the phone map shows red. Where the electricity is out, the power map shows red.

To make a report, someone with knowledge — let’s call her Betsy — would go to the disaster site, click on a location, and see a very simple pop-up form asking about phone, water, gas, electricity, etc. She would fill in what she knows about that location, and submit. That information would go to several sets of servers (geographically distributed so that they won’t all go out simultaneously), which would stuff the update in their databases. That information would be used to update the maps: a dot would appear at the location Betsy reported.

How does Betsy connect to the Internet, if there’s a disaster?

  1. She can move herself out of the disaster area. (Many disasters are highly localized.) Perhaps she was downtown, where the phones were out, and then rode her bicycle home, to where everything was fine. She could report on both downtown and her home. Or maybe Betsy is a pilot and overflew the affected area.
  2. She could be some place unaffected, but take a message from someone in the disaster area. Sometimes there is intermittent communication available, even in a disaster area. After the earthquake, our phone was up but had a busy signal due to so many people calling out. What you are supposed to do in that situation is to make one phone call to someone out of state, and have them contact everybody else. So I would phone Betsy, give her the information, and have her report the information.
  3. Internet service, because of its very nature, can be very robust. I’ve heard of occasions where people couldn’t use the phones, but could use the Internet.

One obvious concern is about spam or vandalism. I think Wikipedia has shown that with the right tools, community involvement can keep spam and vandalism at a minimum. There would need to be a way for people to question a report and have that reflected in the map. For example, the dot for the report might become more transparent the more people questioned it.

The disaster site could have many more things on it, depending upon the type of disaster: aerial photographs, geology/hydrology maps, information about locations to get help, information about locations to volunteer help, topology maps (useful in floods), etc.

What would be needed to pull this off?

  • At least two servers, preferably at least three, that are geographically separated.
  • A big honkin’ database that can be synchronized between the servers.
  • Presentation servers, which work at displaying the information. There could be a Google Maps version, a Yahoo Maps version, a Microsoft version, etc.
  • A way for the database servers and the presentation servers to talk to each other.
  • Some sort of governance structure. Somebody is going to have to make decisions about what information is appropriate for that disaster. (Hydrology maps might not be useful in a fire.) Somebody is going to have to be in communication with the presentation servers to coordinate presenting the information. Somebody is going to have to make final decisions on vandalism. This governance structure could be somebody like the International Red Cross or something like the Wikimedia Foundation.
  • Buy-in from various institutions to publicize the site in the event of a disaster. There’s no point in the site existing if nobody knows about it, but if Google, Yahoo, MSN, and AOL all put links to the site when a disaster hit, that would be excellent.

I almost did this project for an MS thesis project, but decided against it, so I’m posting the idea here in the hopes that someone could run with it. I don’t foresee having the time myself.

Learning from maps

Posted in Maps, Random thoughts at 10:38 am by ducky

I’ve found some interesting things with my maps.

It is easy to find:

I was also surprised to see


Advice to Google about maps and data

Posted in Maps, Technology trends at 10:40 pm by ducky

I have been working on a Google maps mashup that has been a lot of work. While I might be able to get some benefit from investing more time and energy in this, I kept thinking to myself, “Google could do this so much better themselves if they wanted to. They’ve got the API, they’ve got the bandwidth, they’ve got the computational horsepower.”

Here’s what I’d love to see Google do:

  1. Make area-based mashups easier. Put polygon generation in the API. Let me feed you XML of the polygon vertices, the data values, and what color mapping I want, and draw it for me. (Note that with version 2 of the API, it might work to use SVG for this. I have to look into that.)
  2. Make the polygons first-class objects in a separate layer with identities that can feed back into other forms easily. Let me roll over a tract and get its census ID. Let me click on a polygon and pop up a marker with all the census information for that tract.
  3. Make it easy to combine data from multiple sources. Let me feed you XML of census tract IDs, data values, and color mapping, and tell you that I want to use census tract polygon information (or county polygons, or voting precinct polygons, or …) from some other site, and draw it for me.
  4. Host polygon information on Google. Let me indicate that I want census tract polygons and draw them for me.
  5. Provide information visualization tools. Let me indicate that I want to see population density in one map, percent white in another, median income in a third, and housing vacancy rates in a fourth, and synchronize them all together. (I actually had a view like that working, but it is computationally expensive enough that I worry about making it available.) Let me do color maps in two or three dimensions, e.g. hue and opacity.
  6. Start hosting massive databases. Start with the Census data, then continue on to the Bureau of Labor Statistics, CIA factbook information, USGS maps, state and federal budgets, and voting records. Sure, the information is out there already, but it’s in different formats in different places. Google is one of the few places that has the resources to bring them all together. They could make it easy for me to hook that data easily into information visualization tools.
  7. Get information from other countries. (This is actually tricky: sometimes governments copyright and charge money for their official data.)

Wouldn’t it be cool to be able to show an animation of the price of bread divided by the median income over a map of Europe from ten years before World War II to ten years after?

So how would Google make any money from this? The most obvious way would be to put ads on the sites that display the data.

A friend of mine pointed out that Google could also charge for the data in the same way that they currently charge for videos on Google Video. Google could either charge the visualization producers, who would then need to extract money from their consumers somehow, or they could charge the consumers of the visualizations.

Who would pay for this information? Politicians. Marketers. Disaster management preparedness organizations. Municipal governments. Historians. Economists. The parents of seventh-graders who desperately need to finish their book report. Lots of people.

« Previous Page« Previous entries « Previous Page · Next Page » Next entries »Next Page »