Refugee sponsorship

Posted in Canadian life, Politics at 7:39 pm by ducky

I lead a group which is sponsoring a refugee family.  Enough people have asked me how that works that I am compiling the answers here.

Legal Background

In Canada, there are three ways refugees come into the country:

  • Government sponsored, where the federal government provides all of the financial support for the first year and contracts with organizations called settlement agencies to provide the logistical support and hopefully some emotional support as well.  In BC, the main settlement agencies are MOSAIC and ISSBC.
  • Blended Visa Office Referral, where a charitable organization (frequently a church) brings in a family it doesn’t know.  The organization — called a Sponsorship Agreement Holder or SAH — can do this by itself or it can partner with a group of individuals (the sponsorship group), but the onus of vetting the sponsorship group and the legal liability lies with the SAH.
    • The SAH periodically gets an anonymized list of families approved for resettlement in Canada.  The entries usually give the family size, the ages of the children, sometimes the ages of the parents, their nationality, where they are now, if they have any special needs, and if there is a particular area they would like to go to.  (For example, they might have a cousin in Calgary or might really want to live near the ocean.)  The SAH communicates to the sponsorship group what families are available, and the sponsorship group will indicate if they are interested in sponsoring one of the families on the list.  The SAH will then communicate with Immigration, Refugees and Citizenship Canada (IRCC); IRCC decides who “gets” the family if more than one SAH expressed interest.
    • The SAH is legally responsible for 100% of the logistical and emotional support and slightly more than half of the financial support.
    • If there is a sponsorship group, the sponsorship group is morally responsible for what the SAH is legally responsible for.
    • The government provides 50% of the income support but not the start-up costs — furniture, staples, cleaning supplies, clothes, etc.
  • Privately sponsored, where a group of at least five Canadian citizens and permanent residents (a “Group of Five”) or a SAH enters into a legal agreement to bring in a family of people known to them.  Under this sponsorship type, the group is 100% responsible for financial, logistical, and emotional support for the family for one year.  (I call this the “let’s bring in grandma” sponsorship.)


Emotional and Logistical Support

I have mentioned emotional and logistical support multiple times.  What do I mean by that?

As an example, since we got the news of when they were going to arrive, our team has:

  • arranged temporary housing;
  • gotten them a phone and cell plan;
  • stocked their temporary housing with some food;
  • found a permanent apartment;
  • helped them fill out a massive number of forms (including the childcare tax credit and the medical services plan enrollment form);
  • helped them get Social Insurance Numbers (analogous to the US Social Security Number);
  • helped them open a bank account;
  • gotten them winter coats;
  • escorted the father to a medical appointment;
  • shown them how to use their debit cards to buy transit cards;
  • taken them shopping for essentials (like underwear!);
  • helped them phone their friends back in the camps;
  • driven them to the local branch of their church;
  • done a lot of talking, orienting, and many other details too minor to call out explicitly.

In the next few days, we will:

  • co-sign the lease on their apartment;
  • move donated furniture from at least four different places into the apartment;
  • help them buy a small amount of furniture;
  • help them buy groceries and cleaning supplies;
  • help them register their child for school;
  • help them register for English classes;
  • show them how to use public transit;
  • help them get library cards;
  • help them get to eye and dental exams.

Longer-term, we will check in periodically to make sure they are adjusting well and give help as needed (e.g. to help mediate disputes or help them find trauma counseling if required), and help them find jobs.

Our particular experience

At the height of the publicity about the civil war in Syria, in late 2015, there was a huge outpouring of support for Syrian refugees.  I was not immune, and posted quietly on Facebook that I was thinking of sponsoring a family and immediately got a huge response.  Some people pledged money but couldn’t pledge time (because they lived elsewhere and/or had other obligations); some people pledged both.

I researched what was required and discovered that, because we didn’t know anybody personally, BVOR looked like the way to go.  I looked through the list of SAHs and found that the Canadian Unitarian Council (CUC) was a SAH that I thought would be easy for me to work with, so I started working with the Unitarian Church of Vancouver (UCV)’s Refugee Committee.

I had to prove to the UCV Refugee Committee that we were trustworthy, including routing at least 2/3 of the required funds to UCV before they would advise CUC to accept us.  (We put 100% of the amount, which helped show we were trustworthy.)  We also had to fill out some forms for CUC.

Unfortunately, by the time we got our act together in 2016, the Canadian government had let in as many refugees (45,000) as it had decided it was going to let in that year.  In 2017, the government set the BVOR quota very low, reserving most of the spots for private sponsorships, which were mostly for family members of the Syrians who arrived in 2015 and 2016.

(In mid-2017, I happened to be standing next to a TV in a deli where some MP was getting interviewed.  She said, “The number one question I get asked when I go back to my riding is, ‘Where are my Syrian refugees?’”  So we were not the only team waiting.)

2018 was a new year with new quotas, however.  Furthermore, the rest of the world had reminded Canada that Syria wasn’t the only place where things were bad.  So the BVOR list started getting populated again with families from different places.  There still weren’t a lot, but there were some.

So while in 2016 we planned on sponsoring a Syrian family of four, in January 2018 when we spotted an Eritrean family of three, we requested a match with them.  The government confirmed the match, and we sent in our paperwork on 30 Jan 2018.  On 2 Mar 2018, we got word that they would arrive on 7 Mar 2018.  Wheeee!  It was a bit of a scramble.

“Our” family

I don’t want to say much about the family we are sponsoring because there are privacy/security concerns and because refugees are in a very vulnerable position, not knowing the country, culture, or language.

I think I can disclose that the family had been in a camp in country X for EIGHT years.  (I am not clear on the details yet, but I think they might have been in a camp in country Y for a few years.)  They were not allowed to leave the camp, so their child had NO memory of anything except that camp.  There also were no TV or movies in the camp, so he didn’t even have any visual images of other places.  I can’t even imagine what it was like for him to see grassy fields and forests and snow-covered mountains and airplanes and stoplights and microwave ovens.

Mom and Dad don’t produce much English, but they can understand some English.  My husband has run errands with them with no translator, and by speaking slowly, directly, and simply, he can communicate.

Green Hills Welcoming Committee

Our team needed to have a name so that UCV could keep track of it as an entity, and we chose “Green Hills Welcoming Committee”.

Our time-donating team originally had six people on it.  One dropped out because of health issues; one dropped out due to logistical issues.  One husband has become more involved, and I picked up two team members from the UCV Refugee committee (including a former Eritrean refugee, who has contributed an enormous amount).

I have to say, we have an awesome team.  We have worked very well together, encouraged each other, trusted each other, and come through for each other.  We also have spread the load out so that no one person is overwhelmed.

  • Person A has an infant, so is limited in how much she can do hands-on.  She’s our researcher.  She figured out which forms we needed and filled out as much as possible before the family got here (and documented everything she found).  She’s made calls to figure out what we need to do to get the child enrolled in school and the parents enrolled in language classes.
  • Person B and Person C are hosting them in their house.  They have been taking care of hospitality things: feeding them, making them feel welcome, entertaining the child, etc.
  • Person D, the former refugee, has been doing the translating and introducing them to the local community.  (For example, he went to church with them.)   He’s also been doing the lion’s share of ferrying them from one place to another and helped a lot in the housing search.
  • Person E has been doing the bulk of the housing search, with significant help from Person D.
  • Person F has been doing a lot of the helping and coaching for things involving bureaucracy.  Person A got the forms ready and Person D can translate, but Person F is the one who has done the follow-through and gotten the forms signed and in the mail, and negotiated with the bureaucrats.   He’s also taken the family on errands when Person D wasn’t available.
  • Person G — the treasurer of the UCV refugee committee — has been the advisor.  She’s always been there to give advice on how to handle things or explain how something has to be done.

The timing is also really really fortunate: only one team member has a day job.  One has a night job, one is retired, one is on maternity leave, and three are between jobs/contracts.  (Myself, I got laid off on 15 Feb.)

So far, so good.


Why AI scares the s*** out of me

Posted in Hacking, Technology trends at 12:11 pm by ducky

There is a trope in our culture that sentient robots will rebel someday and try to kill us all.  I used to think that fear was very far-fetched.  Now I think it is quaint.

I mean, we already live in a world of flying robots killing people.  I don’t worry about how powerful the machines are; I worry about who the machines give power to.

We don’t need to worry now about malevolent sentient AI killing us.  We’re going to first need to worry about malevolent sentient humans using weaponized AI to kill us.  After we survive the malevolent humans wielding AI, then maybe we can worry about malevolent sentient AIs.

Partly what worries me is the amazing speed of advances in AI.  There are incredible advances in natural language processing, as I mentioned in Mind Blown By Machine Translation.  Boston Dynamics and others are making big advances in robotics; Tracking Point has developed AI-guided semi-automatic rifles; the US military is looking into swarming, self-guided drones.  I am certain that autonomous drones are going to get very, very good at killing people in a very short amount of time.  There has already been at least one drone swarm attack.

At the same time as humans will become less needed in the military, they will become less needed in commerce.  If self-driving trucks can deliver packages which were robotically loaded at the warehouse, then UPS won’t need as many truck drivers and Amazon won’t need as many warehouse workers.  If an AI can spot cancer as well as dermatologists can, then we won’t need as many dermatologists.   If an AI can estimate insurance losses as well as humans, we won’t need as many insurance claims assessors.

There’s an immediate, obvious concern about what to do with a whole bunch of people once they don’t have jobs.  A number of people and organizations have been promoting basic income as an idea whose time has come, and there are a number of pilots, e.g. in Finland.  Note, however, that people who don’t have an income don’t have much power, so getting a basic income law passed after most people are out of a job might be difficult.  Traditionally, when there was gross inequality, the mob gathered pitchforks.  This has worked in part because the military is generally uncomfortable firing on civilians.

What happens when it is easy for robots to quickly kill anybody carrying a pitchfork?  Think about that for a second.

It gets worse.

CGP Grey has a video called Rules for Rulers, which I recommend watching.  Basically, rulers need to keep the people below them happy, which generally means “giving them stuff”.  They, in turn, need to keep the people below them happy.  If you don’t give enough stuff to the people below you, you are in danger of getting forcibly removed from your position.

If your country gets its wealth from its people, then you have to keep the masses happy, or the country isn’t able to generate enough wealth to keep everybody happy.  However, if you only need a few people to generate the wealth of the country (e.g. diamond miners), then the masses are largely superfluous.  This is, Grey says, why resource-rich countries (like so many in Africa) are really awful places to live, and why the developed world is really a very nice place to live.

Okay, now let’s do a thought experiment.  If we get to a point where robots can do everything humans do, and the elites control the robots, then what do we need masses for?  What incentive do the 1% have for keeping the other 99% alive?  Do you really think that the 1%, who now own more than 50% of global wealth, are going to be moved to fund basic income out of the goodness of their hearts?  History does not suggest that they would be inclined to do so.  Mitt Romney complaining about 47% of Americans paying no income tax is an example of how some elites consider the masses to be freeloaders.  Romney expressed this opinion at a time when 49% of Americans were non-earners, but 94% of people below the poverty line were elderly, children, students, disabled, or family caretakers; what happens when a lot of able-bodied people are non-earners?  I guess the masses will just have to eat cake.

I don’t know if the elites would go so far as to willfully kill the masses, but  I can certainly imagine the elites neglecting the masses.  (With climate change, I can even imagine the elites thinking it would be a good thing if millions of people died off.  It would certainly reduce carbon consumption!)  Even if they aren’t malicious, they might look at the problem, say “there is nothing I can do”, wall themselves off, and pull up the drawbridge.

I am imagining that in 20 years, there just might be some really deadly pandemic with a very very expensive treatment.  And the elites might not prioritize finding an inexpensive cure for people outside of their social circle.  Oh, don’t think that could happen?  Check out the history of HIV treatment.


P.S. — I am a little nervous about posting this.  If some AI in the future tries to figure out who the troublemakers are (so that its bosses can exterminate all the troublemakers), this post maybe will mark me for execution.  🙁


Cajun Vegan Red Beans and Rice Recipe

Posted in Random thoughts at 9:39 pm by ducky

I had a Significant Other many years ago who was from Louisiana, and taught me to love Red Beans and Rice.  Later, I became a vegetarian.  That was mostly okay, but I missed Red Beans and Rice.  I eventually got tired of missing it and worked out a vegetarian version, and I was really happy with how it turned out.  Here’s my recipe:

Soak 1 lb dried red beans overnight.

After they are well and truly soaked, drain off the water and put them in a slow cooker.  Cover them with water or broth.  (I really like Better Than Bouillon goo; I use about 2 big teaspoons for one batch of this recipe.)

Chop two Tofurky Andouille sausages (they come in 4-packs) into thin disks and brown them lightly.

Then saute with the sausage:

  • 4 diced celery stalks
  • 1 diced onion
  • 1/2 to 1 green pepper
  • 6 cloves garlic

After you’ve sauteed all the stuff, add it to the slow cooker.

Also toss into the slow cooker:

  • 1t salt
  • 2t white pepper
  • 1 bay leaf
  • 1 waaay heaping teaspoon of smoked paprika
  • 3/4 t of cayenne pepper
  • about 2cm of jalapeno
  • 20 turns of a black pepper mill (I think this works out to about 2t)

Everybody’s slow cooker is going to be different, but I think mine takes about six hours on high.   It’s done when the beans are mushy.  For authenticity, at some point when the beans start to get mushy, smash 1/4 of them against the side of the slow cooker.  This makes the stew thicker and mixes more of the bean flavour into the liquid.

Serve over rice.


“Red beans” are a specific type of bean.  Beans which happen to be red, like kidney beans, are not the same.  If you can’t find dried red beans, you can probably find canned red beans, but they are more expensive.  (You don’t have to soak them overnight, however.)

The Tofurky Andouille sausage is really important for getting the taste right.  Other kinds of vegetarian sausage won’t give the right taste.  Tofurky Andouille sausage is slightly hard to find, but our Whole Foods carries it.

The beans freeze well, but the rice does not.


Mind Blown By Machine Translation

Posted in Technology trends at 12:10 am by ducky

I have been studying machine learning lately, and have come across three recent research findings in machine translation which have each blown my mind:

  1. Computers can learn the meanings of words.
  2. Computers can make pretty good bilingual dictionaries given only a large monolingual set of words (also known as “a corpus”) in each of the two languages.
  3. Computers can make sort-of good sentence-level translations given a bilingual dictionary made by #2.

Learning the meanings of words

Imagine that you could create a high-dimensional coordinate space representing different aspects of a word.  For example, imagine that you have one axis which represents “maleness”, one axis which represents “authority”, and one axis which represents “tradition”.  If the maximum value is 1 and the minimum 0, then the word “king” would thus have coordinates of (1, 1, 1), while the word “queen” would have coordinates (0, 1, 1), “duke” would maybe be (1, .7, 1), and “president” would maybe be (1, 1, .6).  (If Hillary Clinton had been elected U.S. president, then maybe the maleness score would drop to something around .8.)

You can see that, in this coordinate space, to go from “woman” to “man”, or “duchess” to “duke”, you’d need to increase the “maleness” value from 0 to 1.

It turns out that it is relatively easy now to get computers to create coordinate spaces which have hundreds of axes and work in exactly that way.  It isn’t always clear what the computer-generated axes represent; they aren’t usually as clear as “maleness”.  However, the relative positioning still works: if you need to add .2 to the coordinate at axis 398, .7 to the one at axis 224, and .6 to the one at axis 401 in order to go from “queen” to “king”, if you add that same offset (aka vector) — .2 at axis 398, .7 at axis 224, and .6 at axis 401 — to the coordinates for “woman”, then the closest word to those coordinates will probably be “man”.   Similarly, the offset which takes you from “Italy” to “Rome” will also take you from “France” to “Paris”, and the offset which takes you from “Japan” to “sushi” also takes you from “Germany” to “bratwurst”!
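This offset arithmetic is easy to play with.  Here is a tiny sketch in Python using the same made-up three-axis space as above; all of the coordinates are invented for illustration, and real embeddings have hundreds of learned axes:

```python
# Toy word embedding on three invented axes: (maleness, authority, tradition).
import math

embedding = {
    "king":  (1.0, 1.0, 1.0),
    "queen": (0.0, 1.0, 1.0),
    "duke":  (1.0, 0.7, 1.0),
    "man":   (1.0, 0.1, 0.2),
    "woman": (0.0, 0.1, 0.2),
}

def offset(a, b):
    """Vector that takes you from word a to word b."""
    va, vb = embedding[a], embedding[b]
    return tuple(y - x for x, y in zip(va, vb))

def apply(word, off):
    """Add an offset to a word's coordinates."""
    return tuple(x + d for x, d in zip(embedding[word], off))

def nearest(point):
    """Closest word in the embedding to an arbitrary point."""
    return min(embedding, key=lambda w: math.dist(embedding[w], point))

# queen -> king is "add 1 to the maleness axis"; the same offset
# applied to "woman" lands on "man".
male_offset = offset("queen", "king")
print(nearest(apply("woman", male_offset)))  # -> man
```

The same trick runs in the other direction: the woman-to-man offset applied to “queen” lands nearest to “king”.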

A function which maps words to coordinates is called, in machine learning jargon, “a word embedding”.  Because machine learning depends on randomness, different programs (and even different runs of the same program) will come up with different word embeddings.  However, when done right, all word embeddings have this property that the offsets of related words can be used to find other, similarly related word pairs.

IMHO, it is pretty amazing that computers can learn to encode some information about fundamental properties of the world as humans interpret it.  I remember, many years ago, my psycholinguistics professor telling us that there was no way to define the meaning of the word “meaning”.  I now think that there is a way to define the meaning of a word: it’s the coordinate address in an embedding space.

As I mentioned before, it’s surprisingly easy to make good word embeddings.  It does take a lot of computation time and large corpuses, but it’s algorithmically simple:

  1. Take a sentence fragment of a fixed length (say 11) and have that be your “good” sentence.
  2. Replace the middle word with some other random word, and that’s your “bad” sentence.
  3. Make a model which has a word embedding leading into a discriminator.
  4. Train your model to learn to tell “good” sentences from “bad” sentences.
  5. Throw away the discriminator, and keep the word embedding.

In training, the computer program iteratively changes the word embedding to make it easier for the discriminator to tell if the sentence is “good” or “bad”.  If the discriminator learns that “blue shirt” appears in good sentences, that “red shirt” appears in good sentences, but “sleepy shirt” does not appear in good sentences, then the program will move “blue” and “red” closer together and farther from “sleepy” in the word embedding.
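Steps 1 and 2 above, generating the “good” and “bad” training sentences, might look something like the sketch below.  The corpus and window size are made up (the post uses a window of 11), and steps 3 to 5 would be done with a machine learning library rather than by hand:

```python
# Sketch of steps 1-2: turn raw text into "good" vs "bad" training examples.
import random

random.seed(0)
WINDOW = 5  # fragment length; kept small for the demo

corpus = ("the quick brown fox jumps over the lazy dog "
          "a blue shirt and a red shirt").split()
vocab = sorted(set(corpus))

def training_pairs(words):
    """Yield (fragment, label) pairs: each real fragment labeled 1,
    plus a copy with its middle word replaced at random, labeled 0."""
    for i in range(len(words) - WINDOW + 1):
        good = words[i:i + WINDOW]
        bad = list(good)
        bad[WINDOW // 2] = random.choice(vocab)  # corrupt the middle word
        yield good, 1
        yield bad, 0

pairs = list(training_pairs(corpus))
# Every real fragment is now paired with one corrupted fragment,
# ready to be fed to an embedding + discriminator model.
```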

Christopher Olah has a good blog post which is more technical (but which also covers some additional topics).

Computers can make bilingual dictionaries with monolingual corpuses

A recent paper showed how to make pretty decent bilingual dictionaries given only monolingual corpuses.  For example, if you have a bunch of English-only text and a bunch of French-only text, you can make a pretty good English<->French dictionary.  How is this possible?!?

It is possible because:

  1. words in different languages with the same meaning will land at (about) the same spot in the embedding space, and
  2. the “shape” of the cloud of words in each language is pretty much the same.

These blew my mind.  I also had the immediate thought that “Chomsky was right!  Humans do have innate rules built into their brains!”  Upon further reflection, though, #1 and #2 make sense, and maybe don’t imply that Chomsky was right.

For #1, if the axes of the word embedding coordinate space encode meaning, then it would make sense that words in different languages would land at the same spot.  “King” should score high on male/authority/tradition in Japanese just as much as in English.  (Yes, there could be some cultural differences: Japanese makes a distinction between green and blue in a different place in the colour spectrum than English does.  But mostly it should work.)

For #2, language represents what is important, and because we share physiology, what is important to us is going to be very similar.  Humans care a lot about gender of animals (especially human animals), so I’d expect there to be a lot of words in the sector of the coordinate space having to do with gender and animals.  However, I don’t think humans really care about the colour or anger of intellectual pursuits, so the sector where you’d look for colourless green ideas sleeping furiously ought to be empty in pretty much every language.

The way the researchers found to map one word embedding to another (i.e. how they mapped the embedding function one program found for French to one they found for English) was they made the computer fight with itself.  One piece acted like a detective and tried to tell which language it was (e.g. was the word French or English?) based on the coordinates, and one piece tried to disguise which language it was by changing the coordinates (in a way which preserved the relational integrity).  If the detective piece saw a high value in a word’s coordinate which didn’t have high values in English, then it would know it was French.  The disguiser then learned to change the French coordinate space so that it would look more like the English coordinate space.

They then refined their results with the Procrustes algorithm to warp the shape of the embedding spaces to match.  They picked some high-occurrence words as representative points (since high-occurrence words like “person” and “hand” are more likely to have the same meaning in different languages), and used those words and their translations to figure out how to bend/fold/spindle/mutilate the coordinate spaces until they matched.
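As a cartoon of that Procrustes step, here is a two-dimensional sketch.  All of the words and coordinates are invented; real systems work in hundreds of dimensions and use an SVD-based solver rather than this 2-D closed form:

```python
# 2-D cartoon of the Procrustes step: find the rotation that best maps
# the "French" anchor words onto their "English" translations.
import math

english = {"person": (1.0, 0.0), "hand": (0.0, 1.0), "water": (-1.0, 0.0)}
french = {"personne": "person", "main": "hand", "eau": "water"}
# Pretend the French embedding is the English one rotated by 90 degrees:
french_vecs = {w: (-english[e][1], english[e][0]) for w, e in french.items()}

def best_rotation(pairs):
    """Optimal rotation angle mapping source points onto target points
    (closed form for the 2-D orthogonal Procrustes problem)."""
    num = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in pairs)
    den = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in pairs)
    return math.atan2(num, den)

def rotate(p, t):
    x, y = p
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

anchor_pairs = [(french_vecs[f], english[e]) for f, e in french.items()]
theta = best_rotation(anchor_pairs)
# After rotating by theta, "eau" lands (almost exactly) on "water".
```

Once the spaces line up, translating a word is just “rotate its coordinates and take the nearest neighbour on the other side”.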

Computers can translate sentences given a dictionary

The same research group which showed how to make dictionaries (above) extended that work to machine translation with only monolingual corpuses.  (In other words, there were no pre-existing hints of any kind as to which words or sentences in one language corresponded to words or sentences in the other language.)  They did this by training two different models.  For the first model, they took a good sentence in language A and messed it up, and trained the model to fix it.  Then once they had that, they fed a sentence in language B into a B->A dictionary (which they had created as described above) to get a crappy translation, then fed it into the fixer-upper model.  The fixed-up translation wasn’t bad.  It wasn’t great, especially compared to a human translator, but it was pretty freakin’ amazing given that there was never any sort of bilingual resource.
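The pipeline can be sketched with stand-ins for the learned pieces.  The dictionary entries here are invented, and fix_up() is just a placeholder for the trained fixer-upper model:

```python
# Sketch of the translate-then-fix pipeline with stand-in pieces.
dictionary = {"le": "the", "chat": "cat", "dort": "sleeps"}  # invented B->A entries

def rough_translate(sentence):
    """Word-by-word dictionary lookup: the 'crappy translation'.
    Unknown words are passed through unchanged."""
    return [dictionary.get(word, word) for word in sentence.split()]

def fix_up(words):
    """Placeholder for the model trained to repair corrupted
    language-A sentences; here it just joins the words."""
    return " ".join(words)

print(fix_up(rough_translate("le chat dort")))  # -> the cat sleeps
```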


When I read The Hitchhiker’s Guide to the Galaxy, I scoffed at the Babel fish.  It seemed completely outlandish to me.  Now, it seems totally within the realm of possibility.  Wow.



The Perfect To-Do List Manager

Posted in Hacking, Random thoughts, Technology trends at 5:11 pm by ducky

There are a huge number of to-do list managers (TDLMs) out in the world now, but none of them do what I want.  Apparently, it’s not just me: I just read an article which said that when students were asked what mobile apps they really wanted, 20% said they wanted “a comprehensive to-do + calendaring + life management app that helps them better organize their lives”.  TWENTY percent!

Is it really that hard?

I have strong opinions about what I want, and I don’t think it’s that hard, so I will describe my desires here in the hopes that somebody will make me the perfect TDLM.  (You can see some of the features in a TDLM which I wrote for a class project.  Sometimes I think about writing my perfect TDLM, but I’m busy with other things.  I want it to exist, not to write it myself.)

The most important thing is that the TDLM should make you feel better about your tasks.  The biggest problem with TDLMs right now is that they make you feel guilty: the list grows and grows and grows because there are an infinite number of things it would be nice to do and only a finite amount of time.  This means that every time you open the TDLM, you feel overwhelmed by guilt at all the things you haven’t done yet.

1. Hide stuff you can’t work on right now because of blocking tasks.  Don’t show me “paint the bedroom” if I haven’t finished the task of “choose colour for bedroom”.  (This means you need UI for showing what tasks depend upon which other tasks, and I warn you that’s not as easy as you think.)

2. Hide stuff you won’t work on right now because you are busy with other things.  Don’t show me “paint the bedroom” if I have decided that I’m not going to start that project until I finish doing my taxes.  “Do taxes” is not truly a blocking task — it’s not like I am going to use the tax forms to apply the paint — but hide it anyway. (This means you need UI for showing what the sequencing of tasks is.)

3. Hide stuff you won’t work on right now because it is the wrong time of year.  Maybe you want a task of “buy new winter jacket”, but you want to wait until the end of winter to take advantage of the sales on coats.  You should be able to tell your TDLM to hide that task until March.  (Or until May, if you live in Manitoba.)  Or “rotate tires” is something which only needs to happen every six months.

Note that this implies connecting the TDLM to a calendar, at least minimally.
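Hiding rules 1 to 3 could be sketched with something like the following; the field names (blocked_by, sequenced_after, not_before) are invented for illustration, not from any existing TDLM:

```python
# Minimal sketch of hiding rules 1-3.
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class Task:
    title: str
    done: bool = False
    blocked_by: list = field(default_factory=list)       # rule 1: true blockers
    sequenced_after: list = field(default_factory=list)  # rule 2: chosen ordering
    not_before: Optional[date] = None                    # rule 3: wrong time of year

def visible(task, today):
    """Should this task show up in the list today?"""
    if task.done:
        return False
    if any(not t.done for t in task.blocked_by + task.sequenced_after):
        return False
    if task.not_before is not None and today < task.not_before:
        return False
    return True

choose = Task("choose colour for bedroom")
paint = Task("paint the bedroom", blocked_by=[choose])
coat = Task("buy new winter jacket", not_before=date(2018, 3, 1))

today = date(2018, 2, 1)
assert not visible(paint, today)  # hidden behind "choose colour"
assert not visible(coat, today)   # hidden until March
choose.done = True
assert visible(paint, today)      # now it shows up
```

Note that rules 1 and 2 use the same mechanism here; the difference is only in what the UI calls the relationship.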

4. Allow recurring to-do list items.  I don’t want to have to make a new task for our wedding anniversary every year.  I want to set it once and forget it.  Usually people put things on their calendars for repeating events, but “Wedding Anniversary” goes on August 22nd and is not a task.  “Plan something for anniversary” is a recurring task but should be hidden until about August 1st.

The TDLM should distinguish between recurring tasks which expire and those which do not.  Non-expiring tasks are ones like “pay phone bill”.  Even if you forget to pay it by the due date, you still need to deal with it.  On the other hand, “run 2km” is an expiring item: if you couldn’t do your 2km run on Monday, it probably does not mean that you should run 4km on Wednesday.
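The expiring/non-expiring distinction might look something like this sketch (the function name, dates, and periods are invented for illustration):

```python
# Sketch of expiring vs. non-expiring recurring tasks.
from datetime import date, timedelta

def owed_occurrences(due, period, today, expires):
    """Instances of a recurring task still owed as of `today`."""
    owed = []
    while due <= today:
        owed.append(due)
        due += period
    if expires:
        return owed[-1:]  # only the most recent run still matters
    return owed           # every missed bill still needs paying

today = date(2018, 3, 10)
# Non-expiring: a bill due every 30 days; missed ones pile up.
bills = owed_occurrences(date(2018, 1, 1), timedelta(days=30), today, expires=False)
# Expiring: a run every 2 days; missed runs just vanish.
runs = owed_occurrences(date(2018, 3, 5), timedelta(days=2), today, expires=True)
```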

5. Make me feel super-good about finishing tasks.  A lot of TDLMs handle checking something as done by making it disappear.   This is the worst.  I’ve spent hours, weeks, or months looking at that dang task, and when I finally finish it, I want to savour the moment!  I want my TDLM to cheer, to have fireworks explode on the screen, and maybe even have the text of the task writhe in agony as it dies an ugly painful death.  I want there to be a display case in my TDLM of things that I have finished recently that I can look at with pride.  “Yeah”, I can think, “I am ***DONE*** with painting the bedroom!”  Maybe I don’t need full fireworks for a simple, one-step task which took 15 minutes, but if it was a 2000-step task which took 5 years (like getting a PhD or something like that), I want the TDLM to cheer for a full five minutes.

6. Let me see what I did. Sometimes, I feel like I didn’t get anything done, and it is reassuring to look at a list of the things that I actually did accomplish.  It might be nice to show it in a horizontal latest-first timeline form:

  • 4:47 pm Laundry
  • 3:13 pm Groceries
  • 12:11 pm Replace laptop display
  • (etc)

I would also like to be able to modify the task completion times.  “Oh, I actually finished replacing the laptop last night, I just didn’t feel like telling the TDLM because it was late and I was tired.”

7. Let me see what I am going to do.  People usually use calendars for this, but as I mentioned before, calendars are kind of the wrong tool.  I don’t really want to see “buy birthday present for Mom” in the same place as “Meet with boss, 10:30 AM”.  Plus, a strict time base makes zero sense if the dependencies are other tasks.

8. Let me import/modify/export task hierarchies.  Suppose you want to have a wedding.  (Mazel Tov!)  There are predictable things which you need to do: book a space for the wedding, a space for the reception, book an officiant, book a caterer, choose a menu, etc.   If, say, you want a wedding sort of like your friend Joanne’s, it would be nice if Joanne could email you the hierarchy of tasks that she did for her wedding, and you could just drop it in to your TDLM.  (Perhaps that way, you wouldn’t forget to rent a dance floor.)

But maybe you have some Greek heritage and Joanne does not, so you need to add “get a stefana” to your list.  You should be able to do that — and then export your new wedding task list for your brother when he gets married.  Even better, you ought to be able to upload it to a site which hosts lots of packaged tasks, maybe even a whole section on weddings (so your brother could pick and choose which wedding task list he likes best).

Needless to say, the exported task hierarchy should be in a form which lends itself well to version control and diffing.  🙂
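For example, one diff-friendly export would be plain indented text, one task per line; this format (and the export function) is just my own illustration:

```python
# Sketch of a diff-friendly export: plain indented text, one task per line.
def export(task, depth=0):
    """Render a task and its subtasks as indented '- title' lines."""
    lines = ["  " * depth + "- " + task["title"]]
    for child in task.get("subtasks", []):
        lines.extend(export(child, depth + 1))
    return lines

wedding = {"title": "Wedding", "subtasks": [
    {"title": "Book venue"},
    {"title": "Book caterer", "subtasks": [{"title": "Choose menu"}]},
]}
print("\n".join(export(wedding)))
```

Because each task is one line, adding “get a stefana” to Joanne’s list shows up as a clean one-line diff.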

9. Let me share my task list with other people.  I would like to be able to share my “home” task list with my husband, so that he could assign me tasks like “buy three kitchen sponges”.  Ideally, I think I’d like there to be three task lists: his, mine, and ours.

My husband and I would probably set things up so that we both have read/write permission on all three, even though there are some things that only one of us can or should do.  I can imagine other couples might not want write permission on each other’s lists, only on the “ours” one.

10. Make it easy to discuss tasks.  This means assigning a simple ID and URL to the task.  If Jim and I are going to share tasks, we are going to discuss them.  It would be nice to be able to say, “Task #45” instead of “that one about the paintbrushes”.  It would also be nice to be able to email a link to him which will take him right to Task #45.

11. (Nice to have) Allow putting a time estimate on the task.  If you know that it takes you about two hours to get to your locker, change clothes, stretch, run 2km, stretch, shower, change clothes, and get back to your workplace, then it might be nice to put in an estimate for the “run 2km” task.

If you can put a time estimate on a task and adjust it later, the TDLM could keep track of estimated vs. actual, and start to help you adjust your estimates.  “For tasks which you estimate are going to be about 3hrs, you spend an average of 4.15 hrs.”

It would also be nice if the TDLM could help you make estimates based on similar tasks which you completed.  When entering an estimate for painting the living room, it would be nice if the TDLM mentioned, off to the side, how long it took you to paint the bathroom and the bedroom.  (It’s even okay if it also tells you how long it took you to paint the landscape or your fingernails; presumably you’d be smart enough to figure out which tasks were relevant.)
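The estimated-vs-actual bookkeeping could be as simple as bucketing past tasks by estimated size and averaging the actual times. A sketch (the names and the one-hour bucketing scheme are my own):

```python
from statistics import mean

def calibration_report(history, bucket_hours=1.0):
    """Group past (estimated, actual) pairs by estimated size and report
    the average actual time per bucket, e.g. 'for tasks you estimate at
    about 3 hrs, you average 4.15 hrs'."""
    buckets = {}
    for estimated, actual in history:
        key = round(estimated / bucket_hours) * bucket_hours
        buckets.setdefault(key, []).append(actual)
    return {est: mean(actuals) for est, actuals in sorted(buckets.items())}

history = [(3.0, 4.0), (3.2, 4.5), (2.9, 4.0), (1.0, 1.1)]
print(calibration_report(history))
```

The similar-task suggestion (“here’s how long the bathroom and bedroom took”) would be a lookup over the same history, filtered by some notion of task similarity.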

12. (Nice to have) Make the TDLM geo-aware.  It would be kind of nice to be able to hide tasks until I was at or near a particular location.  For example, if I am not in a big hurry to paint the bedroom, hide “buy paint” until I am actually at the paint store.

One thing the students in the article I mentioned earlier requested was being told when to leave in order to make it to the next appointment.  “Doctor’s appointment at 3pm” is a calendar event, but “get to doctor’s office” is a task which needs to happen at a time which depends upon how long it takes to get to the doctor’s office from where you are.  That’s another way that geo-awareness could be useful.
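The “hide until I am near the paint store” part could be as simple as a great-circle distance check against each task’s anchor location. A sketch (the data layout and the half-kilometre radius are assumptions of mine):

```python
from math import radians, sin, cos, asin, sqrt

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))

def visible_tasks(tasks, here, radius_km=0.5):
    """Show only tasks anchored within radius_km of 'here';
    tasks with no location are always shown."""
    lat, lon = here
    return [t for t in tasks
            if t.get("location") is None
            or distance_km(lat, lon, *t["location"]) <= radius_km]
```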

13. (Maybe nice to have) Be able to mark urgency.  I am not actually certain how useful this is.  I have had TDLMs which allowed me to mark urgency, and I found that I almost never used it.  I think people will expect it, however.

14. (Nice to have, but difficult) Integrate with my applications.  Tasktop Technologies has a product called Tasktop Dev which keeps track of what you do in the source code editor (and some other applications, e.g. web browser and Microsoft Office) while you are working on a specific task.  (You have to tell it, “now I am working on task #47”, so that it knows to start watching.)  Then there is a record of what you worked on for that task.  That is useful if you need to stop and restart the task (especially over a long period of time), or if you need to go back much later and see what you did.  (“What was the URL of that caterer with the really nice cheesecake?”)

In a work environment, it would be nice to integrate it with other task management systems (AKA “bug trackers”) like Jira or Asana or Bugzilla.

This is what I want.  If it persists in not existing, I might have to do it myself someday.


Unemployment map

Posted in Maps at 8:54 pm by ducky

I have developed some maps which show seasonally adjusted unemployment rates, by month, by county, for the past 23 years, as either a cartogram or as a standard Mercator projection.

One of the stories that my unemployment map tells clearly is just how bad the financial meltdown in 2008 was, and how sudden.

This might surprise you if you saw the video of unemployment by county by LaToya Egwuekwe which went viral.  It is a fine video for what it is, but I think it is slightly misleading.

Her video showed pretty clearly that things started to slide a little bit in late 2008, but the real rise in unemployment hit in 2009, with the worst being in June 2010.


Still from unemployment video by LaToya Egwuekwe

I was quite surprised when I saw this, as it didn’t match what I knew of the situation.

Below is what the national unemployment rate actually looked like over time. (The seasonally adjusted rate is darker; the unadjusted rate is lighter.)


The seasonally adjusted unemployment rate actually peaked in October 2009, not in mid-2010.

Why the difference?  If you look at the fine print of Egwuekwe’s video, it is a map of the 12-month rolling average of unemployment, which is a lagging indicator.  This means that, for example, in October 2008, right after the financial meltdown, the unemployment numbers she used for the map included November 2007 through September 2008, months which were generally pretty good.  Similarly, unemployment later seemed to be higher than it really was because the rolling average included the previous year, which included very high unemployment.

Here is a comparison of the seasonally adjusted unemployment rate and the 12-month rolling average for 2007 through the present:
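You can see the lag with a toy calculation (made-up numbers, not real BLS data): a 12-month trailing average both delays and smears a sudden spike, because each value averages in the preceding eleven months.

```python
def rolling_12(series):
    """12-month trailing average: each value averages that month and the
    preceding 11, which is why it lags sudden changes."""
    return [sum(series[i - 11:i + 1]) / 12 for i in range(11, len(series))]

# Toy series: flat at 5%, a sharp spike to 10% at month 20, then a slow decline.
rates = [5.0] * 20 + [10.0, 9.5, 9.0, 8.5, 8.0, 7.5, 7.0, 6.5, 6.0]
smoothed = rolling_12(rates)

peak_raw = rates.index(max(rates))                # month 20
peak_smooth = 11 + smoothed.index(max(smoothed))  # months later
```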

It is perfectly understandable that Ms. Egwuekwe would use the 12-month rolling average instead of the seasonally-adjusted unemployment.  Why?  Because the Bureau of Labor Statistics does not publish seasonally adjusted unemployment rates at the county level, and it is a royal pain to calculate the seasonally adjusted unemployment for each county individually.  (I should know, because I did calculate seasonally adjusted unemployment for each county individually.)
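For the curious, the crudest possible form of seasonal adjustment is to subtract each calendar month’s average deviation from the overall mean. This sketch is only to illustrate the idea; official adjustments (the Census Bureau’s X-13ARIMA-SEATS software, for instance) are far more sophisticated, and I am not claiming this is the method I used:

```python
def seasonally_adjust(monthly_rates):
    """Crude additive seasonal adjustment: subtract each calendar month's
    average deviation from the overall mean."""
    overall = sum(monthly_rates) / len(monthly_rates)
    seasonal = []
    for month in range(12):
        values = monthly_rates[month::12]  # all Januaries, all Februaries, ...
        seasonal.append(sum(values) / len(values) - overall)
    return [rate - seasonal[i % 12] for i, rate in enumerate(monthly_rates)]
```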

She could have used the unadjusted unemployment rate for a county, but there can be so much seasonal variation that it is hard to see the longer-term trends.  This is particularly true in agricultural communities.  For example, here is a graph which shows the unadjusted unemployment rate for Clark County, IL in light blue, and the seasonally adjusted rate in darker blue.


(For comparison, the national unemployment rate is in light green.)

Thus, if Ms. Egwuekwe had used the raw unadjusted unemployment numbers for her video, the short-term fluctuations would have made it difficult to see the longer-term trends.  It was a reasonable tradeoff to make.

One other complaint I have about maps like hers, and, well, almost all thematic maps, is that they give too much visual weight to rural areas compared to urban areas.  When you see a map like the one at the top of the post, you probably do a sort of visual averaging to figure out what the overall unemployment is like.  However, because a huge number of people live in cities, and because cities are very small compared to the vast stretches of rural areas, what you see in rural areas dominates the map.  A population-based cartogram, where jurisdictions are distorted so that the area of each jurisdiction is proportional to its population, gives a map which is less misleading.

Again, it’s completely understandable that people would normally use maps which show area instead of population.  It’s a royal pain to generate a cartogram.  (I should know, I did generate cartograms.)

Here is a population-based cartogram of the unemployment in June 2008, before the financial crisis hit, when the US seasonally adjusted unemployment rate was 4.6%:


(Green is good; yellow is bad.  Full green is 0% unemployment; full yellow is 17% unemployment.  States are outlined.)

Now here is an image from the worst month after the financial meltdown, October 2009:



The financial crisis hit fast, and it hit hard.

The good news is that it is getting better.  Here is a map of the latest month which I have data for, June 2013:




Note: it is a little difficult to recognize places when they are distorted.  On my unemployment maps web page, you can show city names; you can click on a county to get more information about that county.



City Labels on Maps

Posted in Maps at 6:35 pm by ducky

When I watched people look at my cartograms, I saw that they frequently had trouble figuring out which distorted shapes on the map corresponded to which shapes on the undistorted maps they were familiar with.

Cartogram without labels

Clearly, I needed to give them some reference points.

The first thing that I thought of doing was distorting Open Street Map tiles to match the cartogram’s distortion.  That didn’t work out so well, and I decided that I could get 90% of the benefit just by showing city names.

The next thing I tried was to make tiles with city names on them.  This turned out to be difficult because city names frequently crossed tile boundaries, and the city names were variable widths.

What worked: I made markers with custom icons, where the icon was an image of the city name, and placed the icons at the appropriate location on the cartogram.  This worked well: the city names moved with the background image when dragged, and creating the custom icons was quite lightweight.

Having solved how to show city names, I then needed to figure out when to show city names.  Clearly I should show bigger cities first, but if you zoom in, you want to see smaller cities.  I don’t know how Google does it; they can probably afford to decide, city by city, at which zoom level each city should appear.  But there are an awful lot of cities and an awful lot of zoom levels, and my name’s not Google.  I can’t afford to do that.

I experimented with coming up with a formula to specify what population threshold to use for which zoom level, but I was unsatisfied with that.  I couldn’t find a formula which would show enough cities in lightly-populated areas but not too many in densely populated areas.

The next thing I tried was to figure out which area of the map was showing, and to label only the top six (by population) visible cities. This means that you see only really big cities when zoomed out a lot:

Six biggest cities labelled

But when you zoom in (or move so that one of the labelled cities stops being in the view), more cities appear:

(When I did this, I was surprised to find out how small (in population) some major cities are.  Jacksonville, FL is bigger than Miami.  Indianapolis is bigger than Boston.  El Paso is bigger than Seattle.  Now, partly that’s because I’m labelling cities and not metro areas, but it still surprised me.)

Even showing only six, there were still times when the city names got crowded.  Also, when way zoomed out, big sections of the country didn’t have any labels.  What I finally did was look at the top 23 visible cities, and if there was a larger city nearby (in pixels), I skipped the smaller city.  This seems to work really well:

It sure beat keeping a list of which cities to show at which zoom level!
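The algorithm is roughly this, in Python. (The 23 comes from above; the 60-pixel threshold, the data layout, and the `to_pixel` projection function are stand-ins I made up for the sketch.)

```python
from math import hypot

def pick_labels(visible_cities, to_pixel, min_px=60, top_n=23):
    """Consider the top_n most populous cities in view, largest first;
    skip any city within min_px pixels of an already-labelled larger one."""
    candidates = sorted(visible_cities, key=lambda c: c["pop"], reverse=True)[:top_n]
    labelled, positions = [], []
    for city in candidates:
        x, y = to_pixel(city["lat"], city["lon"])
        if all(hypot(x - px, y - py) >= min_px for px, py in positions):
            labelled.append(city["name"])
            positions.append((x, y))
    return labelled
```

Because the candidates are processed largest-first, a small city near a big one is always the one that gets dropped.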



Cartographic “suicide caucus” map

Posted in Art, Maps at 10:13 am by ducky

Ryan Lizza posted an article and map in The New Yorker which showed the locations of the US Congressional Districts whose Representatives backed the US federal government’s shutdown in an attempt to defund Obamacare.  Here is a version of the map which I made, with yellow for the “suicide caucus”:


The article and map were good.  I liked them.  But there’s a real danger when looking at a map that you will, consciously or unconsciously, mentally translate the relative number of pixels on the map into the relative number of people.

Unfortunately, the geographical distribution of people is wildly, wildly uneven: from 0.04 people per square mile in the Yukon-Koyukuk Census Area to more than 70,000 people per square mile in Manhattan.  Yes, there are 1.75 MILLION times as many people per square mile in Manhattan as in rural Alaska.

The map above makes it look like a higher percentage of congresspeople supported the shutdown than actually did.  If you look at the shutdown districts on a cartogram instead (a map where the area of each congressional district is distorted to be proportional to the population of that district), it becomes even clearer just how few representatives were involved.


I have made a web page where you can explore congressional districts yourself.

In addition to seeing the maps above, you can also see thematic maps (both cartogram and regular) of

  • percent without insurance
  • percent white
  • median family income
  • median gross rent
  • median home value
  • percent living in poverty
  • percent of children living in poverty
  • percent of elderly living in poverty
  • median age
  • congressional election results from 2012

Additionally, if you click on a congressional district, you can see who represents that district, plus all of the above values for that district.  If you click on the map away from a congressional district, you can see a table comparing the shutdown districts with the non-shutdown districts.

You can also look at maps for the presidential 2012 election results and seasonally-adjusted unemployment, but because those are county-based figures, you can’t do a strict comparison between shutdown/non-shutdown districts, so they aren’t in the comparison table or the per-district summaries.

Implementation notes

I used ScapeToad to generate the cartograms.  It was a lot of trial and error to figure out settings which gave good results.  (For those of you who want to make cartograms themselves: I defined cartogram parameters manually, and used 6000 points for the first grid, 1024 for the second, and 3 iterations.)

I used QGIS and GRASS to clean it up afterward (to remove slivers and tiny holes sometimes left between the districts) and to merge congressional districts into cartogram shapes.

NB: I use the state boundaries which I created by merging the cartogramified congressional districts, even for the maps which are based on counties (e.g. unemployment and the presidential results).  It is pretty impressive how well the merged congressional district state boundaries match the county cartogram state borders.  It wasn’t at all obvious to me that would happen.  You could imagine that ScapeToad might have been more sensitive to the shapes of the counties, but somehow it all worked.  Kudos to ScapeToad!

At some zoom levels, not all the district boundaries get drawn.  That’s because I don’t want the map to be all boundary when way zoomed out, so I check the size before drawing boundaries.  If the jurisdiction is too small, I don’t draw the boundary.

As a starting point, I used the Congressional District shapefiles from the US Census Bureau. For the population used for generating the cartogram, I used the Census Bureau American Community Survey 2011 values.  For the other map attributes, I specify the source right under the “Color areas by”.

I made the map tiles myself, using custom PHP code.  You can read more about it in Optimizing Map Tile Generation.  I came up with my own algorithm for showing city labels.


Google Glasses app to help autistic people?

Posted in Random thoughts, Technology trends at 2:11 pm by ducky

I have heard that looking at faces is difficult for people with autism. I don’t understand it, but the impression I have gotten from reading descriptions by high-functioning adults is that the facial-recognition hardware has a bug which causes some sort of uncomfortable feedback loop.

What if there were a Google Glasses application which put ovals in front of people’s faces? Blue ovals if they are not looking at you, pink ovals if they are. Maybe a line to show where the center line of their face is.

Maybe that would make it more comfortable to be around collections of people.


Variably-sized points on maps

Posted in Hacking, Maps at 1:20 am by ducky

The Huffington Post made a very nice interactive map of homicides and accidental gun deaths since the shooting at Sandy Hook.  It has, however, the very common problem that it mostly shows where the population density is high: of course you will have more shootings where there are more people.

I wanted to tease out geographical/political effects from population density effects, so I plotted the gun deaths on a population-based cartogram.  Here was my first try.  (Click on it to get a bigger image.)

Unfortunately, the Huffington Post data gives the same latitude/longitude for every shooting in the same city.  This makes it seem like there are fewer deaths in populated areas than there really are.  So for my next pass, I did a relatively simple map where the radius of the dots was proportional to the square root of the number of gun deaths (so that the area of the dot would be proportional to the number of gun deaths).



This also isn’t great.  Some of the dots are so big that they obscure other dots, and you can’t tell if all the deaths were in one square block or spread out evenly across an entire county.

For the above map, for New York City, I dug through news articles to find the street address of each shooting and geocoded it (i.e. determined the lat/long of that specific address). You can see that the points in New York City (which is the sort of blobby part of New York State at the south) seem more evenly distributed than for e.g. Baltimore.  Had I not done that, there would have been one big red dot centered on Manhattan.

(Side note: It was hugely depressing to read article after article about people’s — usually young men’s — lives getting cut short, usually for stupid, stupid reasons.)

I went through and geocoded many of the cities.  I still wasn’t satisfied with how it looked: the size balance between the 1-death and the multiple-death circles looked wrong.  It turns out that it is really hard, maybe impossible, to get area equivalence for small dots.  The basic problem is that radii must be integers, limited by pixels.  In order to get the area proportional to gun deaths, you would want the radii to be proportional to the square roots of the death counts, i.e. {1, 1.414, 1.732, 2.0, 2.236, 2.449, 2.646, 2.828, 3.0}; the rounded radii will be {1, 1, 2, 2, 2, 2, 3, 3, 3}, so instead of areas of {pi, 2*pi, 3*pi, 4*pi, …}, you get {pi, pi, 4*pi, 4*pi, 4*pi, 4*pi, 9*pi, 9*pi, 9*pi}.

Okay, fine.  We can use a trick like anti-aliasing, but for circles: if the square root of the number of gun deaths is between two integer values (e.g. 2.236 is between 2 and 3), draw a solid circle with the just-smaller integer radius (for 2.236, use 2), then draw a translucent circle with the just-larger integer radius (for 2.236, use 3), with the opacity higher the closer the square root is to the larger number.  Sounds great in theory.
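A tiny helper captures that trick. (I am assuming opacity scales linearly with the fractional part of the radius; matching the areas exactly would call for a slightly different opacity law.)

```python
from math import sqrt

def dot_spec(deaths):
    """Return (solid radius, translucent radius, opacity) for a dot whose
    ideal radius is sqrt(deaths): a solid circle at the integer radius just
    below, plus a translucent circle at the integer radius just above,
    with opacity equal to the fractional part of the ideal radius."""
    ideal = sqrt(deaths)
    inner = int(ideal)
    return inner, inner + 1, ideal - inner
```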

In practice, however, things still didn’t look perfect.  It turns out that for very small dot sizes, the dots approximate circles pretty poorly.  If you actually zoom in and count the pixels, the area in pixels is {5, 13, 37, 57, 89, 118, 165, …} instead of what pi*R^2 would give you, namely {3.1, 12.6, 28.3, 50.3, 78.5, 113.1, 153.9, …}.


But wait, it’s worse: combine the radius-rounding problem with the circle-approximation problem, and the area in pixels will be {5, 5, 13, 13, 13, 13, 37, 37, 37, …}, for errors of {+59.2%, -60.2%, -54.0%, -74.1%, -83.4%, -67.3%, -76.0%, …}.  In other words, the 1-death dot will be too big and the other dots will be too small.  Urk.

Using squares is better.  You still have the radius-rounding problem, but you don’t have the circle-approximation problem.  So you get areas in pixels of {1, 1, 4, 4, 4, 4, 9, 9, 9, …} instead of {1, 2, 3, 4, 5, 6, 7, 8, 9, …}, for errors of {0.0%, -50.0%, +33.3%, 0.0%, -20.0%, -33.3%, +28.6%, …}, which is better, but still not great.
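The square-pixel arithmetic is easy to double-check:

```python
from math import sqrt

deaths = range(1, 10)
sides = [round(sqrt(n)) for n in deaths]  # rounded square side lengths
areas = [s * s for s in sides]            # pixel areas actually drawn
errors = [100.0 * (a - n) / n for a, n in zip(areas, deaths)]
print(areas)  # [1, 1, 4, 4, 4, 4, 9, 9, 9]
```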

Thus: rectangles:
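One way to pick rectangle dimensions with an exact pixel area (a sketch of the idea, not necessarily what the final map does) is to take the factor pair of the death count that is closest to square:

```python
def rect_dims(deaths):
    """Pick an integer width x height, as close to square as possible,
    with width * height exactly equal to the death count, so the pixel
    area is exact.  (Primes degrade to skinny 1 x n strips; a real map
    might trade a little area error for a nicer aspect ratio.)"""
    best = (1, deaths)
    for width in range(1, int(deaths ** 0.5) + 1):
        if deaths % width == 0:
            best = (width, deaths // width)
    return best
```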

Geocoding provided by GeoCoder.ca

