Monday, October 19, 2015

The Spam Filter Apocalypse

Spam filters almost prevented these gorgeous photo ops.
A few weeks ago, a friend and I were planning to go apple-picking when I noticed she was curiously silent in the group planning email thread. I texted her to inquire. She immediately texted back, "What thread?"

I forwarded her the thread and, after checking that it had included her, asked if perhaps the message had gotten caught in her spam filter. She said no.

The next day in the car, she revealed that the message had gotten caught not in her university spam filter, but in her Gmail spam filter.

This was troubling. University spam filters were widely known as the usual culprit for missing emails. In university-land, the best excuse for failing to respond to an email is to say the message "somehow" got "stuck." Who knows what decade the technology was from? Who knows what kinds of dark corners there are, waiting to eat important work emails and social invitations alike?

Gmail, on the other hand, is another story. It is generally acknowledged to be the state of the art when it comes to spam filters. Occasionally I will check my "spam" folder to see what has gotten caught, but in general it does a good job. If Gmail spam filters were categorizing important social emails as spam, then surely it was the beginning of the end.

A few months ago, I would have filed this as another piece of evidence that the Robot Apocalypse is not coming anytime soon. When Reddit founder Alexis Ohanian interviewed a couple of computer scientist friends and me for the Upvoted podcast, I had been surprised that he asked how afraid we should be about the Robot Apocalypse. We had all laughed and said that the current state of artificial intelligence is not sufficiently sophisticated to produce robots who will take over the world.

What I have been coming to realize, however, is that unsophisticated robots have already taken over the world. Tethered to our emails, we are at the mercy of the less glamorous, but no less scary, spambots and spam filters. On the way to apple-picking, my friends and I wondered whether it is possible to prevent someone's emails from ever being received again if everyone collectively spam-filtered them. It turns out this depends on the sophistication of the spam filtering algorithms. It should be terrifying that this is possible--and that this can seriously compromise someone's standard of living.
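To make the worry concrete: if a filter pools "mark as spam" clicks across all users into a shared per-sender reputation score, enough coordinated reports can silence a sender for everyone. This is a toy sketch of that idea--the class name, threshold, and smoothing are hypothetical illustrations, not how Gmail actually works:

```python
from collections import defaultdict

class CollectiveSpamFilter:
    """Toy global-reputation filter (hypothetical): every user's
    "mark as spam" click feeds one shared per-sender score."""

    def __init__(self, threshold=0.8, prior_reports=5):
        self.spam_reports = defaultdict(int)   # sender -> spam clicks
        self.total_reports = defaultdict(int)  # sender -> all clicks
        self.threshold = threshold             # spam fraction that blocks
        self.prior_reports = prior_reports     # smoothing: a few clicks alone can't condemn

    def report(self, sender, is_spam):
        self.total_reports[sender] += 1
        if is_spam:
            self.spam_reports[sender] += 1

    def spam_probability(self, sender):
        # Smoothed fraction of reports that were "spam"
        return self.spam_reports[sender] / (
            self.total_reports[sender] + self.prior_reports)

    def is_blocked(self, sender):
        return self.spam_probability(sender) >= self.threshold

f = CollectiveSpamFilter()
for _ in range(40):  # forty users all flag the same (innocent) sender
    f.report("friend@example.com", is_spam=True)
print(f.is_blocked("friend@example.com"))  # True: nobody sees their mail anymore
```

In a purely global scheme like this, the victim has no recourse; real filters mitigate the problem by also weighing per-user signals and sender authentication, which is exactly the "sophistication" the outcome depends on.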

Fortunately, there are measures that prevent the current robot situation from being more apocalyptic. We more or less trust Google to live up to its promise to "don't be evil." We have some degree of trust that if technology creators abused their power, regulators would step in and protect us. And, importantly, we still live in a culture where we give people the benefit of the doubt when technology seems to fail. Behind most important decisions there remains a human to make the final call.

Things become too dangerous when we begin to trust the algorithms too much. In The Fires, Joe Flood describes how a liberal city government caused New York City's poorest neighborhoods to burn down in the 1970s. The well-meaning government trusted the RAND Corporation's algorithms to allocate resources fairly. What ended up happening was that, in the poorest neighborhoods, infrastructure was not well maintained and insufficient firefighting resources were allocated. Buildings became prone to fire and firefighters were slow to respond. These algorithms, like humans, were biased. Because the government trusted so much in the algorithms, however, there was too little oversight for too long.

The way to prevent the full Spam Filter Apocalypse is to avoid giving the robots too much power. As consumers, we have the responsibility to educate ourselves about what our technology is doing, think critically about how it could affect our lives, and push back when algorithms are doing too much without oversight. Protecting ourselves is as much a social engineering problem as it is a technical one. It involves educating ourselves enough that, as a society, we can establish policies, both the informal "best practices" kind and ones that are legally enforced. A first step is to stop regarding the Robot Apocalypse as a nebulous inevitability and to start seeing it as something that is already happening, but whose trajectory we can control.

As software comes to run our lives, the Robot Apocalypse we should fear is not the one that comes about because the technology becomes too advanced. We should instead worry about what happens when we place too much trust in technology that is not quite ready for the task at hand. The Spam Filter Apocalypse is perhaps less glamorous than what the futurists of times past may have hoped, but it certainly is no less scary.

1 comment:

Chris said...

Awesome post Jean - I came up with about seventeen things I wanted to say but upon rereading saw you had actually already said them. Mansplaining averted!

The one thing I do want to add is that I think this current situation has a lot to do with the customer/service provider relationship. Gmail has 900M+ MAU, there's no way you'll be able to seek satisfaction in that system - no matter how hard Google may try to let you.

I think a lot of this is tied back to things that Bruce Schneier has said about security "serfdom" - we entrust our lords to take better care of us than we could ourselves, and economies of scale mean that they can! But we aren't meaningful stakeholders anymore. Even worse, we aren't even paying customers. This is a great deal when the services are "super useful!!!" (think: email in 1994, heck, even 2004 when Gmail was introduced), but we need to seriously rethink it for services that are "basically required for modern first-world life!!!" (email in 2015).

The thing is that doing a good job of spam filtering is IMMENSELY helped by the network effect/economies of scale. The more email Gmail sees, the better their spam filters are. Period. You could probably dump a cool billion dollars into anti-spam and user acquisition and still not do as well as they can. I think at a societal level this is a serious problem for email right now. It's very hard to provide an email service that is both responsive to the users/stakeholders and big enough to do an effective job of filtering the garbage.

In the new wave of popular freemium Internet service business models (Github, Dropbox, etc), the free user is at least a potential customer. The decoupling of Google's services from their customers has given us a renaissance of AMAZING free software, but may end up biting us in the tail sooner or (hopefully) later.