May 26, 2011

The Filter Funnel

A TED Talk by Eli Pariser called the “Filter Bubble” has been getting a lot of buzz. The concern is that with the emphasis on personalization on the web (personalized search results on Google, personalized news feeds on Facebook, etc.), we as individuals only see some small part of the web. As a side effect of personalization, we are only exposed to things that are familiar and pleasing to us. If Google learns I lean left or right, the argument goes, it only serves me content that confirms my beliefs. We are never exposed to contrary ideas. In effect, we can’t learn.

I disagree with Pariser. I can’t speak to Facebook, but as a researcher in personalized search I think Mr. Pariser misses the point. He makes three arguments: (1) the search engine “hides” or filters out information from the user; (2) the search engine has a duty to inform the user of all relevant information, including information the user may not want; (3) users are unaware that they are missing information.

The first point is just flat wrong. In a personalized approach, search results tend to be re-ranked, not hidden. In his talk, Mr. Pariser gives an example of two of his friends’ search results for the query “egypt” during the protests. He only shows the top 5 or 6 search results for two different users. If you go deeper into the search results, you’ll likely find the same results, just at different ranks. Results are not hidden; the user just may have to dig deeper to find them. One can argue that most users don’t look beyond the top couple of results, but that’s not the same thing as hiding results.
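To make the re-ranking point concrete, here is a minimal sketch in Python. It is only an illustration, not how any real search engine works: the URLs, topics, scores, and the `personalized_rerank` function are all made up for the example, and I am assuming personalization simply adds a per-user boost to a base relevance score.

```python
def personalized_rerank(results, interest_weights):
    """Reorder results by base score plus a per-user topic boost.

    Hypothetical scoring rule for illustration only. Note that the
    function only sorts: every result that goes in comes back out.
    """
    def score(result):
        boost = interest_weights.get(result["topic"], 0.0)
        return result["base_score"] + boost

    return sorted(results, key=score, reverse=True)


# Made-up results for the query "egypt".
results = [
    {"url": "example.org/egypt-protests", "topic": "politics", "base_score": 0.9},
    {"url": "example.org/egypt-travel-guide", "topic": "travel", "base_score": 0.7},
    {"url": "example.org/egypt-facts", "topic": "demographics", "base_score": 0.6},
    {"url": "example.org/cairo-airport", "topic": "travel", "base_score": 0.5},
]

# Two hypothetical user profiles with different interests.
traveler = {"travel": 0.5}
news_reader = {"politics": 0.5}

traveler_view = personalized_rerank(results, traveler)
news_view = personalized_rerank(results, news_reader)

for name, view in (("traveler", traveler_view), ("news reader", news_view)):
    print(name, [r["url"] for r in view])
```

Both users get the same four results. Only the order differs, so a screenshot of the top of each list would look very different even though nothing was removed.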

The query “Egypt” is very ambiguous. The search engine is trying to guess what the user is likely interested in (travel, demographics, politics, etc.). The user who sent him the screenshot devoid of protest information may not care about politics in the slightest. But if he did want those search results, it’s simple enough to reformulate the query. Think of the search engine as a pizza parlor you frequent. When the guy behind the counter sees you, he may put the special you always order in the oven, even before you get to the counter. That doesn’t mean you _have_ to order that pizza, or that you’re doomed to the meat lover’s delight for the rest of your life; it just means you have to specify that you want something other than the usual. The same is true of search results. You don’t have to accept the first results as the only information available, and users often don’t. Reformulating queries to get better results is extremely common.

This brings us to the cornerstone of his argument, the second point: the duty of the search engine. Mr. Pariser tends to equate the search engine with a news outlet, saying it has a duty to provide a full account of information to the user. Really it’s more like a library, where the user uses the card catalog to find the information he or she wants. The search engine is an aggregator of information, not a disseminator; the user seeks out information from the search engine, the search engine doesn’t push information to the user.

If the user is not receptive to new information, then trying to force it on the user is counterproductive. Mr. Pariser’s argument assumes that if the search engine provides a more complete picture, the user will accept all relevant information. Visiting CNN, MSN, or Fox News and reading the comments shows that people are finding their way to content they disagree with, and that they can continue to turn a blind eye to inconvenient facts when presented with them. This is not a new phenomenon. It’s a form of confirmation bias (accepting things as true when they fit your beliefs, regardless of the facts), and it is human nature to discount things that don’t match our view of the world. Thus the “Filter Bubble” is more of a “Belief Bubble”.

This brings me to my final point, and the reason why I called this post the “Filter Funnel”: personalization has been shown to increase serendipity in search (the discovery of new things). I often see this in my own research, as well as in published work by others. Often when an individual searches, they only consider the top search results. Going back to the Egypt query, perhaps I only care about travel, and not about politics. Maybe I am only looking for the airport in Egypt. By pushing down search results that are likely not of interest to me, the search engine shows me more results that may be of interest within each snapshot. More relevant results tend to lead to the discovery of more information that is useful and fits my need. Perhaps the first result has the name of the airport, but the next three have valuable information about travel to Egypt that I wouldn’t have thought to search for. I now have more information of value to me than I would have had if the results had not been personalized.
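The “more results of interest within each snapshot” idea can be sketched with a toy example. Everything here is an assumption made for illustration: the rankings are made-up topic labels, “snapshot” is taken to mean the top k results, and the hypothetical `demote_unwanted` helper stands in for whatever a real personalization model would do.

```python
def count_relevant_in_top_k(ranking, wanted_topics, k):
    """Count how many of the first k results match the user's interests."""
    return sum(1 for topic in ranking[:k] if topic in wanted_topics)


def demote_unwanted(ranking, wanted_topics):
    """Move results outside the user's interests below the rest.

    sorted() is stable, so the original order is kept within each group,
    and no result is dropped.
    """
    return sorted(ranking, key=lambda topic: topic not in wanted_topics)


# Made-up default ranking for the query "egypt", one topic label per result.
default_ranking = ["politics", "travel", "politics", "demographics", "travel", "travel"]
wanted = {"travel"}

personalized_ranking = demote_unwanted(default_ranking, wanted)

before = count_relevant_in_top_k(default_ranking, wanted, k=3)
after = count_relevant_in_top_k(personalized_ranking, wanted, k=3)
print(before, after)
```

In this toy case the traveler sees one travel result in the top three without personalization and three with it, which is the room that makes stumbling onto an unexpected but useful travel page more likely.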

This isn’t to say that over-personalization isn’t a problem. I don’t like the news filtering on Facebook for much the same reason as Mr. Pariser: just because I don’t interact with statuses from a subset of my friends doesn’t mean I don’t enjoy reading them. That’s a failure of the algorithm to identify what is interesting, but that doesn’t mean filtering in general is wrong. After all, there is also a subset of statuses on Facebook I am much happier to have never read.

We can’t expect the tool to replace our own due diligence. If the user doesn’t want a complete picture, or both sides of the argument, then no amount of “ought to”s for the search engine will change that. The search engine will always strive to help the user arrive at valuable information as fast as possible. We may disagree, however, on what information is valuable.