Wednesday, May 15, 2013

Facebook Graph Search as a Journalistic Tool

Facebook is becoming an ever more powerful stalking tool. The social network now associates users with not just profiles stating some facts about their status and interests, but also, among other things, with photographs, location check-ins, games, and "pages" corresponding to businesses, brands, and celebrities. Until recently, however, the only way to thoroughly stalk someone was to manually comb accessible photos, locations, and pages.

Then came Facebook Graph Search. Graph Search allows users to search Facebook data to which they have access. Not only is it now easy to find which friends, friends of friends, and friends of friends of friends live in each city in the world, but we can find out who is tagged in photos with whom, who has been to what location with whom, and what pages our friends are liking. Anything that gets posted can now be scrutinized by those with permission to access it.

Facebook has posited that this is potentially a useful tool for journalists. For the final project of the News and Participatory Media class I am taking at the MIT Media Lab, I decided to investigate this claim. As part of this endeavor, I wrote According to Graph Search: 36 Hours in Portland, Oregon, a travel piece written without much prior knowledge of the destination and researched solely using Facebook Graph Search.

In this post, I describe how Facebook could be useful for writing about lifestyle and recreation. I also discuss why Facebook and Graph Search may need to undergo some changes to be useful for topics with more political and policy implications.

What can I do with Facebook Graph Search?
Facebook Graph Search allows users to query data that other users have posted socially. The interface currently allows users to search photos, tagged people and locations, and pages that users have "liked." Example searches include:
  • Games that my friends play.
  • Restaurants in Cambridge, Massachusetts that people I work with visited.
  • Photos of Democrats and Republicans.
  • Friends of my friends of friends who live in Cambridge, Massachusetts.
These search capabilities allow users to search the activities of a specific person or the people associated with specific photos, locations, etc. Graph Search allows me to discover that my friend Jane has been at the Grand Canyon by searching for either Jane or the Grand Canyon.
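
The "search by Jane or by the Grand Canyon" idea can be pictured as one set of tags indexed in both directions. The following is only a minimal sketch of that idea, not a description of how Facebook implements Graph Search; the records and the names `places_visited` and `visitors` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical check-in tags as (person, place) pairs, invented for illustration.
checkins = [
    ("Jane", "Grand Canyon"),
    ("Jane", "Boston Marathon Finish Line"),
    ("Sam", "Grand Canyon"),
]

# Index the same tags in both directions so either side can be the query.
places_by_person = defaultdict(set)
people_by_place = defaultdict(set)
for person, place in checkins:
    places_by_person[person].add(place)
    people_by_place[place].add(person)


def places_visited(person):
    """Answer a query that starts from a person."""
    return sorted(places_by_person.get(person, set()))


def visitors(place):
    """Answer a query that starts from a location."""
    return sorted(people_by_place.get(place, set()))
```

Either `places_visited("Jane")` or `visitors("Grand Canyon")` surfaces the same underlying tag, which is the point of the example in the text.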

Graph Search currently only supports search over explicitly labeled data: user tags, location tags, and page tags. Thus it does not support search over status updates and "likes" of status updates. This is not a fundamental limitation of Graph Search: being able to search the contents of posts would require Graph Search to work on a much larger scale. Adding hashtags would allow users to search based on dynamically generated labels. Supporting search over the content of posts would require crawling over post data in real time. This is at the cutting edge of search technology and probably more than we can expect from Facebook at this point.

There have been some concerns about Graph Search and privacy. Privacy by obscurity is no longer possible: users can no longer hide behind how difficult it is for other users to access information. Graph Search guarantees that searches will only search over what a user is allowed to see. Users are now allowed to see more information than was previously convenient to browse. For instance, a user can "hide" a photo in which they are tagged from their timeline, but it is possible for Graph Search to make this photo available to those with permissions to see the tags. Previously, a user would have had to go to the profile of the user posting the photo in order to discover it. While these problems are not fundamental privacy issues, we increasingly rely on Facebook to support and correctly enforce palatable policies on what data other users can search.

How useful is Graph Search for journalism right now?
Facebook Graph Search allows people to easily search for people who may have useful information about an event or location. For instance, "People who have been to the Boston Marathon Finish Line." These searches can also be narrowed: "People I work with who have been to the Boston Marathon Finish Line." There is also potentially useful timestamp information, as users can see when a photograph or check-in happened. It is also helpful that Facebook users tend to list some basic demographic information publicly, for instance name, gender, and current city. Many users also publicly list educational and/or work information.

Graph Search provides a nice first pass for writing about locations and people. Facebook has "pages" that are essentially profiles for celebrities, businesses, brands, and other non-persons. Pages can be tagged in photographs and check-ins. Pages report the number of users that "like" the page, that have "checked in" to the location corresponding to the page, and that are currently "talking about" the page--either through a check-in, on the page's "timeline," or in a tag. Facebook also supports star rating and reviews for pages corresponding to businesses. These reviews tend to be much shorter than Yelp reviews, making it possible to get a wider range of opinions. The flip side of this democratization is that there is less filtering.

There are some open issues with using Graph Search for journalism. One obstacle is verification: determining the veracity of tags and identities. It also remains unclear what barriers privacy settings might impose on using Graph Search for journalistic purposes. Journalists using Graph Search must be careful to account for the fact that the cross-section of location, photo, and "like" data is skewed by who is sharing information: people in their own social networks and people sharing information publicly. It seems, however, that there is always such a bias in journalism.

Facebook and Graph Search are currently not well-suited for reporting on topics outside of lifestyle and recreation. For breaking news, people do not seem to post as much news on Facebook as on Twitter, and there is currently no way to search by topic. Ever since Facebook allowed "public" posts, there is more information publicly available, so these seem to be more incidental issues. A major issue that remains, however, is the linking of user identity with posts. It is one thing to associate restaurant reviews with your identity, but for many, especially those in countries with more restrictive governments, it is another thing entirely to associate your identity with political opinions. Because it is more difficult to be anonymous on Facebook, Facebook will need to implement additional mechanisms--and ones that people trust--before users feel they can share "serious" opinions with relative impunity. What prevents people from posting more political opinion outside of small trusted circles is a more fundamental issue.

On using Graph Search to write a travel piece.
As Facebook and Graph Search currently seem to be best suited for lifestyle pieces, I wrote According to Graph Search: 36 Hours in Portland, Oregon using primarily Graph Search to evaluate it as a journalistic tool. The only other source of information I used was Google Maps for learning about relative locations. I proceeded as follows.
  1. I looked at the results of the query "Photos taken in Portland, Oregon" to see what looked interesting.
  2. I looked at the pages associated with the locations tagged in the photos to learn more.
  3. I queried for restaurants, bars, night clubs, and coffee shops in Portland, Oregon to fill out the rest of the weekend. In selecting which activities to include, I looked at the average star rating assigned by Facebook users, how many "likes" and check-ins the place had, and the general sense I got from reading the description, wall posts, and reviews.
  4. To help order the activities in a sensible manner, I used Google Maps to plot the locations.
  5. To flesh out the article, I returned to the pages corresponding to the businesses and looked at the wall posts and reviews for quotes. I also looked at the
  6. Whenever I used a quote, I looked up the profile of the user associated with it to see what other information I could find. Most users had their current city publicly listed. I did not go further to verify these details, but I suspect that reporters do not go much further in cases like this, where identities do not matter as much. 
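
The selection in step 3 was done by eye, but the same signals (average star rating, "likes," and check-ins) could be combined into a rough shortlist. The sketch below is one hypothetical way to do that and is not the procedure used for the article: the `Place` fields, the log-scaled popularity weighting, and the sample venues are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Place:
    name: str
    avg_stars: float  # average star rating from page reviews (0 to 5)
    likes: int        # number of users who "like" the page
    checkins: int     # number of check-ins at the location


def score(place):
    """Hypothetical score: star rating scaled by log-damped popularity."""
    popularity = math.log10(1 + place.likes + place.checkins)
    return place.avg_stars * popularity


def shortlist(places, k=3):
    """Return the k highest-scoring places, best first."""
    return sorted(places, key=score, reverse=True)[:k]


# Invented sample data, not real Portland venues.
candidates = [
    Place("Cafe A", avg_stars=4.5, likes=1000, checkins=500),
    Place("Bar B", avg_stars=4.8, likes=5, checkins=2),
    Place("Club C", avg_stars=3.0, likes=2000, checkins=1000),
]
top_two = [p.name for p in shortlist(candidates, k=2)]
```

The logarithm keeps a page with a handful of glowing reviews from outranking a well-attended one, while still letting the star rating matter more than raw popularity. A score like this would only be a first pass; the descriptions, wall posts, and reviews still need to be read by a person.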
I was surprised to produce a weekend plan that I would be happy following. I was also surprised that so much information about people's opinions of businesses was available to me essentially publicly: I only came across one post from a Facebook "friend" and I did not end up using this information. While it would have been interesting to see what friends post, for journalistic purposes it seems better to search based on the less social aspects of Facebook data. An additional note is that while people made their opinions public, their profiles remained relatively private: I was not even able to see the current city of some of the users.

In comparing my piece to the New York Times's version of 36 Hours in Portland, Oregon, I learned that I had the right idea with many of the activities (beer; karaoke; antiques; nature) but only overlapped with the Times writer Freda Moon on one activity, the Japanese Gardens. Since the "36 Hours" activities are fairly specific and a weekend is a short period of time, the lack of overlap is not surprising. The activities proposed by Moon are arguably more hip and sophisticated, perhaps reflecting the difference between the intended Times audience and the cross-section of population making public posts on Facebook. The curation of the Times is useful: there is less quality control when crowd-sourcing travel advice to Facebook.

In its current state, Facebook Graph Search is better-suited for writing a travel article than Twitter but not obviously better than Yelp or Google Maps/Places. Facebook has an obvious advantage over Twitter because it contains more information about users and links data from users with data about locations and businesses. At present, Facebook's advantage over Yelp or Google is that it lowers the barriers for Internet users to post opinions, thus decreasing the selection bias. Because of the relative youth of these features in Facebook, however, Yelp and Google currently have more reviews.

In the future, I could see Facebook surpassing Yelp or Google to provide more relevant personalized recommendations. It is incredibly powerful to have a system that associates user identities with demographic information and with other activities such as "like" and check-in information. In the future, it will be easy to find people according to what they like and where they have been. It will also be easy to figure out what is popular among whom. If we could look at how star ratings change based on different interests, locations, and other likes, we could get a precise idea of who is interested in what. This could make it possible to algorithmically generate travel suggestions tailored to the interests of each individual traveler.
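
To make the idea of ratings that vary by interest concrete, here is a small sketch that averages star ratings per (place, reviewer interest) pair and ranks places for a traveler with a given interest. It is an illustration under assumed data, not an existing Facebook feature: the review records, the single "interest" label per reviewer, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical reviews as (place, reviewer_interest, stars), invented for illustration.
reviews = [
    ("Brewpub X", "beer", 5),
    ("Brewpub X", "beer", 4),
    ("Brewpub X", "antiques", 2),
    ("Shop Y", "antiques", 5),
    ("Shop Y", "beer", 3),
]


def mean_stars_by_interest(reviews):
    """Average star rating for each (place, reviewer interest) pair."""
    totals = defaultdict(lambda: [0, 0])  # (place, interest) -> [sum, count]
    for place, interest, stars in reviews:
        entry = totals[(place, interest)]
        entry[0] += stars
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def recommend(interest, reviews):
    """Places ordered by how reviewers who share this interest rated them."""
    means = mean_stars_by_interest(reviews)
    ranked = sorted(
        ((avg, place) for (place, i), avg in means.items() if i == interest),
        reverse=True,
    )
    return [place for avg, place in ranked]
```

With this sample data, a traveler interested in beer would see Brewpub X ranked ahead of Shop Y, while a traveler interested in antiques would see the reverse order.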

A Future with Graph Search
As Graph Search matures, it is exciting to see how it enables journalists to write about people's preferences and opinions. I am curious to see how adding features like search through posts could make Facebook more useful for reporting breaking news and public opinion on topics outside of lifestyle and recreation. In order to make people feel comfortable posting "serious" opinions on Facebook, however, Facebook will need to think about how to protect people's identities while allowing them to share information credibly.

Despite my affiliation with Facebook as a Facebook Fellow and former intern, the views I express in this post are 1) my own and 2) based solely on publicly available information.
