You may have heard of "information flow" as a term that has been thrown around with terms like "data breach," "information leak," and "1337 hax0r." You may not be aware that information flow is a specific term, referring to the practice of tracking sensitive data as it flows through a program. While techniques like access control and encryption protect individual pieces of data (for instance, as they leave a database), information flow techniques additionally protect the results of any computations on sensitive data.
Information flow bugs are usually not the kinds of glamorous bugs that make headlines. Many of the data leaks that have been in the public consciousness, for instance the Target and Sony hacks, happened because the data was not protected properly at all. In these cases, having the appropriate access checks, or encrypting the data, should do the trick. But "why we need to protect data more better" is harder to explain. Up through my PhD thesis defense, I had such a difficult time finding headlines that were actually information flow bugs that I resorted to general software security motivations (cars! skateboards! rifles!) instead.
|From the article.|
So what happened here? Instagram promises to protect secret accounts, which it (sometimes*) does. When one directly views the Instagram page of a protected user, they cannot access that person's photos, who that user is following, and who follows that user. This might lead a person to think that all of this information is protected all of the time. Wrong! It turns out the protected account information is visible to algorithms that suggest other users to follow, a feature that becomes--incorrectly--visible to all viewers once a follow is requested, because, presumably, whoever implemented this functionality forgot an access check. In this case the leak is particularly insidious because while the profile photos and names of the users shown are all already public, they are likely shown as a result of a computations on secret information: Brien Comey's protected follow information. (This is a subtle case to remember to check!) In information flow nomenclature, this is called an implicit flow. When someone is involved in a lot of Instagram activity, the implicit flow of the follow information may not be so apparent. But when many of the recommended follows are Comey family members, many of them who use their actual names, this leak becomes more serious!
|Creepy Facebook search, from express.co.uk.|
Until recently, my explanations have seemed subtle and abstract to most, in direct contrast to the sexy flashy security work that people imagine after watching Hackers or reading Crypto. By now, though, we information flow researchers should have your attention. We have all kinds of computations over all kinds of data going to all kinds of people, and nobody has any clue what is going on in the code. Even though digital security should be one of the main concerns of the FBI, Comey is not able to avoid the problems that arise from the mess of policy spaghetti that is modern code.
Fortunately, information flow researchers have been working for years on preventing precisely this kind of Comey leak**. In fashionable BuzzFeed style, I will list exactly five research ideas Instagram could adapt to prevent such leaks in the future:
* A student in my spring software security course (basically, programming languages applied to security), Scott, had noticed earlier this semester that emails from Instagram allowed previews of protected accounts he was not following. He reported this to Facebook's bug bounty program and made $1000. I told him to please write in the course reviews that the course helped him make money.
** Note that a lot of other things are going on in this Comey story. The reporter used facts about Comey to figure out the setup, and also some clever inference. But this clever inference exploited a specific information leak from the secret follows list to the recommendations list, and this post focuses on this kind of leak.