Saturday, April 01, 2017

Techniques for Protecting Comey's Twitter: A Taxonomy

Person in the know calling me out.
After my post about how the Comey Twitter leak was the most exciting thing ever for information flow security researchers, I had some conversations with people wanting to know how to tell the difference between information that is directly leaked and information that is deduced. Someone also pointed out that I didn't mention differential privacy, a kind of statistical privacy that bounds how much information an observer can infer. It's true: there are many mechanisms for protecting sensitive information, and I focused on a particular one, both because it was the relevant one and because it's what I work on. :)

Since this Comey Twitter leak is such a nice example, I'm going to provide more context by revisiting a taxonomy I used in my spring software security course, adding statistical privacy to the list. (Last time I had to use a much less exciting example, about my mother spying on my browser cookies.)

  • Access control mechanisms resolve permissions on individual pieces of data, independently of the program that uses the data. An access control policy could say, for instance, that only Comey's followers could see who he is following. You can use access control policies to check data as it's leaving a database, or anywhere else in the code. What people care about with respect to access control is that the policy language can express the desired policies, that there are provable guarantees that policies won't accidentally grant access, and that policies can be checked reasonably efficiently.
  • Information flow mechanisms check the interaction of sensitive data with the rest of the program. In the case of this Comey leak, access control policies were in place some of the time. For example, if you went to Comey's profile page, you couldn't see who he was following. The journalist ended up finding his page by looking at the other users the recommendation algorithm suggested after she requested to follow hypothesized-Comey. (This was aided by the fact that Comey follows few people.) In this case, it seems that Instagram was feeding secret follow information into the recommendation algorithm without realizing that the results could leak follow information. An information flow mechanism would make sure that no computation based on secret follow information could make its way into the output of a recommendation algorithm. If the follow list is secret, then so is the length of that list, the people followed by people on the follow list, photos of people from the list, etc.
  • Statistical privacy mechanisms prevent aggregate computations from revealing too much information about individual sensitive values. For instance, you might want to develop a machine learning algorithm that uses medical patient records to do automated diagnosis given symptoms. It's clear that individual patient record information needs to be kept secret--in fact, there are laws that require people to keep this secret. But there is a lot of good to be done if we can use sensitive patient information to help other patients. What we want, then, is to allow algorithms to use this data, but with a guarantee that an observer has a very low probability of tracing diagnoses back to individual patients. The most popular formulation of statistical privacy is differential privacy, a property of computations that allows only those computations whose outputs let an observer distinguish the original data from slightly different data with very low probability. Differential privacy is very hot right now: you may have read that Apple is starting to use it. It's also not a solved problem: my collaborator and co-instructor Matt Fredrikson has an interesting paper about the tension between differential privacy and social good, calling for a reformulation of statistical privacy to address the current flaws. (A minimal sketch of the most common mechanism follows this list.)
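To make the statistical privacy idea concrete, here is a minimal sketch of the Laplace mechanism, the workhorse of differential privacy, applied to a counting query over made-up patient records. The records, the predicate, and the epsilon value are illustrative assumptions on my part, not anyone's production system (a real deployment would also track a privacy budget and harden the noise sampling):

    import random

    # A counting query has sensitivity 1: adding or removing one patient changes
    # the true answer by at most 1, so Laplace noise with scale 1/epsilon gives
    # epsilon-differential privacy for this query.

    def laplace_sample(scale):
        # The difference of two exponentials with mean `scale` is Laplace(0, scale).
        return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

    def private_count(records, predicate, epsilon):
        true_count = sum(1 for r in records if predicate(r))
        return true_count + laplace_sample(1.0 / epsilon)

    patients = [
        {"age": 34, "diagnosis": "flu"},
        {"age": 70, "diagnosis": "diabetes"},
        {"age": 51, "diagnosis": "flu"},
    ]
    # The released count is close to the true count (2) but noisy enough that no
    # single patient's record can be pinned down from it.
    print(private_count(patients, lambda r: r["diagnosis"] == "flu", epsilon=0.5))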
For those wondering why I didn't talk about encryption: encryption focuses on the orthogonal problem of putting a lock on an individual piece of data, where locks can have varying cost and varying strength. Encryption involves a different kind of math--and we also don't cover encryption in my spring course for this reason.

Another discussion I had on Twitter.
Discussion. Some people may wonder whether the Comey Twitter leak is an information flow leak or some other kind of leak. It is true that in many cases this Instagram bug may not be so obvious, because someone is following many people and the recommendation algorithm has more to work with. I would argue that it is squarely in the purview of information flow mechanisms. If follow information is secret, then recommendation algorithms should not be able to compute using this data. (Here, it seems like what one means by "deducible" is "computed from," and that's an information flow property.) We're not in a situation where these recommendation engines are taking information from thousands of users and doing something important. It's very easy for information to leak here, and it's simply not worth the loss to privacy!

Poor, and in violation of our privacy settings.
Takeaways. We should stand up for ourselves when it comes to our data. Companies like Facebook are making recommendations based on private information all the time, and not only is it creepy, but it can violate our privacy settings--and they can definitely do something about it. My student Scott recently made $1000 from Facebook's bug bounty program by reporting that photos from protected accounts were showing up in keep-in-touch emails from Instagram. If principles alone don't provide enough motivation, maybe the $$ will incentivize you to call tech companies out when you encounter sloppy data privacy practices.

Friday, March 31, 2017

Five Research Ideas Instagram Could Have Used to Protect Comey's Secret Twitter

Even though cybersecurity is one of the hottest topics on the Internet, my specific area of research, information flow security, has remained relatively obscure. Until now, that is.

You may have heard of "information flow" as a term that has been thrown around with terms like "data breach," "information leak," and "1337 hax0r." You may not be aware that information flow is a specific term, referring to the practice of tracking sensitive data as it flows through a program. While techniques like access control and encryption protect individual pieces of data (for instance, as they leave a database), information flow techniques additionally protect the results of any computations on sensitive data.

Information flow bugs are usually not the kinds of glamorous bugs that make headlines. Many of the data leaks that have been in the public consciousness, for instance the Target and Sony hacks, happened because the data was not protected properly at all. In these cases, having the appropriate access checks, or encrypting the data, should do the trick. But "why we need to protect data more better" is harder to explain. Up through my PhD thesis defense, I had such a difficult time finding headlines that were actually information flow bugs that I resorted to general software security motivations (cars! skateboards! rifles!) instead.

From the article.
Then along came "This Is Almost Certainly James Comey's Twitter Account," an article I have been waiting for since I started working on information flow in 2010. The basic idea behind the article is this: a journalist named Ashley Feinberg wanted to find FBI director James Comey's secret Twitter account, and so started digging around the Internet. Feinberg succeeded within four hours thanks to some cleverness and a key information leak in Instagram: when you request to follow an Instagram account, Instagram makes algorithmic suggestions of other accounts to follow. In this case, the algorithmic suggestions for Comey's son Brien included several family members, among them James Comey's wife--and the account that Feinberg deduced to be James Comey's. It also seems that Comey uses the same "anonymous" handle on Instagram as he does on Twitter. And so Instagram's failure to protect Brien Comey's protected "following" list led to the discovery of James Comey's Twitter account.

So what happened here? Instagram promises to protect secret accounts, which it (sometimes*) does. When you directly view the Instagram page of a protected user, you cannot see that person's photos, who that user is following, or who follows that user. This might lead a person to think that all of this information is protected all of the time. Wrong! It turns out the protected account information is visible to the algorithms that suggest other users to follow, a feature whose output becomes--incorrectly--visible to all viewers once a follow is requested, presumably because whoever implemented this functionality forgot an access check. The leak is particularly insidious because while the profile photos and names of the users shown are all already public, they are likely shown as the result of a computation on secret information: Brien Comey's protected follow information. (This is a subtle case to remember to check!) In information flow nomenclature, this is called an implicit flow. When someone is involved in a lot of Instagram activity, the implicit flow of the follow information may not be so apparent. But when many of the recommended follows are Comey family members, many of whom use their actual names, the leak becomes much more serious!
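To make the implicit flow concrete, here is a minimal sketch of the missing-check pattern described above. The User class, the field names, and the recommendation logic are hypothetical stand-ins for illustration, not Instagram's actual code:

    from dataclasses import dataclass, field

    @dataclass
    class User:
        name: str
        is_protected: bool = False
        followers: list = field(default_factory=list)   # follower names
        following: list = field(default_factory=list)   # User objects

    def view_following(viewer, user):
        # Direct profile view: the access check is present here.
        if user.is_protected and viewer.name not in user.followers:
            raise PermissionError("This account is private.")
        return [u.name for u in user.following]

    def recommend_follows(viewer, requested_user):
        # Recommendation path: nothing checks requested_user.following, so the
        # suggestions are computed from the protected follow list and leak it.
        suggestions = set()
        for followed in requested_user.following:
            suggestions.add(followed.name)
            suggestions.update(u.name for u in followed.following)
        suggestions.discard(viewer.name)
        return sorted(suggestions)

    brien = User("brien", is_protected=True)
    brien.following.append(User("reinholdniebuhr"))
    feinberg = User("ashley")
    # view_following(feinberg, brien) raises PermissionError, but:
    print(recommend_follows(feinberg, brien))   # ['reinholdniebuhr']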

Creepy Facebook search, from express.co.uk.
In the world of information flow, this article is a Big Deal because it so perfectly illustrates why information flow analyses are useful. For years, I had been jumping up and down and waving my arms (see here and here, for instance) about why we need to check data in more places than the point where it leaves the database. Applications aren't just showing sensitive values directly anymore, but the results of all kinds of computations on those values! (In this case it was a recommendations algorithm.) We don't always know where sensitive data is eventually going! (As was the case when Brien Comey's protected "following" list was handed over to the algorithm.) Policies might depend on sensitive data! We may even compute where sensitive data is going based on other sensitive data! In a world where we can search over anything, no data is safe!

Until recently, my explanations have seemed subtle and abstract to most, in direct contrast to the sexy flashy security work that people imagine after watching Hackers or reading Crypto. By now, though, we information flow researchers should have your attention. We have all kinds of computations over all kinds of data going to all kinds of people, and nobody has any clue what is going on in the code. Even though digital security should be one of the main concerns of the FBI, Comey is not able to avoid the problems that arise from the mess of policy spaghetti that is modern code.

Fortunately, information flow researchers have been working for years on preventing precisely this kind of Comey leak**. In fashionable BuzzFeed style, I will list exactly five research ideas Instagram could adapt to prevent such leaks in the future:
  1. Static label-based type-checking. In most programming languages, program values have types. Types usually tell you simple things like whether something is a number or a list, but they can be arbitrarily fancy. Types may be checked at compile time, before the program runs, or at run time, while the program is running. There has been a line of work on static (compile-time) label-based information flow type systems (starting with Jif for Java, with a survey paper here describing more of this work) that allow programmers to label data values with security levels (for instance, secret or not) as part of their types, and that propagate those labels through the program to make sure sensitive information does not flow to places that are less sensitive. These type systems give guarantees about every run of a program that type-checks. The beauty of these type systems is that while they look simple, they are clever enough to capture the kind of implicit flow that we saw with algorithms leaking Brien Comey's follow information. (We'd label the follow lists as sensitive, and then any values computed from them couldn't be leaked!)
  2. Static verification. Label-based type-checking is a light-weight way of proving the correctness of programs according to some logical specification. There are also heavier-weight ways of doing it, using systems that translate programs automatically into logical representations and check them against the specification. Various lines of work on refinement types, super fancy types that depend on program values, could be used for information flow. (An example of a refinement type is {int x | canSee(alice, x)}, the type of an integer x that can only exist if user "alice" is allowed to see it according to the "canSee" function/predicate.) Researchers have also demonstrated ways of proving information flow properties in systems like Ironclad and mCertiKOS. These efforts are pretty hardcore and require a lot of programmer effort, but they allow people to prove all sorts of boutique guarantees on boutique systems (as opposed to the generic guarantees a type system gives for the subset of a language it supports).
  3. Label-based dynamic information flow tracking. Static label-based type-checking, while useful, often requires the programmer to put labels all over the program. Systems such as HiStar, Flume (the specific motivation of which was the OKCupid web server), and Hails allow labeling of data in a way similar to static label-based type systems, but track the flow of information dynamically, while the program is running. The run-time tracking frees programmers from having to put labels everywhere, but it comes at a cost. First, it introduces performance slowdowns. Second, we can't know before a program runs whether it will hit some kind of "access denied" error, so there could be accesses denied all over the place. Many of these systems handle these problems by doing things at the process level: if there is an unintended leak anywhere in the process, the whole process aborts. (Those who haven't heard of processes can think of a process as encapsulating a whole big task, rather than an individual action like a single arithmetic operation.) A minimal sketch of this style of tracking appears after this list.
  4. Secure multi-execution. Secure multi-execution is a nice trick for running black-box code (code that you don't want to--or can't--change) in a way that is secure with respect to information flow. The trick is this: every time you reach a sensitive value, you run the computation on the sensitive value in one process, and you spawn another process that runs on a secure default input instead. The process separation guarantees that sensitive values won't leak into the process containing the default value, so you know you can always show the result of that one. As you might guess, secure multi-execution can slow the program down quite a bit, since it needs to spawn a new process every time it sees a sensitive value. To mitigate this, my collaborators Tom Austin and Cormac Flanagan developed a faceted execution semantics for programs that lets you execute a program on multiple values at the same time, with all of the security guarantees of secure multi-execution. (The second sketch after this list shows the core trick.)
  5. Policy-agnostic programming. While all of these other approaches can prevent sensitive values from leaking, if we want programs to run most of the time, somebody needs to make sure that programs are written not to leak information in the first place. It turns out this is pretty difficult, so I have been working on a programming model that factors information flow policies out of the rest of the program. (If I'm going to write a whole essay about information flow, of course I'm going to write about my own research too!) Instead of having to implement information flow policies as checks across the program, where any missing check can lead to a bug, type error, or runtime "access denied," programmers can now specify each policy once, associated with the data, along with a default value, and rely on the language runtime and/or compiler to make the code execute according to the policies. In a policy-agnostic system, the programmer can say that Brien Comey's follows should only be visible to his followers, and the machine becomes responsible for making sure this policy is enforced everywhere, including in the code implementing the recommendation algorithm. The challenges are that policies can depend on sensitive values, that sensitive values may be shown to viewers whose identities are computed from sensitive values, and that enforcing policies usually means implementing access checks across the code. Our semantics for the Jeeves programming language (paper here) addresses all of these issues using a dynamic faceted execution approach, and we have also extended this programming model to handle applications with a SQL database backend (paper here). We are also working on a static type-driven repair approach (draft here). (The last sketch after this list shows the flavor of attaching a policy to data.)
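Here is a minimal sketch of idea 3, label-based dynamic information flow tracking. The Labeled wrapper and the two-point "public"/"secret" label lattice are assumptions made for illustration; real systems such as HiStar, Flume, and Hails track labels on coarser-grained abstractions (processes, threads, database records) rather than individual values:

    class Labeled:
        def __init__(self, value, label):
            self.value = value
            self.label = label              # "public" or "secret"

        def map(self, fn):
            # Anything computed from a labeled value inherits its label.
            return Labeled(fn(self.value), self.label)

    def join(a, b):
        # A value computed from two inputs gets the more secret of their labels.
        return "secret" if "secret" in (a, b) else "public"

    def combine(x, y, fn):
        return Labeled(fn(x.value, y.value), join(x.label, y.label))

    def output_to(channel_label, labeled_value):
        # The check happens at the output sink, not when data leaves the database.
        if labeled_value.label == "secret" and channel_label == "public":
            raise PermissionError("refusing to send secret data to a public channel")
        return labeled_value.value

    follows = Labeled(["reinholdniebuhr", "wife_account"], "secret")
    num_follows = follows.map(len)                       # the count is still secret
    mixed = combine(follows, Labeled(["nyc_pizza"], "public"), lambda a, b: a + b)
    print(mixed.label)                                   # "secret"
    try:
        output_to("public", num_follows)                 # denied at run time
    except PermissionError as err:
        print(err)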
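Next, the core trick of idea 4, secure multi-execution, again with made-up function and data names. A real implementation isolates the two runs in separate processes and routes every input and output channel by security level (and faceted execution folds the runs into one); this sketch only shows that an unauthorized viewer sees output computed solely from the default input:

    def recommend(follow_list):
        # Black-box code: we make no attempt to change or inspect it.
        return sorted(set(follow_list))[:3]

    def multi_execute(secret_follows, default_follows, viewer_is_follower):
        high_output = recommend(secret_follows)    # visible only to authorized viewers
        low_output = recommend(default_follows)    # computed from the secure default
        return high_output if viewer_is_follower else low_output

    secret = ["reinholdniebuhr", "wife_account"]
    print(multi_execute(secret, [], viewer_is_follower=True))    # real suggestions
    print(multi_execute(secret, [], viewer_is_follower=False))   # [] -- no leak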
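Finally, a sketch of the flavor of idea 5, policy-agnostic programming with faceted values. The Facet class, the policy signature, and concretize() are invented for illustration and are not the actual Jeeves API:

    class Facet:
        def __init__(self, secret, default, policy):
            self.secret = secret      # the real, sensitive value
            self.default = default    # what unauthorized viewers see instead
            self.policy = policy      # viewer -> bool, specified once, with the data

        def map(self, fn):
            # Computations run on both facets, so the policy follows derived values
            # through black-box code like a recommendation algorithm.
            return Facet(fn(self.secret), fn(self.default), self.policy)

        def concretize(self, viewer):
            # The runtime, not the application code, decides which facet to show.
            return self.secret if self.policy(viewer) else self.default

    brien_follows = Facet(
        secret=["reinholdniebuhr", "wife_account"],
        default=[],
        policy=lambda viewer: viewer in {"approved_follower"},
    )
    recommendations = brien_follows.map(lambda fs: sorted(fs)[:3])
    print(recommendations.concretize("approved_follower"))   # ['reinholdniebuhr', 'wife_account']
    print(recommendations.concretize("ashley_feinberg"))     # []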
I don't know how much this Twitter account leak upset the Comeys, but reading this article was pretty much the most exciting thing that I have ever done. Up until now, most people have thought about security in terms of protecting individual data items, rather than in terms of a complex and subtle interaction with the programs that use them. This has started to change in the last few years as people have been realizing just how much of our data is online, and just how unreliable the code is that we trust with this data. I hope that this Comey leak will cause even more people to realize how important it is to reason deeply about what our software is doing. (And to fund and use our research. :))

* A student in my spring software security course (basically, programming languages applied to security), Scott, had noticed earlier this semester that emails from Instagram allowed previews of protected accounts he was not following. He reported this to Facebook's bug bounty program and made $1000. I told him to please write in the course reviews that the course helped him make money.
** Note that a lot of other things are going on in this Comey story. The reporter used facts about Comey to figure out the setup, and also some clever inference. But this clever inference exploited a specific information leak from the secret follows list to the recommendations list, and this post focuses on this kind of leak.

Wednesday, March 08, 2017

Autoresponse: Striking: A Day Without a Woman

In front of the Federal Building, Pittsburgh.
Dear Message Sender,

  I am not responding to email on March 8, 2017 because I am observing A Day Without a Woman. In the afternoon, I will be joining students at CMU in a silent protest and attending a rally at the City-County building in downtown Pittsburgh.

  Despite the efforts and progress made towards gender equality, women do not have an equal voice, and we are not appreciated equally in society. For example:
  • The gender wage gap persists, and two-thirds of minimum wage earners are women.
  • The House and Senate are currently 19% women. This means an 81% male group is making decisions that affect women's health and lives.
  • The United States still has not had a female president, even though many countries we'd like to think we are more progressive than currently have a woman in power.
  • Only 24 of the Fortune 500 CEOs are women. Money is power, and women have less of it.
Some may say that women are simply less ambitious, or don't want to be in positions of power and influence as much as men do. Study after study--and I'm happy to talk in more detail--has shown that women who do have the ambition face far more obstacles than men do. Also, my statistics above focus on what people like to call "privileged" women, but the undervaluing of female labor (including domestic and emotional labor) makes life even harder for those in less fortunate circumstances.

  There are many ways you can show support. The first is to attend local rallies, especially if you have an employment situation where you will have few consequences. Even if you are not a woman and/or not striking today, here are some things you can do:
  • Listen to women, and call people out when women's voices are not heard.
  • Question your own biases. (You can have biases even if you are a woman!)
  • Vote for women. Champion women. Mentor women. (In that order.)
  • Support people who are striking, and who are more actively fighting for women's rights and the appreciation of women's labor, both financially and by amplifying their voices.

Yours in solidarity,
Jean

Friday, December 23, 2016

Let's Talk About How We Talk About Science

A while ago, Brian Burg commented on Twitter that he would like to see more discussion of marketing in academia. I decided I'd rather write a meta-post about how we need to talk about how marketing is affecting our evaluation of science.

Beyonce.
Kim.
If you want to be on the cover of Glamour magazine, you know what to do. Put your hair in glamorous waves, wear something small, and stare directly at the camera with slightly open lips. It helps if you have the Look. (Has anyone else noticed that Beyonce and Kim are being airbrushed to look more and more like each other all the time?)

If you want to be on the cover of a glamour journal, things are not much different. Open with a deep-sounding but incontestable vision of where you think the world is going. Home in on a specific problem. Make the problem sound hard. Make your solution easy for a casual reader to understand. Write with the voice of a winner. It helps to have picked a topic that a science journalist might drool over. Oh, and if you are going for the cover: make sure to have good images.

But, you might say, fashion magazines are frivolous, and science is Serious*. I'll be the first to agree that the investigation of the fundamental truths of reality is a worthy endeavor requiring brilliance, hard work, persistence, and all kinds of other positive qualities. (Side note: beauty is also hard work, and is used to oppress women.) But people determine which science is higher-profile than other science. People live in society, and it is widely acknowledged that society is superficial. Many a fairy tale involves a causal relationship between the changing of clothes and the changing of fortune. In Thomas Carlyle's satirical novel Sartor Resartus, religion itself is a matter of clothing.

In fact, a major part of my metamorphosis into a Real Researcher has involved accepting that appearance matters. When my advisor and I used to get papers rejected in the beginning of my PhD, we would spend a long time thinking about how to make the work so good that the paper was not rejectable. I have come to realize that this is the equivalent of failing to impress on a first date and hoping that soul-searching will address the issue for the future. Looking deeply into one's soul, while usually good in the long term, often does not address the problem of first impressions.

Sure, part of preparing one's research for wider dissemination involves doing what everyone would expect of good communication: having a clear description of the goals, clear explanations of the solutions, and a clear explanation of the context with respect to previous work. Good logical reasoning goes a long way. Good evaluation of results does as well. But if we look at the papers that do--and don't--make it into the "glamour" conferences and journals, we begin to suspect that there are other factors at play.

If we look more closely, we can see that American** science replicates patterns of elitism and gatekeeping that we see in the rest of American society. In Privilege: The Making of an Adolescent Elite, Columbia sociology professor Shamus Khan reports on behavioral traits that characterize the new elite. Khan describes how, rather than stemming from family prestige, the social status of the boarding school students he observes comes from an ease of moving through social situations and a cultural omnivorousness (embracing both the high-brow and the low-brow). Especially since these behaviors are learned at elite institutions, they serve a gate-keeping function similar to explicit markers of socioeconomic status. People look for this ease and this omnivorousness, for instance when interviewing candidates, justifying their choice with some idea that such traits somehow make people more deserving. There is also a mythology about hard work that serves more as a justification than an explanation for elite status: students feel that they are receiving the benefits they do from society not because they were born into it, but because they "worked so hard to get there."

As it turns out, the training of elite scientists also involves learning gatekeeping behaviors. In science there is a similar mythology about hard work being responsible for differential success. In Computer Science, the privileged behaviors I've observed include having research vision (as opposed to making solid technical contributions), being aggressive about imposing that research vision upon others, and having a "genius quality," which involves pattern-matching on similarities to previously successful scientists (often white men). Like ease of interaction and cultural omnivorousness, these traits are often associated with people deserving of recognition, but their presence does not mean the work will be good. I would not be surprised if having research vision and exhibiting genius quality were more correlated with being educated at an elite American institution than with potential for long-term scientific impact. With this premise, the recipe for academic fame involves not only marketing one's work as making positive contributions to science, but also demonstrating a combination of privilege and flash. The privilege here is more subtle than that of having cover-girl looks, but it is a very real kind of privilege nonetheless.

But how, you may wonder, do people not see through the shiny exterior? Those who have been following American politics in the last year may be familiar with the answer: insufficient attention. Publications are reviewed by researchers under increasingly high demands to pass quick judgments. Between December 2015 and February 2016, for instance, I had accidentally agreed to be on two concurrent major conference Program Committees, and had a reviewing load of over 60 full-length (12-page, 9-10 pt font) papers. (And I am not the only person who had such reviewing volume!) Had I only been on one Program Committee, the reviewing load would have still required me to evaluate, on average, a paper every two days over the course of two months. Under such reviewing pressure, it is easy to succumb to flash judgments, emotional first responses to a paper's Introduction section. It is easy to accept the paper with the good story over a paper with a deeper but more subtle result.

Despite all this, I believe in the future of science, and that we can shift back to a situation where we are making space for "real" science, what science looks like before the makeup and airbrushing. To do so, we need to wage a campaign similar to the one people waged on unreasonable beauty standards. We need to teach people to recognize--and be skeptical of--"Photoshopped" results: all that is too slick, too inspiring, and too good to be true, in both individual papers and in the story of a scientist's career. We need to raise more awareness about what "real" science looks like: the incremental results required on the way to big discoveries, the science that is foundational, necessary, and often with subtleties difficult to communicate to non-experts. Making structural changes that reduce reviewing loads and allow for deeper evaluation would also reduce the incentives that have led to the proliferation of these practices.

Elite institutions are much more than a finishing school for scientists, but we have been moving to a model where the marketing is coming to dominate the science. To protect the pursuit of truth, we need to admit that people can be shallow when it comes to evaluating science. We need to talk about how we talk about science so we can make space for science that is slow, science that is subtle, and science that is outside the mainstream.

With thanks to Seth Stephens-Davidowitz, who told me my first draft lacked a cohesive point, and Adeeti Ullal, who very patiently helped me with the last paragraph.

* I don't believe fashion magazines are blanket frivolous, but you might.
** I don't have the depth of experience to comment on how this generalizes to other cultures.

Saturday, December 17, 2016

The Structured Procrastination Trap

A wise professor once told me to take advice with a grain of salt, as it is mostly highlights and wishful thinking. Structured procrastination is a prime example of wishful thinking doled out to students eager to ease growing pains.

Structured procrastination promises a productive life with minimal pain. The basic premise is that if you always do something other than the task you are supposed to do, you will always be doing something that you want to do. Don't want to write that report? Play ping-pong with your students instead, and people will be impressed with how easily you take life. Don't want to respond to emails? Read papers you like instead, and people will be surprised you make time to read papers. If you keep waiting, you will eventually want to do the thing you have been putting off, and then you can live a completely pain-free life!

Now let's look at the premises of structured procrastination. It requires that there is always a task that you can and want to do that is productive. It requires that deadlines make tasks miraculously desirable: that it is the fact that something is due soon, more than other factors (like how able you are to do the task), that makes a task easier to do. It requires that you have a good sense of how long tasks should take. For structured procrastination to make sense, you need to be at a point where life is simply a matter of execution.

In my many years of being alive, I have discovered that these premises often do not hold. When I was a graduate student and looking for shortcuts to the Productive Life, I felt like I was doing something wrong. When I aggressively tried to apply structured procrastination to my life, I produced a lot of bad work. There were long periods of time when I would try to get into immersive "flow states" where I could have pleasurable levels of focus, but everything felt difficult. I've spent a cumulative total of days, maybe weeks, of my life wondering why it takes me so long to write a paper, or to prepare a talk, or to debug my code. For years I thought that it was possible for life to always be easy, but I had somehow not figured out how to do it.

What I realized is that life is hard, and especially hard if you want to do things you have never done before. If you are doing something that requires you to grow, what you need is a lot of time, and the discipline to force yourself to keep doing something even when it feels like the most painful thing in the world. If you are doing a high-growth activity, you need to abandon the idea of structured procrastination. You need set hours when you are going to sit down (or stand up, or lie down) and stare at your notebook, or laptop, or the wall, dedicated to making progress on the Very Important Task. Limiting these hours makes the work psychologically bearable. Making these hours the same time every day makes it more likely you can keep to them.

Of course, structured procrastination is not all bad. I have recommended this technique to many people, as it is a great way to get oneself out of unproductive loops when a looming deadline kills all desire to do anything. If you allow yourself to admit that you are not going to work on your Very Important Task, then you can at least do "productive" things (like make Ryan Gosling memes) instead of sitting around angsting (which could also be productive according to some value systems). Procrastination is also a good way to trick yourself into doing more things, because deadlines often do make people more efficient.

While structured procrastination provides a useful execution framework, there are times in life when you need to suck it up and do the Very Important Task. In fact, structured procrastination may be most seductive when what you need most is structure. For this reason, you should always think before you procrastinate, and avoid the trap of false busyness.

Wednesday, December 14, 2016

What Professors Can Do About the Collaboration Problem

A few weeks ago, I wrote a blog post about the "collaboration problem" that sparked a significant amount of discussion among my colleagues in academic computer science, in large part because many people had observed the same problem without great ideas for solutions. Here are some emails I've exchanged with Ben Zhao, a professor of Computer Science at UC Santa Barbara, and my colleague Claire Le Goues in the School of Computer Science at CMU about how to address the problem in the courses we teach. (Ben recently posted this article on social media and had quite an extensive discussion with many people in the field about how to address the problem.)

I hope this will generate even more discussion that brings us closer to solutions.

--

from:Ben Zhao
to:Justine Sherry,
Jean Yang
date:Tue, Dec 13, 2016 at 1:50 PM
subject:looking for advice

hey Justine, Jean.

Random email out of nowhere, hope you’re both doing great, and happy holidays!! :)

So I’ve been thinking and reading a fair bit on group dynamics in CS classes, esp. w.r.t. female students, with a fair bit coming from you guys. There aren’t that many in my classes (I teach undergrad networking and OS, so they’re almost all juniors/seniors by the time they make it to my class). So I’d like to make sure that I’m not contributing any more to the gender imbalance.

I need help. As strong women in CS, would love your take on a couple of key questions (but also would love any general advice you want to share, period).  And I know you're super busy, but hopefully this is something that won't take too much time. Either way, your advice would immediately impact 10s of female students in the coming quarter...

Key questions on my mind right now are:
- How should I do group assignments for larger classes with moderate to heavy projects? About half of my networking class homeworks are in groups of 2, and nearly all of my OS class homeworks/projects are in groups of 2-3.
- From what I have thought about and seen in past classes, I think my past practice of letting students choose their own groups doesn’t work. I recall something like 1/2 of all groups with at least 1 female student experiencing some type of malfunction, either due to the male student(s) flaking out or just failing out.

Right now, I’m considering something like the following:
  - Beginning of quarter, I reach out individually to all female students in the class (maybe 10-15, 20 if I’m lucky), and I ask them to attend an open discussion with me on campus.
  - I ask them for their experiences and concerns in the class, and esp. for group projects
  - I lay out what I think are challenges that they could face
  - I give them the option to find partners within the group, before the overall group formation process starts.

What do you think? Would this work? Would female students react negatively to being singled out? What happens if they don’t care and don’t show up?

Thanks in advance, and again, I’d love to hear any thoughts on this or on any other topic..

thank you thank you!
Ben

--

from:Jean Yang
to:Ben Zhao
cc:Justine Sherry,
Claire Le Goues
date:Wed, Dec 14, 2016 at 12:16 PM

Hi Ben,

  Thanks for writing! These are great questions, I'm glad you're asking them. I'm looping in Claire Le Goues, another professor at CMU, because we've been talking about how we could address some of these collaboration issues with curricular changes, and about proposing an audit of the curriculum to make sure students are learning collaboration skills.

  Here are some things I learned from people after my blog post about the collaboration problem:
- It's important to keep in mind that all students have trouble with collaboration. It may disproportionately affect female and other minority students because 1) there are already so many factors that wear away at their desire to participate, and 2) students without strong social ties within Computer Science may not have access to as desirable of a partner pool. But an important take-away is that all students struggle with learning how to collaborate well, that we don't teach it in lower-level courses, and that in upper-level courses collaboration ability becomes important for academic success all of a sudden.
- There is strong evidence that self-selected groups are not as good as instructor-assigned groups.
- There are many resources out there for helping students work in teams more effectively. I was given this as a starting point:
- There are ways to get students to more actively work on their collaboration skills. Claire addresses this in the software engineering course she teaches. Some professors have reported having students assess how collaborations went, and docking points for students who didn't collaborate well.
- Several women, including myself, said that their best collaborations during undergraduate were with other women. I'm still not sure what to make of this in the context of other findings.

  Given this, I have the following thoughts about your proposed plan:
- I like the idea of talking to students about collaboration issues, but there are two main reasons I wouldn't do it only with the women. First, collaboration is an everyone problem, and not just a women's problem. It also affects people along lines of race, sexual orientation, socioeconomic background, etc. Second, even if the problem were only one of gender, it's a problem to be addressed by people of both genders. I've long believed that in order to solve the gender problem, we need to address the stereotypes associated with both femininity and masculinity. Only involving one of the genders in the conversation places all of the burden on that gender, and when it's the women, we are burdening an already burdened group. For these reasons I'd encourage a discussion about collaboration with the entire class, and then support to ensure collaborations are going as smoothly as possible throughout the semester.
- It doesn't hurt to check in with women and minority students, but without making them feel like they are being singled out, or because you are interested in them primarily because of the women in CS problem. My undergraduate professors paid a lot of attention to me, and I always assumed it was because I was a woman, and in fact this made me feel like I was less deserving of attention.
- I do like the idea of making it easier for minority students to find each other, but I don't know that it's your place to do it as the instructor. I don't know if there's a non-awkward way to bring this up during the whole-group discussion. Also, based on what people say it actually seems better to assign the partners as the instructor, and then it would not seem appropriate to assign people to work together based on their minority status. I'm still really not sure how to think of partner choice vs. partner assignment, and welcome discussion about this!

  Curious to hear your thoughts after your Facebook post about this topic blew up. :)

Best,
Jean

P.S. This discussion is interesting. What do you think about me posting this to my blog, maybe after Claire/Justine chime in?

--

from:Ben Zhao
to:Jean Yang
cc:Justine Sherry,
Claire Le Goues
date:Wed, Dec 14, 2016 at 1:27 PM
hey Jean.

Great thoughts.

I’ve been learning a lot from the various viewpoints on the FB post, but I’m slightly frustrated by the lack of consensus as to the right solution. First, I agree with all the viewpoints that the problem is broader (re: your point on everyone having collaboration issues and others’ points about male students sharing in the solution), and any effort to address it should be more inclusive. That I think is very doable: I can talk about the issue early on in the class with some of Sarita’s slides she shared on the FB post. Hopefully I can do it in a tactful way that doesn’t alienate any group.

But beyond that, I'm sort of torn. It's clear that different personalities play into how different women reacted to my suggestions. Some, like my senior colleague Linda Petzold, reacted fairly negatively because (I think at least in part) she has a really strong personality, and perhaps had less of an issue handling those situations herself. Perhaps I'm generalizing too much based on a sample set of maybe 2-3, but I'm guessing there might be an inverse relationship between a student's own ability to deal with these challenging situations and their sensitivity to being singled out. In other words, is it possible the women (or other minority groups) who are most vulnerable to the negative situations (because they're less assertive or more introverted) would be less concerned about being singled out as a group?  I don't want to downplay comments from you or Linda (and a couple others on the FB thread) about being singled out, but do you think that sensitivity might then be less of an issue for less assertive students? Given my slightly biased sample of strong female colleagues, I'm not quite sure how far off I am on this line of thinking.

My overarching concern is that a broader discussion, while very positive and definitely much better than nothing, is still not quite enough. I worry that individual students will find it difficult to reach out to me the professor to discuss group issues. This has been very much my experience in the past, that students don’t want to appear like they’re a hassle, and no matter how I try to make myself approachable and less intimidating, there’s always a high barrier to overcome (especially for those more shy/introverted students). So all those comments/suggestions that involve groups reaching out and giving me feedback about their own individual group dynamics, I think they’re somewhat naively optimistic.

So I will definitely do what I can for the broader class. But I worry that won’t be enough. Beyond that, I can do random group assignments. But there I foresee lots of complaints by students unable to work with their friends, and any personality conflicts will be blamed on me (which is ok). There I worry that the disruption to the class group formation as a whole will produce more issues, and I haven’t convinced myself that random assignment is a better solution in general.  The other option is more proactively reaching out to female students. There the question is do the ends justify the means: would the potential benefits of helping women students form self-selected groups outweigh the initial discomfort of being “singled out”?

Any/all thoughts welcome, and thanks for spending your time on this. I’m fine with whatever you want to post on your blog about the issue. I think more exposure can only help, as I’m pretty sure that most (if not all) of my male colleagues in the dept have no idea group dynamics is even an issue.

thanks,
Ben

--

from:Claire Le Goues
to:Ben Zhao
cc:Jean Yang,
Justine Sherry
date:Wed, Dec 14, 2016 at 5:46 PM
I don't think that the following response is by any means complete, but here are some offhand thoughts:

I also wouldn't single out female students. I do think you can signal that you are a supportive ally in various subtle and not-so-subtle ways, especially at the start of class.  For example, giving the students a survey wherein you ask for their names, preferred names/nicknames, and pronouns indicates that you are a person who understands that pronouns are a thing worth asking/caring about.  This can indicate to marginalized students that you are more likely to be educated/aware of gender dynamics overall and thus they may feel more comfortable approaching you with concerns.  

My take is: Many women or other members of underrepresented groups *know* that life can be challenging as a woman in homework groups.  Having someone tell me those challenges neither solves them, nor makes me feel much better.  On the other hand, having you publicly talk to everyone, men included, about challenges that groups face, covering elements like subconscious bias, diversity/groupthink, etc, and the ways those forces hinder effective teamwork, might frankly resonate more with the women than singling them out, and might actually get the guys to think about their lives/privilege/behavior a little bit.

What you might do, if you don't necessarily want to go the random route, is ask *all* the students at the beginning (as part of a start of semester survey) if they have someone they want to work with.  You can say something like "I haven't decided yet how to assign groups but I am willing to entertain suggestions, so let me know by filling out this form; I will not share your answers with anyone."  If all the women pick someone reasonable, and all pairs are matched (like I say Jean and Jean says me), then you let them pick their own.   That way you're neither saying "HEY WOMEN YOU ARE BEING SINGLED OUT" but you can still get at the information you want.

We do assign students to groups pseudo-randomly, which is honestly pretty consistent with the literature.  We ask for schedule availability (there's an online tool for this I can dig up) and use that to assign groups, looking to maximize times they can work together while honestly trying, when we can, to split up known cliques.

This is all made easier by the fact that I teach a class explicitly about software engineering, including teamwork and process, and so we can very easily and truthfully say: You will go work for a company and get assigned to work with a team of people you do not know, and so being able to do that effectively is one of the learning goals of this class.  The students are generally receptive to this argument, even though there are always a dysfunctional team or two.  Teaching a systems-y class lends itself to the same line of argumentation: either they're going to industry or academia, and regardless, they need to learn to work with people who are not their friends.

The literature is mixed on group composition.  There is some evidence that putting members of underrepresented groups together is good in early classes (100-200 level), and additional evidence that past that, it doesn't really matter (because if you haven't dropped out yet, group composition is unlikely to be the deciding factor?).

Other thoughts on what we do: 
(a) We provide opportunities for individual assignments/assessments, ideally with each group assignment or milestone (so the first part is group work, and a smaller component is to be done individually).  This lets us identify malfunction and ensures that we are actually assessing individual as well as group performance, 
(b) We do not do peer grading by default but reserve the right to start if teams report serious problems.   Peer grading skews incentives within groups in a way that interferes with our particular learning goals in an SE context. However, it might work in your context where the "learn to work in teams" is less explicitly a goal of the course. 
(c) (speaking to your concern about students being hesitant to surface or discuss issues with you) At various points in the semester we specifically survey the groups about how they think they're doing and aggregate the feedback from everyone to send it back to them (these are teams of 3--5, so it's easier to do anonymously than teams of 2).  The form we used is based on literature on assessing group performance; I can find it if you're interested.  We then reach out to students who are reporting problems.  We talk to them individually and then also reach out to the whole group as appropriate (if the individual student says they're OK with it).  We do encourage them to try to sort it out by talking about it amongst themselves, and regardless, we follow up with the individual students to see how they feel after a week or so.  We do not use those feedback forms for grading in any way (and we tell them so).  I have found that about half the time, the frustrated students just want to vent for a half hour and then say "yeah I feel better, no need to do anything else." ;-)  When we bring them in as groups, we try to make them do as much of the talking as possible.  Like "Hey everyone, how's it going?  What do you think you're doing that's working well?  What are you doing that's not working well?  How can you fix it?"
(d) We cover effective teamwork in class at the start of the semester; practices like "have specific roles that you rotate; document agreements and assignment of responsibilities" (also from the literature---I think the Oakley article Jean linked).  I'd emphasize both the explicit assignment as well as the "rotation" aspect of the roles---otherwise, the female students tend to get "scribe" duties all the time. We've debated ways to enforce those practices (like asking for documentation of who does what), but have never formally done so.  

Honestly, I've never heard anyone say that having the students pick their groups worked out particularly well, frankly.  There are profs who say "Oh, but they complain if we assign them randomly!" And perhaps I'm too cavalier, but my response is: so?  They also complain if the tests are hard and if we give them too much homework, and we do it anyway because it serves our pedagogical goals.  I feel the same way about assigned groups.

Assigned groups don't solve all sources of team dysfunction, of course, and so I think we should as a curricular point do more to mitigate the risks that groupwork poses particularly to marginalized students and to teach students how to work together.  They spend their childhoods being told they're not allowed to work with others, and then we throw them into teamwork situations with no training, and then are surprised when they're terrible at it.  I think covering those challenges and strategies to mitigate them and proactively paying attention to team dynamics over the course of the semester is important to help them learn, though I by no means think we have a complete answer on that.

-CLG

Tuesday, November 29, 2016

Why We Need to Talk About the Collaboration Problem

Today I spoke with a Computer Science professor who is finishing a semester of teaching a notoriously challenging advanced undergraduate course.

"I figured out the problem with my female students," he told me. "It's their partners."

All semester, this colleague--let's call him Albus Dumbledore--had been telling me about the strange phenomenon of drama with his female students and their project partners. The course has a significant project component, and successful completion of the project usually depended on both partners pulling their weight. Mediating partner disputes became the responsibility of the instructor. And what the instructor noticed was that an alarming fraction of the disputes seemed to happen when one of the partners was female.

After wondering all semester how bias might contribute to the drama of the female students' partners, Albus had a revelation. The female students complaining about their partners all seemed to have better overall grades than their partners. Not only did the partners have lower GPAs, but many of them were from outside of Computer Science. Albus surmised that these partners were, in fact, probably not pulling their weight, and that the students had every right to complain.

"But why would these strong students choose such bad partners?" he asked.

That female students had bad partners was, to me, not surprising. After all, nobody had asked me to work on any problem set until the second semester of my sophomore year, and a fellow student only asked me after obtaining an unprotected copy of course grades on our department servers and discovering I had the second-highest midterm score in one of our courses. I told Albus about how a friend once confessed to me that before she had gotten to know me, she had forbidden her boyfriend from working with me. I told him about how problem set partners often preferred to solve problems for me rather than with me. My best collaboration in college had been with another woman, and she had been so initially skeptical of my abilities that it took me at least half of a semester to win her over with how fast and how correct my code was.

"So it's not by choice," Albus concluded. "What can we do about this?"

Important question. For my first few years of college, the collaboration problem had left me feeling so isolated and so much in doubt of my abilities that I often thought about switching away from Computer Science. If not for a chance encounter with a friend, one year behind me and facing similar problems, I might have left. What began as a quick hello as our paths intersected on the way back from class turned into a long discussion about the difficulties we both had in finding people who would collaborate with us. I had graded this woman's homework in multiple classes, so I knew the problem was not that she was not capable. This was when I began to realize that the problem may not be with me, but with the way people perceived me--and other women.

Years later, when I was starting Graduate Women at MIT, this conversation led me to put together a panel on collaboration--specifically, on collaborating as women in male-dominated fields. I felt so validated when the panelists--three women at various stages in their careers, each at the top of her field--said what I had observed for years, but had never dared to say out loud. It can be hard to collaborate with men, one panelist said: they often talk at you rather than to you because they are socialized to impress women. It can be harder to collaborate with two men, another panelist said: they will often talk only to each other while trying to impress you. (I don't like to make blanket statements about all people of a gender, just like I don't like to make blanket statements of all people from a culture, but these kinds of conversations can be helpful for recognizing patterns.) While much of this advice was unsurprising, and also depressing, it felt incredibly powerful to hear someone else say these statements out loud. Talking about this explicitly seemed like the first step towards solving the problem.

In the intervening years, I've collected much more evidence of the problem than I have solutions. It is undeniable that collaborations account for much of people's success in technical settings. Albus talked about how, in his class, the students with subpar partners struggled to complete their projects. A recent study I read* cited female academics' ability to travel for international collaboration as one of the biggest determinants of their success. Yet collaboration seems to remain a problem. At a recent Women@SCS lunch in my department, I spoke about my experiences with Graduate Women at MIT, including the collaboration panel, and the students kept returning to the issue of collaborating in a male-dominated field. Students asked about how to find collaborators who would take them seriously. Students asked about what to do in groups when people may not be listening to them. A student asked what to do if she has had so many negative collaboration experiences that she is reluctant to collaborate anymore. A student said that she, too, felt like male collaborators were often trying to impress her rather than work with her, but she had thought it was in her head.

After the recent lunch, a student asked me about the benefit of talking explicitly about these issues. Wouldn't it be better, she asked, to not draw attention to gender and wait for the problems to go away? I, too, would love to live in a post-gender world where people can just be people. Unfortunately, it seems that collaboration is a topic we need to address explicitly. Not only do these cross-gender/culture problems not seem to be going away on their own, but they also seem to be increasing certain inequalities. Especially in Computer Science, smart people have done an excellent job of solving many other problems of gender equality. I have full confidence that once we recognize this as a problem, we can find good solutions. I would love to hear your ideas.

* In the process of looking for this citation... Let me know if you have it!