Sunday, October 05, 2014

Technical Privilege Reading List

People have been asking about the books I mentioned during last Friday's Challenging Technical Privilege Symposium at MIT. (The symposium was fantastic. So many people came and asked great questions. The video should be up soon!) Below I list the books I mentioned, along with some other books for people interested in the topic. Enjoy. (These are heavy topics, but these books are well-written, engaging, and provide actionable solutions.)

Essential Reading
  • The Curse of the Good Girl (Rachel Simmons, 2010). About how we socialize girls to self-censor. An important read for parents and educators of girls, and also for everyone else to understand why women behave more conservatively than men. (The question of whether this is a "curse" is up for discussion, but it certainly holds women back in a man's world.)
  • Talking from 9 to 5: Women and Men at Work (Deborah Tannen, 2001). Georgetown sociolinguist talks about the "cultural" differences between the way men and women speak and how this affects workplace dynamics--and evaluations. She observes, for instance, that men aim to achieve dominance in conversations, while women aim to prevent their conversation partner from being subordinate. Men assert facts; women give compliments. Because of these dynamics, observers will agree that men "win" workplace conversations.
  • Unlocking the Clubhouse (Jane Margolis and Alan Fisher, 2003). Talks about the authors' research about why women tended to shy away from studying computer science and why women leave. They give concrete explanations for how women are socialized to have less interest: for instance, in families the computer will much more likely be in a son's room than a daughter's room. They also talk about phenomena such as how women cite poor performance as the reason they leave computer science, but in fact they are doing better than men who stay. (Curse of the Good Girl!)
  • Why So Slow? The Advancement of Women (Virginia Valian, 1999). I learned so much from this book! It is full of useful facts and explanations. Psychology professor Valian talks about studies that show there is bias against women: for instance, when people are shown resumes with women's names, the resumes need 1.5x the achievements to be assigned the same title. She talks about everything from work to clothing (how men have a uniform whereas all women are "marked") to physical traits (hypermasculinity is associated with competence, whereas hyperfemininity is associated with incompetence) to misconceptions about gender and emotional stability (people talk about women's monthly cycles, but men have both a daily and a yearly cycle). Understanding these gender biases and differences is an important first step towards improvement.

Other Reading

A related thing I also mentioned is the Representation Project's movie The Mask You Live In, coming out in 2015, about American constructions of masculinity and how it's limiting to men. I highly recommend watching the trailer!

Sunday, September 21, 2014

Some Cooking Updates from the Kitchen-Field

I've been making some slow and steady progress in expanding my cooking repertoire. I even got some plants to aid in cooking! Here are some reports from the kitchen-field (because there are plants now, see) covering the main advances. With thanks to Aliza Aufrichtig and Loris D'Antoni for their expertise and consultation.

Here are some things I've started keeping around my kitchen:
  • Ginger root. I use this with onion or garlic as a base flavoring in stir-fries. I also use it to make ginger lemon tea, which pretty much cures all ailments. (Lazy person's amendment to the recipe: just cut the ginger coarsely and cook it for a really long time instead of grating it.)
  • Dried shrimp. Great for various easy-ish Chinese gourd dishes, for instance bitter gourd and winter melon.
  • Sichuan peppercorns. These little numbing things are great for putting in stir-fries (at the beginning is when I do it), broths, and probably other things. You can use them whole or grind them up to distribute the flavor.
  • Parsley. Okay, I was late to the party with this universal garnish. I've started getting it more and I even tried to have a parsley plant for a while. The plant didn't go well; Loris made me feel better about it by saying that parsley plants don't regenerate that much anyway. Loris told me about the trick of freezing parsley into ice trays so you have exactly the right amount to use later. (I am not that fancy; I freeze my parsley all together loosely in a container and then shake out a handful at a time.)
  • Basil. Late to the party on this one too obviously. I didn't start using it a lot until I tried having a basil plant, since I had trouble keeping fresh basil around. (This plant didn't go well either.) In preparation for harvesting all those leaves from my basil plant (that actually wilted before anticipated harvest), I bought a lot of basil from the store and practiced using it. I learned how to freeze it--a real innovation! I also invented a great snack: Greek yogurt, blueberries, honey, and BASIL.
  • Thyme. I only recently learned how to cook with thyme, but it seems to go great with tomato pasta sauces and meats. Loris tells me thyme is a good herb to have fresh in my kitchen, so I recently acquired a thyme plant. Fingers crossed that this plant lasts longer.
  • Turmeric, coriander, cumin, and various other Indian-related spices. (I got a masala dabba to hold them!) They're useful for Indian recipes, obviously. I've also started playing around with putting these spices into stir-fries in small amounts.
  • Thai fish sauce. This one is pretty pungent, but I've started using it to put on noodles (along with sesame oil) and also to flavor stir-fries. It took me a couple of days maybe to get used to the taste and smell, but I really like it now.
I've also started keeping around Sauvignon blanc or Grüner Veltliner (white wines) for the purpose of cooking pasta sauce. I've tried using it with vegetables as well (as well as capers with vegetables), but I haven't entirely gotten the hang of that yet.

Here are some favorite recipes that have proven to be easier than I anticipated:
Here are some snacks and other things I've recently invented:
  • Well, that snack with Greek yogurt, blueberries, honey, and basil.
  • This other snack that involves dicing up a pear or nectarine, heating it with maple syrup, and then adding a couple spoonfuls of Greek yogurt. Optional oatmeal makes it more cobbler-like.
  • This sauce for baked salmon with chopped up capers, parsley, basil, vinegar, and olive oil. Salt and pepper the salmon to your liking, bake at 350 degrees for 15ish minutes, and then put the sauce on.
Also food-related: coconut oil is great for maintaining bamboo cutting boards and sesame oil is a great hair and body moisturizer. Who knew? Jump on; it's trendy to use oils for everything now.

Thursday, September 18, 2014

Experiment: Daily GitHub Checkins

I've been doing a lot of relatively mindless but decently labor-intensive code-related work (colloquially known, especially in the brogrammer community, as "coding bitch work"). I've been building up some web-based case studies in my Jeeves programming language. I've also had to take over some student code. Taking over this code was particularly painful because of all the managerial regret I felt: regret about not having made them document better, about not having made them do more work. The takeover process has involved a lot of commenting, test-writing, and the occasional small extension to test that I really know What's Going On.

Anyway, to try to mitigate the pain of these various tasks, or to spread it out and prolong it, I've decided to break from my usual model of nothing-nothing-nothing-OMGdeadline and do a small task every day that I work (which, note, does not include all days), big enough to warrant a GitHub checkin. (For those on the outside, being a computer science PhD student, at least if you're me, involves a lot of paper-reading, talk-preparing, writing, thinking, and "thinking" in addition to coding.) I hypothesized that this would be good for me to make incremental progress on some things that just aren't fun to do, as well as improve the general documentation state and cleanliness of my code and tools. I get pretty obsessed with arbitrary routine, so it's worked out decently well so far. (Check me out.) This policy has definitely made me write some documentation and tests I otherwise would not have written. (Although my pseudo-officemate Joe would argue that this is not "real work.") I'll report on things after we hit "OMGdeadline" and let you know how well it worked.

In the spirit of doing things in smaller increments, I'm also making it a goal to do smaller blog posts instead of the Blog Essays (also see my profile on Medium) I've gotten into a habit of doing. I've dramatically curbed my email habit (I wrote a thing here), so maybe these more frequent blog checkins will give my pent-up words somewhere to go.

Thursday, May 15, 2014

Dual Booting Windows 8.1 and Ubuntu 14.04

It appears that Windows remains ahead in this operating systems arms race: dual booting with Linux has become even more difficult. Here are some updated instructions from the last time I dual booted, in an "idealized order" I have inferred through my various failures*.
  1. Shrink the size of your Windows partition and create a new simple partition for your Linux installation to go in. (More.)
  2. Get an Ubuntu image onto a DVD or a USB drive.
  3. Turn off Fast Boot in Windows. (More.) If you don't do this, your system is going to boot straight into Windows every time.
  4. Disable Secure Boot in your BIOS. (More.)
  5. Enable UEFI and disable Legacy Boot in your BIOS. (More.) I'm not sure why this has to happen, but my Ubuntu Boot-Repair kept failing until I did this.
  6. Boot from your image. (If you haven't turned off Fast Boot, you might discover that there are new ways of doing this in Windows 8.1. But you should have turned off Fast Boot.)
  7. Follow the instructions and install Linux onto the partition you've set aside for it.
  8. Run Boot-Repair to reinstall your GRUB.
After these many steps, you should be able to enjoy the pleasures of dual boot. Enjoy.

* I found this post to be quite a helpful resource during the process. Because I somehow still kept failing, I felt that my shorter summary may be helpful for people who, like me, thought they didn't need such a detailed step-by-step.

Tuesday, April 01, 2014

Run Your Research Demo Site on the Cloud

Last week, Travis Hance and I spent hours wading through the many blog posts of the internet to figure out how to set up a simple website on Amazon EC2 using our Jeeves language, which runs on Python and C++. Because we want to spare you this trouble, we put together this definitive* post for people who want to run the simplest possible research demo site on Amazon EC2. We cover the following:
  1. How to set up an Amazon EC2 instance and SSH to it (to then install and configure whatever you like).
  2. How to set up and configure an Apache web server on your Amazon EC2 instance.
  3. How to set up your database and what to do if you want to host your own database on your Amazon EC2 instance.
  4. How to configure virtual hosts on your Apache web server if you want to use the same server to host different projects on different subdomains.
This post assumes you have experience using Django and testing things on your local machine. We're using Django 1.6.2 with Apache 2.4.9. These instructions are tailored for an Ubuntu instance, but they probably generalize as well.

Is Amazon EC2 for me?

The first thing to do is to determine whether you need to run your own EC2 server. Amazon's Elastic Compute Cloud (EC2) gives you elastic compute in the cloud. The biggest win is you can easily change how much capacity you have with minimal friction. It's also just a nice way to host servers without managing your own physical machines.

If you just need vanilla Django hosting, then you should probably find some other hosting service that can manage things for you. In our case, we wanted to use the Z3 SMT solver, which runs on C++, so we needed to run our own server.

Fellow CSAILers may be interested to learn that I have also set up a mirror site on our department's OpenStack cloud. This is free for people in our department and is useful if you don't need permanent cloud data storage.

How to set up a cloud instance.

Once you decide you want to set things up on EC2, it's pretty easy to get started. As of the time we signed up, there is a free Linux tier that gives you 750 hours at no cost. Amazon recently announced further price cuts, so the situation may be even more exciting by now. To set up your own Amazon EC2 server, sign up here and follow the instructions for launching a new instance.

SSHing to your EC2 instance.

In order to SSH to your instance, you will need to set the permissions of your server to allow this. You can do this by going to your EC2 management console and adding your IP address (or all IP addresses if you want to live on the edge) to the "Inbound" list of allowed SSH addresses.

You'll also have to use an RSA key, which you should have generated sometime during the setup. Go to the "Instances" tab under your console to get the public DNS name. Then you can SSH to your instance:

ssh -i [location of your RSA private key] [username]@[public DNS name]

For Ubuntu instances, the username is "ubuntu."


Installing software.

Congratulations! You now have root access on an EC2 instance. You have the freedom to install software the way you would on any other machine. You can check out a copy of your code, as well as everything you need to run it, this way.

How to run a web server.

We'll be describing how to use the Apache HTTP web server for serving websites off your machine. To run your server, first download Apache and the WSGI (Web Server Gateway Interface) module for interfacing with Python programs.

sudo apt-get install apache2 libapache2-mod-wsgi

Once you have done this, you should be able to access the Apache configuration file in /etc/apache2/apache2.conf. This file tells Apache how to serve your webpages, by describing for instance how paths should be resolved.

To make sure your Apache server knows about your demo project, first you'll want to set your Python path and alias your / path to wherever your WSGI configuration file is.

WSGIScriptAlias / /home/ubuntu/srv/testproject/testproject/
WSGIPythonPath /home/ubuntu/srv/testproject
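For readers who have not seen one, here is a minimal sketch of the kind of WSGI file that WSGIScriptAlias points at. It is not our actual file: a Django project generates its own, and the greeting text here is made up. The one thing mod_wsgi does rely on is a callable named application. A stand-alone file like this can be handy for checking that Apache and mod_wsgi talk to each other before Django enters the picture.

```python
# A bare-bones WSGI application (illustrative only; a Django project
# ships its own WSGI file, so this is just a sanity-check stand-in).
def application(environ, start_response):
    """Answer every request with a short plain-text body."""
    body = b"Hello from mod_wsgi\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # start_response takes the status line and a list of header tuples.
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings.
    return [body]
```

If this loads in the browser but your Django site does not, the problem is more likely in the project paths than in the Apache setup.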

You'll also want to add "Alias" entries for the static/ and media/ directories:

Alias /static /home/ubuntu/code/jeeves/demo/conf/static
Alias /media /home/ubuntu/code/jeeves/demo/conf/media

Finally, you'll want to add a "Directory" entry to set the permissions for the directory where you'll be serving your Python files from.

<Directory /usr/share>
  AllowOverride None
  Require all granted
</Directory>

To put these changes into effect, restart your Apache server:

sudo /etc/init.d/apache2 restart

You'll also want to change the ownership of your static/ and media/ directories so that they belong to the www-data user and group.

sudo chown -R www-data:www-data path/to/static/
sudo chown -R www-data:www-data path/to/media/

Now everything should work! Go to your hostname in the browser and see for yourself. Okay, so it is likely that there were some configuration errors and you get a "Bad request" or other error. When this happens, it is helpful to check your Apache error log, which can be found in /var/log/apache2/error.log.
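If you would rather peek at the log from Python than from the shell, here is a small sketch of a helper that returns the last few lines of a file. The function name tail is my own choice and nothing here is specific to Apache.

```python
from collections import deque


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, "r", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent n lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
```

Calling tail("/var/log/apache2/error.log") should show the most recent errors, though reading that file may require sudo on your instance.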

Oh, and for Apache configuration files: a gotcha is that order matters, so for redirects you should put the most specific first and the most general last. A consequence of this gotcha is that if you have aliased '/' and you already have a Directory entry for '/', you need to move this to be after the Directory entry for the directory aliased to '/'.
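To see why order matters, here is a toy model of first-match prefix rules. It is not Apache's actual matching code, and the paths and the resolve function are made up for illustration. The point is only that a general rule listed first swallows requests meant for a more specific one.

```python
def resolve(aliases, url_path):
    """Return the file path chosen by the first alias whose prefix matches."""
    for prefix, target in aliases:
        if url_path.startswith(prefix):
            rest = url_path[len(prefix):].lstrip("/")
            return target.rstrip("/") + "/" + rest
    return None


# Hypothetical rule lists: same rules, different order.
general_first = [("/", "/srv/app"), ("/static", "/srv/static")]
specific_first = [("/static", "/srv/static"), ("/", "/srv/app")]

print(resolve(general_first, "/static/site.css"))   # /srv/app/static/site.css
print(resolve(specific_first, "/static/site.css"))  # /srv/static/site.css
```

With the general rule first, the request for the stylesheet never reaches the /static rule, which is the same shape of mistake the gotcha describes.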


Setting up your database.

If your Django application uses a database, you'll want to hook that up as well. Django has pretty good documentation for how to edit your settings for the database of your choice. You may need to install Python-specific libraries for interfacing with these databases. For instance, for MySQL you will want to install the python-mysqldb Ubuntu package. Once you have configured your database settings, running "syncdb" will set up your tables:

python manage.py syncdb
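For reference, the database settings mentioned above might look roughly like this for MySQL. This is a sketch, not our real configuration: the database name, user, and environment variable are hypothetical placeholders.

```python
import os

# Django settings fragment (sketch). The names below are placeholders,
# not values from the original setup.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "demo_db",
        "USER": "demo_user",
        # Read the password from the environment instead of hard-coding it.
        "PASSWORD": os.environ.get("DEMO_DB_PASSWORD", ""),
        "HOST": "localhost",  # database hosted on the same instance
        "PORT": "",           # empty string means the default port
    }
}
```

Keeping the password out of the settings file is a design choice on my part; it makes it harder to accidentally commit credentials along with your demo code.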

We found that our site ran much faster if we hosted the database locally. We followed the standard instructions for installing and running a MySQL database. For those who have never done this before, here is what you should expect to do:
  1. Install MySQL server.
  2. Configure your server by, for instance, setting a password for the root user.
  3. Start your MySQL server.
  4. Create a new MySQL database for use by your web application.
EC2-related: if you want to be able to access your database through SSH from other hosts (for instance, to back up your database from elsewhere), you will need to add a SQL entry to your security settings permitting access from the allowed IP address(es).


Getting ready for production.

Now you are ready to go! For your website to look the most professional, you will want to set DEBUG = False in your settings file. Once you do this, you will need to make sure the ALLOWED_HOSTS list includes your domain. An easy way to do this is to add the host '*' to the list.
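As a sketch, the relevant settings might look like the following. The host names are hypothetical stand-ins for your own domain. Listing specific hosts is a stricter alternative to '*', which accepts any Host header.

```python
# Django settings fragment (sketch); the domain names are made up.
DEBUG = False

ALLOWED_HOSTS = [
    "demo.example.org",  # one specific host
    ".example.org",      # leading dot: the domain and all of its subdomains
]
```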

And make sure the secret key you use in production is secret! 
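One common way to keep the key out of your repository is to read it from an environment variable. Here is a minimal sketch, assuming a variable named DJANGO_SECRET_KEY; that name and the helper function are my own inventions, not something Django requires.

```python
import os


def load_secret_key(env_var="DJANGO_SECRET_KEY"):
    """Fetch the secret key from the environment, failing loudly if absent."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError("Set the %s environment variable first." % env_var)
    return key

# In the settings file you would then write:
#   SECRET_KEY = load_secret_key()
```

Failing at startup is deliberate: a missing key is easier to notice there than after the site has been running with a default.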


How to host multiple projects on one server.

You might want to serve multiple demos, each with its own Django project. There are a couple of ways to do this. One is to do the appropriate aliasing in your Apache configuration file for different subdirectories. If you go this route, you will have to make sure your redirects, includes, etc. point to the right place.

Another option, the one we took, is to use virtual hosts to put each project on its own subdomain. Here is how to add each new virtual host:
  1. Add a VirtualHost entry to your /etc/apache2/sites-available/[site name].conf file. For the main site the file is 000-default.conf.
  2. Enable this site:
    a2ensite [site name]  
  3. Reload your Apache configuration:
    sudo /etc/init.d/apache2 reload
Here is my VirtualHost configuration, which lives in a file under /etc/apache2/sites-available/. This post is getting long so I'm getting too lazy to explain all the parts, but you can see how I'm specifying paths, aliases, listening on port 80, and all that good stuff.

<VirtualHost *:80>
    DocumentRoot /home/ubuntu/code/jeeves/demo/conf

    WSGIDaemonProcess jconf processes=5 threads=1
    WSGIScriptAlias / /home/ubuntu/code/jeeves/demo/conf/
    ErrorLog /var/log/apache2/jconf-error.log

    Alias /static /home/ubuntu/code/jeeves/demo/conf/static
    Alias /media /home/ubuntu/code/jeeves/demo/conf/media
    Alias /logs /home/ubuntu/code/jeeves/demo/conf/logs

    <Directory /home/ubuntu/code/jeeves/demo/conf>
        Order deny,allow
        Allow from all
    </Directory>
</VirtualHost>

Note that if you want things to run on subdomains, 1) you will need to use your own domain (rather than Amazon EC2's dynamically assigned DNS) and 2) you need to make sure you have DNS entries for the subdomains (you need to tell someone which IP addresses you would like for these subdomains to resolve to). There are instructions here about setting up your own domain name with EC2. Instructions for mapping subdomains will vary based on domain manager. (For CSAIL domains created with WebDNS, you can create subdomains by editing your hostname file and adding aliases for your subdomains.)


A final word.

There are a lot of details (version numbers; deprecation; death) involved with these web things, but it is so satisfying to get everything working. And if at first you don't succeed, try, try, try again.

* This claim is intended to be tongue-in-cheek. I had told Travis that there was so much misinformation on the internet that I wanted to write the definitive blog post. He laughed because this sentiment surely motivates every other post out there.

Tuesday, January 07, 2014

Careful Where You Click

The other day, a friend and I were talking about how fun it was to check Google Analytics. When I asked her if she knew most of her readers, she said she could figure out who many of them were. Especially if they were outside New York. Especially if they were in some remote location.

Google Analytics, a fantastic tool for optimizing your website, can also be a precise tool for stalking. All you need to do is insert a bit of code in the header and you can know who has visited your website, how they came across your website, how long they visited your site, and whether they have been to your site before. I show on the left a screen shot of analytics on the visitors from Florida for one of my websites*. Since there is only one person accessing the site from each of Gainesville and South Miami, we can figure out how much time each person spent on the site. As my friend Rishabh would say, "Walk the walks, stalk the stalks."
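Here is a small sketch of why a lone visitor from a city stands out. The rows are made up for illustration (they are not my real analytics), and the helper name is my own. Any city with exactly one visitor maps straight to one person's time on the site.

```python
from collections import Counter

# Made-up rows in the shape of an analytics export: (city, seconds on site).
visits = [
    ("Gainesville", 312),
    ("South Miami", 45),
    ("Orlando", 120),
    ("Orlando", 30),
    ("Orlando", 610),
]


def unique_visitor_cities(rows):
    """Return {city: seconds} for cities that have exactly one visitor."""
    counts = Counter(city for city, _ in rows)
    return {city: secs for city, secs in rows if counts[city] == 1}


print(unique_visitor_cities(visits))
```

The three Orlando rows blur together, while the single rows for the other two cities each describe exactly one reader.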

We've recently seen that even if you use tools that supposedly hide you, there are still ways you can be tracked. Last month, Special Agent Thomas Dalton released this document describing how he traced the bomb threats against Harvard buildings during exams to a Harvard student hoping to avoid exams. To avoid detection, the student had used Tor, software that prevents others from detecting the IP address from which an action originates--akin to anonymizing a phone number before making a prank call. Unfortunately for the student, the Harvard network was able to figure out who was accessing Tor at the time the e-mail was sent, narrowing the suspect pool down to our perpetrator.

There are many ways that you can be exposed if you are among a small handful of people engaging in some behavior. Last month, I attended a fascinating talk by Mike Specter at MIT about how people can track searches of obscure terms. Mike had done a search for "Pentagon SMO code" and discovered there was a (Chinese government) website showing up towards the top of the search results. When he clicked on this website, he discovered that it showed its own search results for the terms--but not before tracking that he had visited the site**. Upon further investigation, Mike discovered that the site had performed search-engine optimization so that it could show up towards the top of the search results. Because search engines rely on automated algorithms, people can trick the algorithms into thinking their page is more important or more relevant to a given search term. This page used particularly insidious optimization techniques, including putting invisible comments called "pingbacks" on legitimate websites (such as the Yale University homepage) and using the domain so as to show up in all searches and not just searches in China. (Google personalizes results based on search location.) Mike found that this website showed up for many long-tail (obscure) searches, both relevant to the American government and otherwise. While it's not clear what this site is doing with this information (Mike speculates profit motives), what is clear is that people can easily be exposed while performing obscure searches.

While we can hope that researchers like Mike continue to watch out for us, we will also need to learn to watch out for ourselves. As our lives come to depend on increasingly complex technology, it is important for everyone to develop a basic level of technological literacy. It is clear that technology can be quite powerful in reducing and compromising internet privacy. It is up to us to define how much we allow: by raising awareness, by taking precautions, and by protesting when we find something to be unreasonable. As recent legislative decisions have shown, powerful people have been doing a lot of thinking about the future of the internet. It is up to us to make sure the future is not one in which we have no choice but to be exposed!

* Not this blog, guys. Florida readers, you remain anonymous here! 
** How much the site can find out about you depends on the level of sophistication of the site and the measures you have taken to hide yourself. The WSJ has a nice widget at the top that tells you what they can find out about you. I show mine below.

Wednesday, January 01, 2014

2014 is the Year of Laughing like Kafka

This year, I'm trying this trend where I have a theme instead of a resolution. And this isn't just because I broke all of my concrete resolutions by April of last year...

For 2014, the theme is to laugh like Kafka. Franz Kafka, who wrote the most dark and wonderful stories, would encourage everyone to laugh at the absurdly dark situations of his protagonists. During readings, apparently he would laugh so jovially and in such contrast with the grim content that people would be confused. Instead of feeling stressed or angry or scared, I want to laugh like Kafka at the absurdity of my own life.

It's been getting harder and harder to not take life too seriously. My mother is always telling me, "Aren't you getting a little old to have blogs where you take pictures of yourself wearing funny glasses?" (See here and here.) Junior Ph.D. students keep saying to me, "Aren't you really old? Why aren't you more serious?" And people are always saying, "You went to Harvard and MIT. Shouldn't you be, like, really serious?" Even though I'm getting really old and I subscribe to The Economist, this does not mean I need to have a permanent scowl.

There are many things to laugh about. How the mental health of Ph.D. students can be measured in terms of the number of days, and sometimes hours, in between existential crises. The fact that there exist contraptions called epilators that have dozens of tiny tweezers for pulling out body hair. That time I spent the most money ever on a tasting menu meal and then spent the next day incapacitated, conducting several important meetings via phone from bed, purging my digestive system in between. Misogynists, racists, xenophobes, and homophobes. Plagues of frogs, locusts, darkness, and death of the firstborn...

Laughing does not mean being disrespectful or apathetic. It simply involves seeing a situation in a way that is less weighty and overwhelming. So whenever I am looking less than happy in 2014, ask whether I should be laughing instead.