Governments and corporations gather, store, and analyze the tremendous amount of data we chuff out as we move through our digitized lives. Often this is without our knowledge, and typically without our consent. Based on this data, they draw conclusions about us that we might disagree with or object to, and that can impact our lives in profound ways. We may not like to admit it, but we are under mass surveillance.
Much of what we know about surveillance by the United States National Security Agency (NSA) comes from Edward Snowden, although people both before and after him also leaked agency secrets. As an NSA contractor, Snowden collected tens of thousands of documents describing many of the NSA’s surveillance activities. Then in 2013 he fled to Hong Kong and gave them to select reporters. For a while I worked with Glenn Greenwald and the Guardian newspaper, helping analyze some of the more technical documents.
The first news story to break based on the Snowden documents described how the NSA collects the cell phone call records of every American. One government defense, and a sound bite repeated ever since, is that the data they collected is “only metadata.” The intended point was that the NSA wasn’t collecting the words we said during our phone conversations, only the phone numbers of the two parties, and the date, time, and duration of the call. This seemed to appease many people, but it shouldn’t have. Collecting metadata on people means putting them under surveillance.
An easy thought experiment demonstrates this. Imagine that you hired a private detective to eavesdrop on someone. The detective would plant bugs in that person’s home, office, and car. He would eavesdrop on that person’s phone and computer. And you would get a report detailing that person’s conversations.
Now imagine that you asked the detective to put that person under surveillance. You would get a different but nevertheless comprehensive report: where he went, what he did, who he spoke to and for how long, who he wrote to, what he read, and what he purchased. That’s metadata.
Eavesdropping gets you the conversations; surveillance gets you everything else.
Telephone metadata alone reveals a lot about us. The timing, length, and frequency of our conversations reveal our relationships with each other: our intimate friends, business associates, and everyone in between. Phone metadata reveals what and whom we’re interested in and what’s important to us, no matter how private. It provides a window into our personalities. It provides a detailed summary of what’s happening to us at any point in time.
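To make this concrete, here is a minimal sketch of how little work it takes to turn call records into a relationship profile. Everything in it is an assumption made for illustration: the phone numbers, the timestamps, the record layout, the `summarize` helper, and the 10 p.m. to 6 a.m. definition of "late night." It is not a description of any real system.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical call records: (other party's number, start time, duration in seconds).
# No conversation content appears anywhere in this data.
calls = [
    ("555-0101", "2015-03-02 23:40", 2700),
    ("555-0101", "2015-03-04 23:15", 3100),
    ("555-0199", "2015-03-03 10:05", 180),
    ("555-0101", "2015-03-07 00:10", 1900),
    ("555-0142", "2015-03-05 14:30", 60),
]


def summarize(records):
    """Aggregate, per contact, the call count, total seconds, and late-night calls."""
    summary = defaultdict(lambda: {"calls": 0, "seconds": 0, "late_night": 0})
    for number, start, duration in records:
        hour = datetime.strptime(start, "%Y-%m-%d %H:%M").hour
        entry = summary[number]
        entry["calls"] += 1
        entry["seconds"] += duration
        # Assumed cutoff: calls starting between 10 p.m. and 6 a.m. count as late night.
        if hour >= 22 or hour < 6:
            entry["late_night"] += 1
    return dict(summary)


profile = summarize(calls)
# The contact with the most total talk time is a reasonable guess at the closest relationship.
closest = max(profile, key=lambda number: profile[number]["seconds"])
print(closest, profile[closest])
```

Three long calls to the same number, all late at night, say something about a relationship without a single word of conversation being recorded.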
One experiment from Stanford University examined the phone metadata of about 500 volunteers over several months. The personal nature of what the researchers could deduce from the metadata surprised even them, and the report is worth quoting:
- Participant A communicated with multiple local neurology groups, a specialty pharmacy, a rare condition management service, and a hotline for a pharmaceutical used solely to treat relapsing multiple sclerosis.
- Participant B spoke at length with cardiologists at a major medical center, talked briefly with a medical laboratory, received calls from a pharmacy, and placed short calls to a home reporting hotline for a medical device used to monitor cardiac arrhythmia.
- Participant C made a number of calls to a firearms store that specializes in the AR semiautomatic rifle platform. They also spoke at length with customer service for a firearm manufacturer that produces an AR line.
- In a span of three weeks, Participant D contacted a home improvement store, locksmiths, a hydroponics dealer, and a head shop.
- Participant E had a long early morning call with her sister. Two days later, she placed a series of calls to the local Planned Parenthood location. She placed brief additional calls two weeks later, and made a final call a month after.
That’s a multiple sclerosis sufferer, a heart attack victim, a semiautomatic weapons owner, a home marijuana grower, and someone who had an abortion, all from a single stream of metadata.
Web search data is another source of intimate information that can be used for surveillance. (You can argue whether this is data or metadata. The NSA claims it’s metadata because your search terms are embedded in the URLs.) We don’t lie to our search engine. We’re more intimate with it than with our friends, lovers, or family members. We always tell it exactly what we’re thinking about, in as clear words as possible. Google knows what kind of porn each of us searches for, which old lovers we still think about, our shames, our concerns, and our secrets. I used to say that Google knows more about what I’m thinking than my wife does. But that doesn’t go far enough. Google knows more about what I’m thinking than I do, because Google remembers all of it perfectly and forever.
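The point about URLs is easy to see for yourself. The sketch below assumes a hypothetical search URL in which the query travels in a `q` parameter; the URL and the query text are made up for illustration. Anyone who logs the URL can recover the search terms with a couple of standard-library calls.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical logged URL; the "q" parameter name and the query are assumptions.
logged_url = "https://www.example.com/search?q=symptoms+of+multiple+sclerosis&hl=en"

# Split the URL into its parts, then decode the query string into a dictionary.
parsed = urlparse(logged_url)
params = parse_qs(parsed.query)

# parse_qs returns a list of values per key; take the first search term, if any.
search_terms = params.get("q", [""])[0]
print(parsed.netloc, "->", search_terms)
```

Whether you call the result data or metadata, it is the same string the person typed.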
I did a quick experiment with Google’s autocomplete feature. This is the feature that offers to complete typing your search queries in real time, based on what other people have typed. When I typed “should I tell my w,” Google suggested “should i tell my wife i had an affair” and “should i tell my work about dui” as the most popular completions. Google knows who clicked on those completions, and everything else they ever searched.
Google’s CEO Eric Schmidt admitted as much in 2010: “We know where you are. We know where you’ve been. We can more or less know what you’re thinking about.”
If you have a Gmail account, you can check for yourself. You can look at your search history for any time you were logged in. It goes back for as long as you’ve had the account, probably for years. Do it; you’ll be surprised. It’s more intimate than if you’d sent Google your diary. And while Google lets you see it, you have no rights to delete anything you don’t want there.
There are other sources of intimate data and metadata. Records of your purchasing habits reveal a lot about who you are. Your Tweets tell the world what time you wake up in the morning, and what time you go to bed each night. Your buddy lists and address books reveal your political affiliation and sexual orientation. Your email headers reveal who is central to your professional, social, and romantic life.
One way to think about it is that data is content, and metadata is context. Metadata can be much more revealing than data, especially when collected in the aggregate. When you have one person under surveillance, the contents of conversations, text messages, and emails can be more important than the metadata. But when you have an entire population under surveillance, the metadata is far more meaningful, important, and useful.
As former NSA General Counsel Stewart Baker said: “Metadata absolutely tells you everything about somebody’s life. If you have enough metadata you don’t really need content.” In 2014, former NSA and CIA director Michael Hayden remarked: “We kill people based on metadata.”
The truth is, though, that the difference between data and metadata is largely illusory. It’s all data about us.
This keynote was adapted from the author’s book, Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World, published in 2015.
Bruce Schneier is an American cryptographer and computer security and privacy specialist who has written several books on security and cryptography. He is based in Minneapolis, MN, and can be reached at firstname.lastname@example.org