Social media users now number more than 1.4 billion — more than half of the Earth’s Internet-using population. We share a lot of information on social media, but it turns out we are sharing far more than we think. Seemingly innocuous information, when analyzed against tens of thousands of other profiles, can reveal secrets you never intended to share.
New computer algorithms borrow concepts from psychology and sociology and use them to find patterns in massive amounts of data. Most of these algorithms are sitting on a researcher’s computer in a lab somewhere (I have some of them myself), inaccessible to the public. That means your secrets are relatively protected, but it also means you can’t find out what the algorithms know about you.
That is changing, though. A couple of very interesting tools have now come online, allowing you to find out just what an algorithm can guess about you from your social media profiles.
One study in this space, published in 2013 by researchers at the University of Cambridge and their colleagues, gathered data from 60,000 Facebook users and, with their Facebook “likes” alone, predicted a wide range of personal traits. The researchers could predict attributes like a person’s gender, religion, sexual orientation, and substance use (drugs, alcohol, smoking).
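The machinery behind this kind of prediction is less exotic than it sounds: represent each user as a row of 0s and 1s over all the pages they could like, compress that big sparse matrix, and fit a regression model for each trait. Here is a minimal sketch of that general recipe in Python; the data below is invented toy data, not anything from the study, and the specific parameters are just illustrative.

```python
# A minimal sketch: users-by-likes matrix -> dimensionality reduction -> regression.
# All data here is invented; the real study worked with tens of thousands of users.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_users, n_pages = 2000, 500

# 1 = this user liked that page (sparse: roughly 5% of pages liked per user)
likes = (rng.random((n_users, n_pages)) < 0.05).astype(float)

# A hidden binary trait loosely tied to a handful of pages, plus noise
trait = (likes[:, :20].sum(axis=1) + rng.normal(0, 1, n_users)) > 1

X_tr, X_te, y_tr, y_te = train_test_split(likes, trait, random_state=0)

svd = TruncatedSVD(n_components=50, random_state=0)   # compress the like patterns
clf = LogisticRegression(max_iter=1000)               # one model per trait
clf.fit(svd.fit_transform(X_tr), y_tr)

probs = clf.predict_proba(svd.transform(X_te))[:, 1]
print("held-out AUC:", round(roc_auc_score(y_te, probs), 3))
```

With real like data, the same pattern is simply run once per trait (gender, politics, and so on), and the model's probability becomes the "guess" about you.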
These predictions didn’t come from obvious likes. For example, if someone likes the Republican Party’s Facebook page, it probably indicates that person is a Republican. But the connections used in this research are not obvious. The top likes that were strongly indicative of high intelligence, for instance, were “Thunderstorms,” “The Colbert Report,” “Science,” and “Curly Fries.”
How could liking curly fries be predictive? The reasoning relies on a few insights from sociology. Imagine one of the first people to like the page happened to be smart. Once she liked it, her friends saw it. A social science concept called homophily tells us that people tend to be friends with people like themselves. Smart people tend to be friends with smart people. Liberals are friends with other liberals. Rich people hang out with other rich people.
So if a smart person likes curly fries, her (smart) friends see that and some of them like the page, too. The same goes for their friends and so on. Basically, the liking spreads through a part of the social network that happens to be more intelligent. After a while, liking the curly fries page happens to become a thing that smart people do. When you like it, the algorithm guesses that you must also be smart.
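You can watch this play out in a toy simulation. The numbers below are invented, but the setup mirrors the argument: friendships form more often between similar people, one smart user seeds the like, and it spreads along friendships.

```python
# Toy simulation of the homophily argument (all numbers invented).
import random
random.seed(1)

N = 3000
smart = [random.random() < 0.5 for _ in range(N)]

# Friendships with homophily: same-flag pairs connect far more often.
friends = [[] for _ in range(N)]
for _ in range(N * 10):
    a, b = random.randrange(N), random.randrange(N)
    p = 0.9 if smart[a] == smart[b] else 0.1
    if a != b and random.random() < p:
        friends[a].append(b)
        friends[b].append(a)

# Seed the "curly fries" like with one smart user, then let it spread.
liked = {next(i for i in range(N) if smart[i])}
frontier = list(liked)
for _ in range(4):                              # a few rounds of exposure
    new = [f for u in frontier for f in friends[u]
           if f not in liked and random.random() < 0.3]
    liked.update(new)
    frontier = new

share_smart = sum(smart[u] for u in liked) / max(len(liked), 1)
print(f"{len(liked)} likers, {share_smart:.0%} of them smart "
      f"(vs. {sum(smart)/N:.0%} of everyone)")
```

Even though the page itself has nothing to do with intelligence, the set of likers ends up skewed toward the trait of the people who spread it, and that skew is exactly what the algorithm picks up on.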
For a long time, we could only imagine the things these algorithms could predict about us as individuals. Fortunately, last year the Cambridge researchers put their algorithms online—and two months ago, they completely revamped their websites to give you even better info about yourself. There are two sites where you connect with your Facebook account. The algorithms access and analyze your likes, run statistical comparisons, and make guesses about you.
On the first site, YouAreWhatYouLike, the algorithms will tell you about your personality. This includes your openness to new ideas, whether you lean extraverted or introverted, your emotional stability, your warmth versus competitiveness, and how organized you are.
The second site, Apply Magic Sauce, predicts your politics, relationship status, sexual orientation, gender, and more. You can try it on yourself, but be forewarned that the data is in a machine-readable format. You’ll be able to figure it out, but it’s not as pretty as YouAreWhatYouLike.
These aren’t the only tools that do this. AnalyzeWords uses linguistic analysis to profile the personality you project on Twitter. It does not look at the topics you discuss in your tweets, but at surface features of your language: how often you say “I” versus “we,” how frequently you curse, and how many anxiety-related words you use. The interesting thing about this tool is that you can analyze anyone, not just yourself.
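You can compute crude versions of those signals yourself. Here is a sketch; the word lists are my own illustrative stand-ins, not AnalyzeWords’ actual dictionaries.

```python
# Crude sketch of the kind of surface features described above
# (pronoun use, swearing, anxiety words). Word lists are placeholders.
import re
from collections import Counter

I_WORDS = {"i", "me", "my", "mine"}
WE_WORDS = {"we", "us", "our", "ours"}
ANXIETY_WORDS = {"worried", "nervous", "afraid", "anxious", "stressed"}
SWEAR_WORDS = {"damn", "hell"}          # illustrative only

def tweet_features(tweets):
    words = [w for t in tweets for w in re.findall(r"[a-z']+", t.lower())]
    counts = Counter(words)
    total = max(len(words), 1)
    return {
        "i_minus_we": sum(counts[w] for w in I_WORDS) - sum(counts[w] for w in WE_WORDS),
        "pct_swear": 100 * sum(counts[w] for w in SWEAR_WORDS) / total,
        "pct_anxiety": 100 * sum(counts[w] for w in ANXIETY_WORDS) / total,
    }

print(tweet_features([
    "I am so worried about my exam tomorrow",
    "We had a great time at the park",
]))
```

The real tool maps features like these onto personality scales; the point is simply that none of it depends on what you are actually talking about.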
The most important lesson to take away from these algorithms is that you cannot control what is predicted. If you wanted to hide your religion, for example, it would not be enough to steer clear of any overt signs online. Seemingly unrelated acts, such as liking a particular animal or TV show, could be part of an unseen pattern that reveals your faith. The unconscious choices that influence your language patterns could reveal it, too. That means we as users have very little control over these algorithmic insights. (If you want more detail on these algorithms, you might want to check out my TED talk.)
For now, these tools are mostly just helpful to users who want to understand these algorithms and some of what they are discovering, but there will be real implications when these tools become more widely available in their full forms.
What happens if the algorithm reveals you are gay and you live in a country where that is illegal? What happens if an algorithm discovers you are single and pregnant and you work for a company with religious management who will fire you for acting out of line with their beliefs? What if your insurance company uses this to discover your medical conditions and then uses the results to set your level of coverage? What if social media predictions about your reliability and financial state are used in your credit score? What if these inferred attributes are used by law enforcement when you are suspected of a crime? Or to put you on a terrorist watch list? Especially when the algorithms can be wrong.
Right now, there are basically no legal protections in the U.S. that would prevent someone from collecting your data and using it in these ways. However, there are steps you can take to protect yourself.
These algorithms rely on having data about you. The less information that's available, the less effectively they work. In fact, when I tried to run YouAreWhatYouLike on myself, I got this message:
Sorry, it seems like you have too few Facebook Likes for us to accurately predict your personality. As psychometricians we believe in understanding people, not guesswork, so please check back another time.
Thanks for trying You Are What You Like!
Since January, I’ve been regularly purging my social media profiles. I delete just about everything—every like, comment, and post—that is more than three weeks old. It means there is not much out there that can be used to predict things about me. More automated tools to support this are popping up. (Tweet Delete is a great example — it’s easy to use and straightforward in what it does.)
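Tweet Delete and services like it work on roughly this pattern. If you wanted to roll your own, a sketch using the tweepy library might look like the following; the credential placeholders are mine, and whether your current Twitter/X API access tier still permits these calls is another question.

```python
# Hedged sketch of "delete anything older than three weeks" for Twitter.
# Requires your own API keys; deletions are permanent on Twitter's side,
# though the company may retain copies of deleted data.
from datetime import datetime, timedelta, timezone
import tweepy

auth = tweepy.OAuthHandler("API_KEY", "API_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

cutoff = datetime.now(timezone.utc) - timedelta(weeks=3)

for status in tweepy.Cursor(api.user_timeline, count=200).items():
    created = status.created_at
    if created.tzinfo is None:              # older tweepy versions return naive UTC times
        created = created.replace(tzinfo=timezone.utc)
    if created < cutoff:
        api.destroy_status(status.id)       # permanently deletes the tweet
        print("deleted", status.id)
```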
This strategy won’t solve every problem, though. Social media companies can keep copies of your deleted data. You leave digital traces in other ways, too, like when you use your credit card or a store loyalty card. Your Web browsing behavior is also tracked (though there are browser options and extensions that will block a lot of this).
Still, careful curation of social media profiles will keep a lot of data out of the hands of third parties — and it’s a smart strategy for controlling the impression you give to other people, too.
Disclosure: The author of this piece previously served as a technical witness in a patent lawsuit against Facebook.
This article is part of Future Tense, a collaboration among Arizona State University, New America, and Slate. Future Tense explores the ways emerging technologies affect society, policy, and culture. To read more, visit the Future Tense blog and the Future Tense home page. You can also follow us on Twitter.
This article originally appeared at Slate. Copyright 2014.