Anonymized
Somebody has sued Netflix, claiming that its release of “anonymized” data as part of the Netflix Prize allowed her to be identified. What’s particularly interesting is how it went down:
just weeks after the contest began, two University of Texas researchers — Arvind Narayanan and Vitaly Shmatikov — identified several NetFlix users by comparing their “anonymous” reviews in the Netflix data to ones posted on the Internet Movie Database website. Revelations included identifying their political leanings and sexual orientation.
Putting aside the suit, it’s interesting to think about how anonymous any data can be when there is a plethora of non-anonymous data available for comparison. This is more interesting than the AOL search data incident, because in that case the data itself included the clues. (Here’s a New York Times article about the AOL incident if you want a refresher.) I suspect we will see a lot more cases like this in the future.
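To make the idea of "comparison" concrete, here is a toy sketch of how records with the names stripped out might be linked back to public reviews. This is not the researchers' actual method, just a simplified illustration. The data, the names `overlap_score` and `link`, and the thresholds are all made up for the example.

```python
from datetime import date

# Hypothetical "anonymized" release: numeric IDs instead of names.
# Each record maps a movie title to (rating, date rated).
anonymized = {
    101: {
        "Movie A": (5, date(2005, 3, 1)),
        "Movie B": (2, date(2005, 3, 4)),
        "Movie C": (4, date(2005, 4, 10)),
    },
    102: {
        "Movie A": (3, date(2005, 1, 15)),
        "Movie D": (1, date(2005, 2, 2)),
    },
}

# Hypothetical public reviews posted under a visible username.
public_reviews = {
    "alice": {
        "Movie A": (5, date(2005, 3, 2)),
        "Movie C": (4, date(2005, 4, 12)),
    },
    "bob": {
        "Movie D": (1, date(2005, 2, 3)),
        "Movie A": (3, date(2005, 1, 20)),
    },
}


def overlap_score(anon, public, max_rating_gap=1, max_day_gap=14):
    """Count movies rated similarly, and at about the same time, in both records."""
    score = 0
    for title, (rating, when) in public.items():
        if title in anon:
            anon_rating, anon_when = anon[title]
            close_rating = abs(anon_rating - rating) <= max_rating_gap
            close_date = abs((anon_when - when).days) <= max_day_gap
            if close_rating and close_date:
                score += 1
    return score


def link(anonymized, public_reviews, min_score=2):
    """Map each public username to an anonymized ID when one record clearly wins."""
    matches = {}
    for name, reviews in public_reviews.items():
        scored = sorted(
            ((overlap_score(record, reviews), uid) for uid, record in anonymized.items()),
            reverse=True,
        )
        best_score, best_uid = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0
        # Only claim a match if it is strong enough and unambiguous.
        if best_score >= min_score and best_score > runner_up:
            matches[name] = best_uid
    return matches


matches = link(anonymized, public_reviews)
print(matches)  # {'alice': 101, 'bob': 102}
```

Once a username is tied to an ID, every other rating in that "anonymous" record is exposed too, including movies the person never reviewed publicly.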