FTC PrivacyCon: Your email address is leaking and vulnerable
Opening an email can unknowingly share your email address with other vendors, and hashed emails — a popular ‘protection’ among data providers — can be hacked.
A research paper delivered this morning at the Federal Trade Commission’s third annual PrivacyCon shows how email addresses — the gold standard for linking personal datasets — are vulnerable to data leaks and hacking.
The title of the paper by researchers Steven Englehardt, Jeffrey Han and Arvind Narayanan summarizes the possible reactions of many users hearing about this:
“I never signed up for this! Privacy implications of email tracking”
First of all, Englehardt said in presenting the paper, the act of opening emails often results in processes that resemble web tracking.
If your email client says it is blocking the retrieval of images for your security, part of the reason is that the URL calling those images from other vendors could contain your email address.
In a test, the researchers signed up for hundreds of email lists, received thousands of emails and analyzed what information those emails sent out when opened. Some of the received emails, Englehardt said, were connected to as many as two dozen outside vendors through calls to content like images or to other external services.
“Many (of the calls to those vendors) were leaking the email address,” he said.
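As a rough illustration of what such a leak looks like, the sketch below checks whether a request URL carries an email address, either in plain text or as a common hash. It is not the researchers' tooling. The hostnames (tracker.example, cdn.example), the function names and the choice of MD5, SHA-1 and SHA-256 are assumptions made for the example.

```python
import hashlib
from urllib.parse import unquote


def candidate_tokens(email: str) -> dict:
    """Return forms of the address that a request URL might carry.

    Assumption: the sender lowercases and trims the address before hashing.
    """
    normalized = email.strip().lower()
    raw = normalized.encode("utf-8")
    return {
        "plaintext": normalized,
        "md5": hashlib.md5(raw).hexdigest(),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


def find_leaks(url: str, email: str) -> list:
    """List which forms of the address appear anywhere in the decoded URL."""
    haystack = unquote(url).lower()
    return [name for name, token in candidate_tokens(email).items()
            if token in haystack]


# Hypothetical image requests an opened email might trigger.
email = "user@domain.com"
hashed = hashlib.sha256(email.encode("utf-8")).hexdigest()
urls = [
    "https://tracker.example/pixel.gif?e=user%40domain.com",
    "https://tracker.example/pixel.gif?id=" + hashed,
    "https://cdn.example/logo.png",
]
for url in urls:
    print(find_leaks(url, email))
# ['plaintext']
# ['sha256']
# []
```

A check like this only catches encodings it already knows about, so an empty result does not prove that a request is clean.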
Englehardt noted that companies like email ad monetization platform LiveIntent say they hash users’ email addresses.
Hashing is an algorithmic process that turns user@domain.com into a fixed-length gibberish label, like $o98Cis?. Although gibberish, the label is always the same for a given address and effectively unique to it, so it can be employed as an anonymized identifier.
It’s supposed to be one-way, meaning that you can’t turn the gibberish back into the email addresses.
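To make the idea concrete, here is a minimal sketch of hashing an email address in Python. The article does not say which algorithm any vendor uses, so SHA-256 is an assumption, as are the hash_email name and the lowercase-and-trim step. Real hash output is usually a long hexadecimal string rather than the short label shown above.

```python
import hashlib


def hash_email(email: str) -> str:
    """Hash an address with SHA-256 after trimming and lowercasing it.

    Both the algorithm and the normalization step are assumptions.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = hash_email("user@domain.com")
b = hash_email(" User@Domain.com ")   # same address, different formatting
c = hash_email("user2@domain.com")    # a different address

print(len(a))   # 64 hexadecimal characters
print(a == b)   # True: the same address always yields the same label
print(a == c)   # False: a different address yields a different label
```

That repeatability is what makes the label useful as an identifier: anyone who hashes the same address the same way gets the same string.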
Wrong, say Englehardt and his colleagues. “Hashes are likely to be easily reversible,” he told PrivacyCon.
‘Little to protect the privacy’
Essentially, he said, there is a finite universe of email addresses, estimated by researchers to be something over four billion. You can cheaply buy massive lists, so that, conceivably, a hacker could inexpensively build a database of virtually every email address on the planet.
And then the hacker could use brute computing force to run every email address in that database through the hashing algorithm until one produces the same gibberish identifier as the target.
Bingo.
“Due to [high-end processors],” the researchers wrote in their paper, “trillions of hashes can be attempted at low cost.”
The authors found that “hashing does little to protect the privacy of a user’s email address.”
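The attack described above can be sketched in a few lines. This is a toy version under stated assumptions: the candidate list stands in for a purchased list of billions of addresses, SHA-256 is assumed as the hashing algorithm, and the function names are invented for the example.

```python
import hashlib


def sha256_hex(email: str) -> str:
    """Hash a trimmed, lowercased address with SHA-256 (assumed algorithm)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def build_lookup(candidates) -> dict:
    """Hash every candidate address once and index the results by hash."""
    return {sha256_hex(email): email for email in candidates}


def reverse_hash(target_hash: str, lookup: dict):
    """Return the address that produces target_hash, or None if not listed."""
    return lookup.get(target_hash)


# Hypothetical stand-in for a large purchased list of addresses.
candidates = ["alice@example.com", "bob@example.org", "user@domain.com"]
lookup = build_lookup(candidates)

leaked = sha256_hex("user@domain.com")   # the "anonymized" identifier
print(reverse_hash(leaked, lookup))      # user@domain.com
print(reverse_hash(sha256_hex("nobody@example.net"), lookup))  # None
```

The hash is never mathematically inverted here. The attacker only needs the address to appear somewhere in the candidate list, which is why a finite universe of addresses undermines the protection.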
This is an arrow straight at the heart of many kinds of deterministic data matching, particularly cross-device and “people-based marketing.”
The reason is that an email address is frequently used as the “persistent identifier” to match the user data between, say, your phone, your laptop and your tablet. If you’ve logged on to something on each of those devices using your email address, a data provider is likely to be able to match you via your email address and determine that those are your devices.
Similarly, email addresses are commonly used to match your offline purchases to your online behavior, creating whole-person profiles.
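A minimal sketch of this kind of deterministic matching follows. It assumes each login event arrives as a (device ID, email) pair and that the address is hashed with SHA-256. The device IDs, the function names and the data are all hypothetical.

```python
import hashlib
from collections import defaultdict


def hash_email(email: str) -> str:
    """Hash a trimmed, lowercased address with SHA-256 (assumed algorithm)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def link_devices(logins) -> dict:
    """Group device IDs by hashed email; the hash is the persistent identifier."""
    profiles = defaultdict(set)
    for device_id, email in logins:
        profiles[hash_email(email)].add(device_id)
    return dict(profiles)


# Hypothetical login events seen by a data provider.
logins = [
    ("phone-1", "user@domain.com"),
    ("laptop-7", "User@Domain.com"),
    ("tablet-3", "user@domain.com"),
    ("phone-9", "someone.else@example.com"),
]
profiles = link_devices(logins)
print(sorted(profiles[hash_email("user@domain.com")]))
# ['laptop-7', 'phone-1', 'tablet-3']
```

The match works on the hashed value alone, so a provider can link devices without storing the address in plain text.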
“But don’t worry about our using personally identifiable information (PII) like email addresses,” data providers usually say in effect.
“The address is hashed.”
Marketing Land – Internet Marketing News, Strategies & Tips