Could crypto solve the Whois crisis?
Could there be a cryptographic solution to some of the problems caused by GDPR’s impact on public Whois databases? Security experts think so.
The Anti-Phishing Working Group has proposed that hashing personal information and publishing it could help security researchers carry on using Whois to finger abusive domain names.
In a letter to ICANN, APWG recently said that such a system would allow registries and registrars to keep their customers’ data private, but would still enable researchers to identify names registered in bulk by spammers and the like.
“Redacting all registration records which were formerly publicly available has unintended and undesirable consequences to the very citizens and residents that electronic privacy legislation intends to protect,” the letter (pdf) says.
Under the proposed system, each registry or registrar would generate a private key for itself. For each Whois field containing private data, the data would be added to the key and hashed using a standard algorithm such as SHA-512.
For items such as physical addresses, all the address-related fields would be concatenated, with the key, before hashing the combined value.
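As a rough illustration of the keyed hashing described above, here is a minimal Python sketch. It assumes HMAC-SHA-512 as the way to combine the key with the data; the article only says the data would be "added to the key", so the exact construction, the `hash_field` helper, the `|` separator, the lower-casing and the sample key are all hypothetical choices rather than anything taken from the APWG letter.

```python
import hashlib
import hmac


def hash_field(key: bytes, *values: str) -> str:
    """Return a hex HMAC-SHA-512 digest of one or more Whois field values."""
    # Assumption: trim and lower-case each value so that trivial formatting
    # differences do not produce different hashes.
    normalised = "|".join(v.strip().lower() for v in values)
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha512).hexdigest()


# Hypothetical key; a real one would be random, e.g. secrets.token_bytes(64).
registrar_key = b"example-secret-key"

# A single field is hashed on its own.
email_hash = hash_field(registrar_key, "spammer@example.com")

# Address-related fields are concatenated and hashed as one value.
address_hash = hash_field(registrar_key, "1 Main St", "Springfield", "US")
```

Each result is a 128-character hexadecimal string, which is what would appear in the public record in place of the redacted field.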
The resulting hash — a long string of gibberish characters — would then be published in the public Whois instead of the [REDACTED] notice mandated by current ICANN policy.
Security researchers would then be able to identify domains belonging to the same purported registrant by searching for domains containing the same hash values.
It’s not a perfect solution. Because each registry or registrar would have their own key, the same registrant would have different hash values in different TLDs, so it would not be possible to search across TLDs.
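The matching step, and the cross-TLD limitation, can be sketched in Python as follows. This is an illustration under assumptions: HMAC-SHA-512 stands in for the unspecified keyed hash, and the keys, domain names and the `keyed_hash` and `group_by_hash` helpers are invented for the example.

```python
import hashlib
import hmac
from collections import defaultdict


def keyed_hash(key: bytes, value: str) -> str:
    """Hex HMAC-SHA-512 of a normalised field value (assumed construction)."""
    data = value.strip().lower().encode("utf-8")
    return hmac.new(key, data, hashlib.sha512).hexdigest()


def group_by_hash(records):
    """Map each published hash to the list of domains that carry it."""
    groups = defaultdict(list)
    for domain, field_hash in records:
        groups[field_hash].append(domain)
    return dict(groups)


# Two hypothetical registries, each with its own secret key.
key_a = b"registry-a-secret"
key_b = b"registry-b-secret"
email = "bulk@example.com"

# What a researcher would see in public Whois: domain plus hash only.
published = [
    ("cheap-pills.example", keyed_hash(key_a, email)),
    ("win-prizes.example", keyed_hash(key_a, email)),
    ("cheap-pills.test", keyed_hash(key_b, email)),
]

groups = group_by_hash(published)
group_sizes = sorted(len(domains) for domains in groups.values())
```

Here `group_sizes` is `[1, 2]`: the two names under the first key are linked to each other, while the third name, hashed under a different key, looks unrelated even though the registrant email is identical.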
But that may not be a huge problem, given that bad guys tend to bulk-register names in TLDs that have special offers on.
The hashing system may also be beneficial to interest groups such as trademark owners and law enforcement, which also look for registration patterns when tracking down abusive registrants.
The proposal would create implementation headaches for registries and registrars — which would actually have to build the crypto into their systems — and compliance challenges for ICANN.
The paper notes that ICANN would have to monitor its contracted parties — not all of which may necessarily be unfriendly to spammers — to make sure they’re hashing the data correctly.
Considering that registrars' information seems more authoritative than registries' in a GDPR world, it could be done at registrars only, with registries pointing to that information with RDAP referrals.
But I wonder whether more thought could go into a hash that applies across all registrars and gTLDs, so that it could be used to cross-check ownership across them all. I will try to think of something.
I’m rather sure this will not be GDPR-compliant. Anonymizing information needs to happen in such a way that it cannot be de-anonymized.
If, for example, the domain name north-korea.golf had the same hash as trump.golf, and you were somehow able to guess who owned trump.golf, then the information for north-korea.golf could be de-anonymized.
Interesting point. One scenario that springs to mind is if a domain initially uses the hash, but the owner later opts out of privacy.
This would be an interesting question for ICANN to ask EDPB… the problem is that they have asked the same questions so many times that EDPB is now unwilling to answer ICANN anymore.
Rubens, no need to ask them; this is already self-evident to anyone with the slightest amount of legal knowledge in data privacy.
As long as data can be used to conclude who is behind it, that data is also personal information. Like an IP address, a hash value allows direct determination of the individual data behind it, and therefore needs to be considered personal information.
Non-private data that allows third parties to figure out private data is also private data. Just look at the court decisions regarding IP addresses.
Thanks for confirming my assumption. It is a pity that apparently somebody is putting serious time (and money?) into this while clearly having less knowledge of GDPR than… let’s say… me.
Was anybody from Europe actually involved in this proposal? Because somehow I’m getting the idea that people from the US just can’t get their heads around the whole privacy concept.
I agree with Bart. While this might not be GDPR-compliant in its current shape, it is a first step using privacy by design (PbD). PbD is almost absent from the proposals of the various stakeholders. As a result, they keep wrestling with a very complex situation, which makes it nearly impossible to solve the issues they face and achieve their purposes.
But this is a step in the right direction; it is just a matter of further fine-tuning to reach the desired goal.
I suspect this would be fairly trivial to thwart. Simply having a catch-all on @badguysemail.com and submitting each application with registrant1@, registrant2@, etc. would surely achieve unique hashes for every domain?
Certainly what I used to do to deal with whois spam 🙂
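To make the catch-all objection concrete, here is a small Python sketch. It assumes HMAC-SHA-512 as the keyed hash, and the key and the `keyed_hash` helper are hypothetical; the point is only that any change to the input yields an unrelated digest.

```python
import hashlib
import hmac


def keyed_hash(key: bytes, value: str) -> str:
    """Hex HMAC-SHA-512 of a normalised field value (assumed construction)."""
    data = value.strip().lower().encode("utf-8")
    return hmac.new(key, data, hashlib.sha512).hexdigest()


key = b"registrar-secret"  # hypothetical registrar key

# One registrant, five domains, a different catch-all local part for each.
addresses = [f"registrant{i}@badguysemail.com" for i in range(1, 6)]
hashes = {keyed_hash(key, address) for address in addresses}

unique_hashes = len(hashes)
```

With five addresses, `unique_hashes` is 5, so a search on the email hash alone would not link these registrations. Other hashed fields, such as the postal address, would still match unless they were varied as well.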