The popular “emoji” smiley faces are banned as gTLD domain names for technical reasons, according to ICANN.
Emojis are a form of emoticon that originated on Japanese mobile networks but are now used by 12-year-old girls worldwide due to their support on Android and iPhone operating systems.
It emerged last week that Coca-Cola has registered a bunch of smiley-face domain names under .ws, the Samoan ccTLD, for use in a billboard advertising campaign in Puerto Rico.
.ws was selected because it’s one of only a few TLDs that allow emojis to be registered. Coke is spinning its choice of TLD as an abbreviation for “We Smile”.
This got me thinking: would emojis be something new gTLD registries could start to offer in order to differentiate themselves?
Coke’s emoji domains, it turns out, are just a form of internationalized domain name, like Chinese or Arabic or Greek.
Emoji symbols are in the Unicode standard and could therefore be converted to the ASCII-based, DNS-compatible Punycode under the hood in web browsers and other software.
One of Coke’s (smiley-face).ws domain names is represented as xn--h28h.ws in the DNS.
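As a rough illustration of that conversion, the sketch below uses Python's built-in `punycode` codec to turn a single emoji label into its DNS form and back. The choice of U+1F603 (the open-mouthed smiley) is my own working-out of which code point sits behind xn--h28h, and the helper names `to_a_label` and `to_u_label` are hypothetical. It skips the normalization and validity checks that a real IDNA implementation would perform.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label in the DNS


def to_a_label(label: str) -> str:
    """Encode one Unicode label as Punycode. No IDNA validation is done."""
    if label.isascii():
        return label  # plain ASCII labels go into the DNS unchanged
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_u_label(label: str) -> str:
    """Decode a Punycode label back to Unicode, leaving other labels alone."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


if __name__ == "__main__":
    smiley = "\U0001F603"  # assumed code point, see the note above
    print(to_a_label(smiley) + ".ws")
    print(f"U+{ord(to_u_label('xn--h28h')):04X}")
```

Note that this only shows the encoding step. Whether a registry or a browser accepts the resulting label is a separate policy question.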
Unfortunately for gTLD registries, ICANN told DI last night that emojis are not permitted in gTLDs.
“Emoticons cannot be used as IDNs as these code points are DISALLOWED under IDNA2008 protocol,” ICANN said in a statement.
IDNA2008 is the latest version of the IETF standard used to define what Unicode characters can and cannot appear in IDNs.
RFC 5892 specifies what can be included in an IDNA2008 domain name, eliminating thousands of letters and symbols that were permissible under the old IDNA2003 standard.
These characters were ostensibly banned due to the possibility of IDN homograph attacks — when bad guys set up spoof web sites on IDNs that look almost indistinguishable from a domain used by, for example, a bank or e-commerce site.
But Unicode, citing Google data, reckons symbols could only ever be responsible for 0.000016% of such attacks. Most homograph attacks are much simpler, relying, for example, on the visual similarity of I and l.
Regardless, because IDNA2008 only allows Unicode characters that are actually used in spoken human languages, and because gTLD registries are contractually obliged to adhere to the IDNA2008 technical standards, emojis are not permitted in gTLDs.
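To see why a smiley falls foul of this rule, it may help to look at one piece of the RFC 5892 derivation: the "LetterDigits" test, which only passes code points whose Unicode General_Category is a letter, digit or combining mark. The sketch below checks just that one rule using Python's `unicodedata` module. The function name is hypothetical, and the real algorithm has further rules (exceptions, contextual characters, stability under case folding) that are omitted here, so this should not be read as a full IDNA2008 validator.

```python
import unicodedata

# General_Category values that RFC 5892 groups together as "LetterDigits".
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def is_letter_digit(char: str) -> bool:
    """Return True if the code point passes the LetterDigits category test.

    This is only one rule of the derivation, so a True result does not
    by itself mean the code point is valid in an IDNA2008 label.
    """
    return unicodedata.category(char) in LETTER_DIGIT_CATEGORIES


if __name__ == "__main__":
    for char in ("a", "网", "\U0001F603"):
        category = unicodedata.category(char)
        print(f"U+{ord(char):04X} category={category} "
              f"letter_digit={is_letter_digit(char)}")
```

The emoji is classed as a symbol rather than a letter, digit or mark, so it fails the test, while the Latin letter and the Chinese character pass.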
All new gTLDs have to provide ICANN with a list of the Unicode code points they plan to support as IDNs when they undergo pre-delegation testing. Asking to support characters incompatible with IDNA2008 would result in a failed test, ICANN tells us.
ICANN does not regulate ccTLDs, of course, so the .ws registry is free to offer whatever domains it wants.
However, ICANN said that emoji domains are only currently supported by software that has not implemented the newer IDN protocol:
Emoticon domains only work in software that has not implemented the latest IDNA standard. Only the older, deprecated version of the IDNA standard allowed emoticons, more or less by accident. Over time, these domains will increasingly not work correctly as software vendors update their implementations.
So Coke, while winning brownie points for novelty, may have registered a bunch of damp squibs.
ICANN also told us that, regardless of what the technical standards say, you’d never be able to apply for an emoticon as a gTLD due to the “letters only” principle, which already bans numbers in top-level strings.
ICANN is thinking about expanding its controversial policy on name collisions from new gTLDs to new ccTLDs.
The country code Names Supporting Organization has been put on notice (pdf) that ICANN’s board of directors plans to pass a resolution on the matter shortly.
The resolution would call on the ccNSO to “undertake a study to understand the implications of name collisions associated with the launch of new ccTLDs” including internationalized domain name ccTLDs, and would “recommend” that ccTLD managers implement the same risk mitigation plan as new gTLDs.
Because ICANN does not contract with ccTLDs, a recommendation and polite pressure are about as far as it can go.
Name collisions are domains in currently undelegated TLDs that nevertheless receive DNS root traffic. In some cases, that may be because the TLDs are in use on internal networks, raising the potential of data leakage or breakages if the TLDs are then delegated.
ICANN contracts require new gTLDs to block such names or wildcard their zones for 90 days after launch.
Some new gTLD registry executives have mockingly pointed to the name collisions issue whenever a new ccTLD has been delegated over the last year or so, asking why, if collisions are so important, the mitigation plan does not apply to ccTLDs.
If the intent was to persuade ICANN that the collisions management framework was unnecessary, the opposite result has been achieved.
Attentive DI readers will recall my journalistic meltdown last week, when I tried to figure out how the Chinese new gTLD .网址 managed to hit #2 in the new gTLD zone file size league table, apparently shifting a quarter of a million names in a week.
Well, after conversations with well-placed sources here at NamesCon in Las Vegas this week, I’ve figured it out.
.网址 is the Chinese for “.url”.
Its rapid growth — hitting 352,000 names today — can be attributed primarily to two factors.
First, these weren’t regular sales. The registry, Knet, which acquired original applicant Hu Yi last year, operates a keyword-based navigation system in China that predates Chinese-script gTLDs.
The company has simply grandfathered its keyword customers into .网址, I’m told.
The keyword system allows Latin-script domains too, which explains the large number of western brands that appear in the .网址 zone.
The second reason for the huge bump is the fact that many of the domains are essentially duplicates.
Chinese script has “traditional” and “simplified” characters, and in many cases domains in .网址 are simply the traditional equivalents of the simplified versions.
I understand that these duplicates may account for something like 30% of the zone file.
I’ve been unable to figure out definitively why the .网址 Whois database appeared to be so borked.
As I noted last week, every domain in the .网址 space had a Knet email address listed in its registrant, admin and technical contact fields.
It seems that Knet was substituting the original email addresses with its own when Whois queries were made over port 43, rather than via its own web site.
Its own Whois site (which doesn’t work for me) returned the genuine email addresses, but third-party Whois services such as DomainTools and ICANN returned the bogus data.
Whether Knet did this by accident or design, I don’t know, but it would almost certainly have been a violation of its contractual commitments under its ICANN Registry Agreement.
However, as of today, third-party Whois tools are now returning the genuine Whois records, so whatever the reason was, it appears to be no longer an issue.
The Chinese-script gTLD .网址 powered to the number two spot in the new gTLD rankings by zone file size this week, but it’s doing some things very strangely.
.网址 is Chinese for “.site”, “.url” or “.webaddress”.
The registry is Hu Yi Global, ostensibly a Hong Kong-based registrar but, judging by IANA’s records, actually part of its Beijing-based back-end Knet.
I’m going to come out and admit it: even after a few hours’ research I still don’t know a heck of a lot about these guys. The language barrier has got me, and the data is just weird.
These are the things I can tell you:
- .网址 has 352,727 domains in its zone file today, up by about a quarter of a million names since the start of the week.
- The names all seem to be using knet.cn name servers.
- I don’t think any of them resolve on the web. I tried loads and couldn’t find so much as a parking page. Google is only aware of about eight resolving .网址 pages.
- They all seem to have been registered via the same Chinese registrar, which goes by the name of ZDNS (also providing DNS for the TLD itself).
- They all seem to be registered with the same Knet email address in the registrant, admin and technical contact fields in Whois, even when the registrants are different.
- That’s even true for dozens of famous trademarks I checked: whether it’s the Bank of China or Alexander McQueen, they’re all using that same email address.
- I’ve been unable to find a Whois record with a completed Registrant Organization field.
- Nobody seems to be selling these things. ZDNS (officially Internet Domain Name System Beijing Engineering Research Center) is apparently the only registrar to sell any so far and its web site doesn’t say a damn thing about .网址. The registry’s official nic.网址 site doesn’t even have any information about how to buy one either.
- ZDNS hasn’t sold a single domain in any other gTLD.
- News reports in China, linked to from the registry’s web site, boast about how .网址 is the biggest IDN TLD out there.
So what’s going on here? Are we looking at a Chinese .xyz? A bunch of registry-reserved names? A seriously borked Whois?
Don’t expect any answers from DI today on this one. I’ve been staring at Chinese characters for hours and my brain is addled.
I give up. You tell me.
The .wang gTLD has seen great success, relatively, in its first week of general availability, crossing the 30,000 mark yesterday and entering the top 10 new gTLDs by registration volume.
At its current rate of growth, the Zodiac Holdings domain is going to overtake .在线, the highest-ranking Chinese gTLD so far, this week.
.wang went to GA June 30. After its initial spike, it’s added one to two thousand names per day and, with 31,011 names today, currently sits at 9th place in the new gTLD program’s league table.
That’s a whisker behind TLD Registry’s .在线 (“.online”), which had a strong start when it launched at the end of April but has since plateaued at around 33,000 names, adding just a handful each day.
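The back-of-envelope arithmetic behind that prediction can be sketched as follows, using only the figures quoted above and, as a simplifying assumption, treating .在线 as flat at 33,000 names. The function name is purely illustrative.

```python
def days_to_overtake(current: int, target: int, daily_growth: int) -> int:
    """Whole days needed for `current` to exceed `target` at a fixed daily growth.

    Assumes the target stays flat, which ignores the handful of names
    .在线 is still adding each day.
    """
    if daily_growth <= 0:
        raise ValueError("daily_growth must be positive")
    gap = target - current
    if gap < 0:
        return 0  # already ahead
    return gap // daily_growth + 1


if __name__ == "__main__":
    wang_names = 31_011      # .wang zone size quoted above
    zaixian_names = 33_000   # approximate .在线 plateau quoted above
    fastest = days_to_overtake(wang_names, zaixian_names, 2_000)
    slowest = days_to_overtake(wang_names, zaixian_names, 1_000)
    print(f"{fastest} to {slowest} days")
```

On those assumptions the gap of roughly 2,000 names closes in one to two days, which is consistent with .wang passing .在线 within the week.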
A skim through the zone files reveals that the vast majority of the names in .wang appear to be, like .wang itself, Pinyin — the official Latin-script transliterations of Chinese-script words.
.wang, which would be “网” in Chinese script, means “net”.
To pluck a couple of names from the zone at random, I see tanpan.wang, which could mean something like “negotiation.net” and xingshi.wang, which may or may not mean “shape.net”.
I suspect that many of the registered domains are personal names rather than dictionary words. Wang is a popular surname in China.
The vast majority of the names also appear to be registered via China-based registrars, some of which are promoting the TLD strongly on their home pages.
There certainly appears to be a lot of domainer activity in .wang, but I haven’t seen anything yet to suggest a massive orchestrated effort that would throw out the numbers considerably.
Either way, I find it fascinating that a Latin transliteration of a Chinese word seems set to out-perform the actual Chinese IDNs currently on the market.