ICANN’s Sword algorithm fails Bulgarian IDN test

ICANN has released version 4 of its new TLD Draft Applicant Guidebook (more on that later) and it still contains references to the controversial “Sword” algorithm.

As I’ve previously reported, this algorithm is designed to compare two strings for visual similarity, to help prevent potentially confusing new TLDs from being added to the root.

The DAG v4 contains the new text:

“The algorithm supports the common characters in Arabic, Chinese, Cyrillic, Devanagari, Greek, Japanese, Korean, and Latin scripts. It can also compare strings in different scripts to each other.”

So I thought I’d check how highly the internationalized domain name .бг, the Cyrillic version of Bulgaria’s .bg ccTLD, scores.

As you may recall, .бг was rejected by ICANN two weeks ago due to its visual similarity to .br, Brazil’s ccTLD. As far as I know, it’s the only TLD to date that has been rejected on these grounds.

Plugging “бг” into Sword returns 24 strings that score over 30 out of 100 for similarity. Some, such as “bf” and “bt”, score over 70.

Brazil’s .br is not one of them.

Using the tool to compare “бг” directly to “br” returns a score of 26. That’s lower than the scores that strings such as “biz” and “org” receive.

I should note that the Sword web page is ambiguous about whether it is capable of comparing Cyrillic strings to Latin strings, but the new language in the DAG certainly suggests that it is.
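To make concrete what comparing a Cyrillic string to a Latin one could involve, here is a minimal sketch in Python. It is not the Sword algorithm, whose internals I have not seen. The look-alike table, the function names and the scoring method (a plain sequence-matching ratio from the standard library) are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical look-alike table: Cyrillic letter -> Latin letter it may
# be mistaken for. These pairings are illustrative assumptions, not data
# taken from ICANN or from Sword.
LOOKALIKES = {
    "б": "b",
    "г": "r",
    "а": "a",
    "е": "e",
    "о": "o",
    "р": "p",
    "с": "c",
}


def to_latin_skeleton(s: str) -> str:
    """Replace each character with its assumed Latin look-alike, if any."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in s.lower())


def similarity(a: str, b: str) -> int:
    """Score two strings from 0 to 100 after mapping both to Latin skeletons."""
    ratio = SequenceMatcher(None, to_latin_skeleton(a), to_latin_skeleton(b)).ratio()
    return round(ratio * 100)


if __name__ == "__main__":
    for candidate in ("br", "bf", "bt"):
        print(candidate, similarity("бг", candidate))
```

Under these assumptions, “бг” and “br” collapse to the same skeleton and so get the maximum score. That result comes entirely from the hand-picked table and says nothing about how Sword itself weighs the two strings.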