Latest news of the domain name industry

ICANN’s Sword algorithm fails Bulgarian IDN test

Kevin Murphy, June 1, 2010, 11:52:32 (UTC), Domain Registries

ICANN has released version 4 of its new TLD Draft Applicant Guidebook (more on that later) and it still contains references to the controversial “Sword” algorithm.
As I’ve previously reported, this algorithm is designed to compare two strings for visual similarity to help prevent potentially confusing new TLDs being added to the root.
The DAG v4 contains the new text:

The algorithm supports the common characters in Arabic, Chinese, Cyrillic, Devanagari, Greek, Japanese, Korean, and Latin scripts. It can also compare strings in different scripts to each other.

So I thought I’d check how highly the internationalized domain name .бг, the Cyrillic version of Bulgaria’s .bg ccTLD, scores.
As you may recall, .бг was rejected by ICANN two weeks ago due to its visual similarity to .br, Brazil’s ccTLD. As far as I know, it’s the only TLD to date that has been rejected on these grounds.
Plugging “бг” into Sword returns 24 strings that score over 30 out of 100 for similarity. Some, such as “bf” and “bt”, score over 70.
Brazil’s .br is not one of them.
Using the tool to compare “бг” directly to “br” returns a score of 26. That’s a lower score than strings such as “biz” and “org”.
I should note that the Sword web page is ambiguous about whether it is capable of comparing Cyrillic strings to Latin strings, but the new language in the DAG certainly suggests that it is.
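
To make the idea of cross-script scoring concrete, here is a toy sketch in Python. It is emphatically not the Sword algorithm, whose internals are not described here. It is a plain weighted edit distance in which swapping two look-alike glyphs costs less than swapping unrelated ones. The `GLYPH_COST` table, its values, and the function names are all assumptions invented for illustration. Only the 0 to 100 scale and the cut-off of 30 echo the figures above.

```python
# Toy visual-similarity scorer. NOT the Sword algorithm.
# All glyph costs below are invented for illustration only.

# Hypothetical substitution costs: 0.0 = looks identical, 1.0 = unrelated.
GLYPH_COST = {
    frozenset("бb"): 0.3,  # Cyrillic be vs Latin b (assumed value)
    frozenset("гr"): 0.2,  # Cyrillic ghe vs Latin r (assumed value)
}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing glyph a with glyph b."""
    if a == b:
        return 0.0
    return GLYPH_COST.get(frozenset((a, b)), 1.0)


def similarity(s: str, t: str) -> int:
    """Score two strings from 0 to 100 using a weighted edit distance."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        cur = [float(i)]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + 1.0,               # delete a
                cur[j - 1] + 1.0,            # insert b
                prev[j - 1] + sub_cost(a, b) # substitute a with b
            ))
        prev = cur
    longest = max(len(s), len(t)) or 1
    return round(100 * (1 - prev[-1] / longest))


def confusable_with(candidate: str, existing: list[str], threshold: int = 30):
    """Return (score, string) pairs scoring above the threshold, best first."""
    scored = [(similarity(candidate, e), e) for e in existing]
    return sorted((p for p in scored if p[0] > threshold), reverse=True)
```

Calling `confusable_with("бг", ["br", "biz", "org"])` ranks the existing strings against the candidate. With a table like this, everything hinges on which glyph pairs are listed and how they are weighted, so a missing or badly weighted pair would be enough to make a cross-script comparison score low.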

Comments (6)

  1. The algorithm is ambitious, comprehensive, and doesn’t work worth a damn. I think ICANN has tacitly admitted as much by saying it will be just one of the methods used.
    We know it as the S-Word algorithm.

  2. John Egbert says:

    Looks like it works now…

  3. Николай Филипов/Nikolay Filipov says:

    And it seems that NOW the Sword/ICANN algorithm has been fixed by someone to report:
    бг vs bf : 78 % [UC]
    (previously it was 26% and so was not listed)
    I doubt who may be that consigliere. I got his name.
    The whole “tool” stinks. Their methodology too.
    I would like to see that tool’s criteria published and explained.

  4. Daniel Kalchev says:

    Today, (22.06.2010 15:13 – just in case it changes tables via random number generator every hour or so) the tool returns:
    бг vs. bf – 78%
    бг vs. br – 68%
    бя vs. br – 81%
    бъ vs. bb – 73%
    The last two strings were suggested to the Bulgarian government as alternatives by some … experts.
    This whole game, if it were not pissing off a whole nation, would be just a funny story.
