Touched by an IDN: Farsight Security shines a light on the Internet's oft-ignored and undetected security problem



Executive Summary

Committed to making online interactions safer for all users, Farsight Security regularly investigates systemic threats to the Internet. The design and implementation of the DNS Internationalized Domain Name (IDN) system poses such a threat – one well known by DNS industry insiders and security professionals but not known or well understood by the wider public. The purpose of this research is to bridge that knowledge gap – to offer a keyhole glimpse into the shadowy world of brand lookalike abuse via IDN homographs.

Registration of confusing Internet DNS names for the purpose of misleading consumers is not news. Every user of the Internet learns – often the hard way – that much of the email they receive is forged, and many of the World Wide Web links they are prompted to click on are malicious. Yet IDN, a DNS standard representing non-English domain names, allows forgeries to be nearly undetectable by either human eyes or human judgement, or by traditional Internet user interface tools such as email clients and web browsers.

Using its real-time DNS network, Farsight Security conducted new research to determine the prevalence and reach of homographs, in the form of IDN lookalike domains, across the Internet. Specifically, Farsight examined 125 top brand domain names, including large content providers, social networking giants, financial websites, luxury brands, cryptocurrency exchanges and other popular websites. Our findings underscore that the potential security risk posed by IDN homographs is significant. Any ultimate defense against this variant of Internet forgery will rely on Internet governance and security automation. It is to inform the need for such solutions that we offer the findings below.

IDNs, Unicode, Punycode: What you need to know if you don't know

Internationalized Domain Names in Applications (abbreviated IDNA; individual names are referred to as "IDNs") is a system for representing domain names that contain characters outside the ASCII repertoire.

For example:

Chinese:	百度.中国
Farsi:		تهران.ایران
Russian:	яндекс.рф

This system was implemented in part to bridge the digital divide between English-speaking and non-English speaking users of the Internet. It enables the registration of domain names utilizing character sets of users' native languages.

IDNs solve the problem of how to express domain names in languages that cannot be represented in ASCII. So how are they encoded?


Unicode is the standard for digital representation of the characters used in writing all of the world's languages. It provides a unique number for every character. This numerical value representing a Unicode character (e.g., U+03B1) is called a code point. The latest version of the Unicode standard contains 136,755 characters covering 139 modern and historic scripts.

Shown below are some Unicode characters and their respective code points (note that for ASCII values, the code points are identical to their common ASCII counterparts).

F:	U+0046
A:	U+0041
R:	U+0052
S:	U+0053
I:	U+0049
G:	U+0047
H:	U+0048
T:	U+0054
♥:	U+2665
✪:	U+272A
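
As a quick illustration, the code points above can be reproduced with Python's built-in ord() and the standard unicodedata module. This is only a minimal sketch; the helper name describe is ours and is not part of any standard.

```python
import unicodedata

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

# The characters from the table above: F A R S I G H T, a heart and a star.
for ch in "FARSIGHT\u2665\u272a":
    print(ch, describe(ch))
```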

IDNs are bound in form and structure to Unicode. An important distinction to make is that Unicode itself is technically not an actual encoding format; it is just a massive lookup table. How characters are actually encoded into bits is handled by mechanisms like the Unicode Transformation Format (UTF).
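
To make the distinction between a code point and its encoding concrete, the sketch below encodes the single code point U+03B1 under three UTF variants using Python's standard str.encode(). The variable names are ours.

```python
alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA, code point U+03B1

# One code point, three different byte sequences.
utf8 = alpha.encode("utf-8")       # variable width: two bytes here
utf16 = alpha.encode("utf-16-be")  # one 16-bit unit, big-endian
utf32 = alpha.encode("utf-32-be")  # one 32-bit unit, big-endian

print(utf8.hex(), utf16.hex(), utf32.hex())
```

The code point stays the same in every case; only the bytes that carry it differ.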


By now you may be ruminating on the fact that valid DNS names are (usually) limited to the case insensitive Letters, Digits, Hyphen (LDH) namespace, which in and of itself is completely unsuitable for representing anything other than ASCII names. It certainly cannot represent the often multi-byte Unicode character encodings, at least not without some serious help. Rather than attempting to expand the DNS alphabet, Internet engineers decided to use an ASCII Compatible Encoding (ACE) scheme to encode the Unicode data. This ACE is Punycode, a lossless method for converting Unicode into LDH ASCII. By convention, Punycode encoded IDNs will contain the xn-- prefix to herald the beginning of a Punycode encoded label.

Note that, as per section 3 of RFC 1034, "the DNS specifications attempt to be as general as possible in the rules for constructing domain names". As such, labels can technically consist of octets of any value. However, it is generally accepted (and in many places, enforced) that only the LDH syntax is allowed.
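
A minimal sketch of an LDH check follows. It assumes the conventional hostname rule (labels of 1 to 63 letters, digits, or hyphens, with no leading or trailing hyphen); the function name is_ldh_hostname is hypothetical.

```python
import re

# One LDH label: letters, digits, hyphens; 1-63 characters;
# no hyphen at the start or the end.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_hostname(name):
    """Return True if every label of the name fits the LDH convention."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a fully qualified trailing dot
    return all(LDH_LABEL.fullmatch(label) is not None
               for label in name.split("."))

print(is_ldh_hostname("farsightsecurity.com"))  # plain ASCII name
print(is_ldh_hostname("яндекс.рф"))             # Unicode labels are not LDH
```

Note that a Punycode-encoded label passes this check, which is exactly why an ACE was chosen over expanding the DNS alphabet.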

Example Unicode to Punycode conversion:

αβγδεζηθικλμνξοπρστυφχψω --> xn--mxacdefghijklmnopqr0btuvwxy

IDNs ultimately represent Unicode labels and may appear as such to the end user, but over the wire, they are sent encoded using Punycode.
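
The conversion above can be reproduced with the punycode codec that ships with Python. This sketch handles a single label and deliberately skips the normalization and validity checks that a full IDNA implementation performs; the helper names to_ace and from_ace are ours.

```python
# The 24 lowercase Greek letters alpha..omega, skipping final sigma (U+03C2).
greek = "".join(chr(c) for c in range(0x3B1, 0x3CA) if c != 0x3C2)

def to_ace(label):
    """Encode one Unicode label as 'xn--' plus its Punycode form."""
    if label.isascii():
        return label  # already LDH-compatible, nothing to encode
    return "xn--" + label.encode("punycode").decode("ascii")

def from_ace(label):
    """Decode one 'xn--' label back to Unicode; other labels pass through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label

print(greek, "-->", to_ace(greek))
```

Because Punycode is lossless, from_ace(to_ace(label)) returns the original label.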

IDN Homographs

It's no secret that different letters or characters might look very much alike. Sometimes this comes about with changes in case or font when rendering text in the same language or script. Perhaps best known is the resemblance of lowercase "l" to uppercase "I" or the visual similarity between the letter "O" and the digit "0", which gave rise to the "slashed zero." The slashed zero is an instance of homoglyphic confusion being resolved earlier than the invention of the printing press. Characters from different alphabets or scripts may also appear indistinguishable from one another to the human eye. Individually, these "confusables" are known as homoglyphs, but in the context of the words that contain them, they constitute homographs. In this document we refer to them as "homographs," a less popular but more precise term for our purposes.

For example, consider the following domain names:

fа
    This is "CYRILLIC SMALL LETTER A" (U+0430)

farsɩ
    This is a "LATIN SMALL LETTER IOTA" (U+0269)

farsı
    This is "LATIN SMALL LETTER DOTLESS I" (U+0131)

аррӏе.com
    These are all Cyrillic characters (see the link below for more on this one)

When displayed as Unicode, these look "normal" to the casual observer. While all of the above examples of homographic domains are benign, many others are out there, and they may not be.
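
One way to see past the display form is to list each character's code point and official Unicode name. The sketch below does that with the standard unicodedata module. Python's standard library does not expose the Unicode script property, so scripts_used falls back on the first word of each character name as a rough stand-in; that heuristic, the function names, and the sample string "ex\u0430mple" are our assumptions.

```python
import unicodedata

def reveal(label):
    """List 'U+XXXX NAME' for every character in a label."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
            for ch in label]

def scripts_used(label):
    """Heuristic: treat the first word of a letter's name as its script."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in label if ch.isalpha()}

lookalike = "\u0430\u0440\u0440\u04cf\u0435"  # all Cyrillic, displays like "apple"
mixed = "ex\u0430mple"                        # Latin with one Cyrillic letter

for line in reveal(lookalike):
    print(line)
print(scripts_used("apple"), scripts_used(lookalike), scripts_used(mixed))
```

A label that mixes scripts, or that is written entirely in a script the reader does not expect, deserves a closer look.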

These are not new attacks. They have been known for a long time, have been a hazard on the Internet in the past, and have recently returned to the news.

New Farsight Security Research on IDNs

Our research into imposter domains that rely on IDN homograph attacks sheds light on the potential security risks posed by these bad actors to organizations and consumers. By leveraging IDN homographs, cybercriminals can easily lure users to phishing websites that are pixel-perfect renditions of the brands they're impersonating – often completely undetected by today's defender solutions.

Farsight Security recently engineered its infrastructure to monitor for IDN homographs in real time. In the appendices below you will find samples from an observational period of approximately three months, from 2017-10-17 23:41 UTC to 2018-01-10 19:00 UTC. Specifically, we observed IDN homographs mimicking 125 top "phish-worthy" domains including large content providers, social networking giants, financial websites, luxury brands, cryptocurrency exchanges, and other popular websites. During this period we observed 116,113 homographs.
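
To give a feel for how a homograph might be matched to the brand it imitates, here is a minimal sketch. It is not Farsight's detection pipeline, which this article does not describe. It assumes a simple "skeleton" approach: decompose each character, drop combining marks, map a few known lookalikes to ASCII, and compare the result against a watch list. The CONFUSABLES table and the BRANDS list are tiny, hand-picked, and purely illustrative.

```python
import unicodedata

# Hand-picked lookalikes (illustrative only; real tables are far larger).
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u04cf": "l",  # CYRILLIC SMALL LETTER PALOCHKA
    "\u0131": "i",  # LATIN SMALL LETTER DOTLESS I
    "\u0269": "i",  # LATIN SMALL LETTER IOTA
    "\u0251": "a",  # LATIN SMALL LETTER ALPHA
}

BRANDS = {"apple", "facebook", "poloniex"}  # hypothetical watch list

def skeleton(label):
    """Reduce a label to a rough ASCII lookalike form."""
    decomposed = unicodedata.normalize("NFKD", label.lower())
    kept = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop accents, dots and other combining marks
        kept.append(CONFUSABLES.get(ch, ch))
    return "".join(kept)

def impersonated_brand(label):
    """Return the watched brand a non-ASCII label resembles, else None."""
    if label.isascii():
        return None  # the genuine ASCII name is not a homograph
    candidate = skeleton(label)
    return candidate if candidate in BRANDS else None

print(impersonated_brand("appl\u0117"))                      # accented e
print(impersonated_brand("\u0430\u0440\u0440\u04cf\u0435"))  # all Cyrillic
```

Any character missing from the table simply passes through unchanged, so a sketch like this will miss many real homographs.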

Appendix B contains a small, manually curated dataset containing organized lists of IDN homographs. Next are three specific samples from this dataset that appear to be live phishing sites, presumably stood up with the intent of hoodwinking unsuspecting users.

Except where noted, Farsight Security makes no assumption of intent against any of the following domains or domain owners. Also except where noted, no attempt was made to verify that the owner of a domain was illegitimate or that any possible content was malicious. Indictment and take down are not within our purview. Also, please note that in all cases, attempts were made to contact each organization in advance of this publication. Finally, the following domains, while not hyperlinked in the original release of this article, are not de-fanged. Please take care not to inadvertently load one in a browser.

Appendix A: Suspected Phishing Sites

To add timeliness and relevance to this article and demonstrate that the threat of IDN homograph impersonation is both real and actively being exploited, we present two examples of what appear to be phishing websites. On the dates listed below, the following IDNs were observed to be serving content that looked suspiciously similar to the mainstream brands they were (allegedly) impersonating. While we can't comment on the ownership or the intent of the following websites, should you decide to visit these websites, please take extreme care.


Poloniex is a large cryptocurrency exchange with $1.5B USD in daily transaction volume. On 2018-01-09, we observed an IDN (polonì) that was serving content that looked very similar to the real Poloniex website (Figure 1).

Figure 1: Screenshot of a suspected Poloniex phishing website

The website also featured a valid Comodo SSL certificate (Figure 2).

Figure 2: The SSL certificate of the suspected Poloniex phishing website

Ignoring the fact that this is an IDN homograph of a well-known US-based website, a user's first clue that something may be fishy here is the presumed misspelling of the phrase "Sign in" as "Sing in" (curiously, this typo occurs five times throughout the page). Additionally, frequent Poloniex users might recognize that the real Poloniex website does not immediately confront users with a "Sign in" or a "Sing in" page (it initially offers a welcome page encouraging new user sign-ups). Otherwise, the site is a reasonably good facsimile of the real Poloniex website that could easily bilk a user after deceiving them into making a login attempt.

No further attempt was made to investigate.


Facebook is the world's largest social networking platform, with over two billion monthly users. On 2018-01-09, we observed another IDN (www.ғасеьоок.com.) that was also serving content apparently intended to misdirect users into attempting to log in (Figure 3). Perhaps fortuitously, this website's Let's Encrypt SSL certificate (not pictured) expired in November 2017 so it did not have a green padlock connoting "safety".

At the same time, we also found that a mobile-optimized version of the website had been deployed (m.ғасеьоок.com.) (Figure 4).

Figure 3: Screenshot of a suspected Facebook phishing website

Figure 4: Screenshot of a suspected Facebook mobile phishing website

No further attempt was made to investigate either website.

Appendix B: Suspicious IDNs

The following are a subset of the IDNs we observed.

ADOBE
ns1.aɗ
ns2.aɗ
mail.adoḅ
adobė.com.
adoḅ
aď
ɑ

APPLE
mail.à
ns1.applẹ.com.
ns2.applẹ.com.
www.ɑƿƿ
www.â
www.âpplê.cf.
âpplê.cf.
www.appɩė.com.
apþ
applę.com.
applė.com.
www.ɑƿƿ

AMAZON
www.amazoṇ.com.
www.amazoṅ.com.
www.amȧ
www.â
www.â
www.á
www.ämäzö
amaẓ
amaź
amazoñ.com.
amazoṅ.com.
amà
amȧ

BANK OF AMERICA
www.baŋ
mail.bänkofämericä.com.
secure.baŋ
www.ƅ
www.baŋ
www.banĸ
www.bą
www.bɑnkofɑmericɑ.com.
www.bänkofämericä.com.
ƅ
baŋ
banĸ
bą
bɑnkofɑmericɑ.com.
bänkofämericä.com.

BITTREX
bitţ
bittṛ
bittrè
www.bitţ
www.bittrè
ƅ
www.ƅ
bíttŕē

CISCO
cı
cì
cí
сіѕсо.net.

COINBASE
cõ
cö
cò
cô
coiñ
coiṇ
coinbaş
coinbasè.com.
coinbasê.com.
coinbasė.com.
coinbä
coinbá
coì
coī
coî
coï
coı
ĉ
ç

CREDIT SUISSE
cré
cré
cré
cré
cré
cré
cré
cré
cré
cré
cré
cré
cré
credit-sü

EBAY
ê
ebá
ebà
ebɑ
ebâ
еьау.com.

FACEBOOK
www.ḟ
www.ƒ
www.ḟaceḃ
ḟ
ḟaceḃ
faċėbooķ.eu.
faċë
facėbooķ.eu.
facè
facê
facė
facebõ
faceboõ
facebò
facebô
facebooĸ.com.
facebooĸ.net.
faceḅ
façeboö
faċ
fącė
fà
fącebooķ.eu.
www.faċė
www.façė
www.faceboõ
www.facebö
www.facebò
www.faceboò
www.faceḅọ
www.faceƅ
www.faċ
www.fać
www.fā
m.ғасеьоок.com.
ғасеьоок.com.

GOOGLE
www.ǥooɡ
ww25.gơ
gòò
goοglе.com.
goô
gö
goö
gò
goơ
googlę.com.
ǵooglé.com.
ġ
ǥooɡ
www.gòòglè.com.
www.gòò
www.gò
www.googlę.com.

KRAKEN
кгакеп.com.
krå
ƙ
ķ

MICROSOFT
ww8.mı
www.mí
windows.mí
ww8.mı
www.ṃ
www.mıcɾ
www.mícrosofť.com.
www.mì
www.mì
www.mì
www.mì
www.mī
www.mí
www.mí
www.mí
www.mí
www.mị
www.mî
www.mı
www.mı
www.mï
www.microsó
www.microsȯ
www.microsoḟ
www.microsoƒ
www.micró
ṃ
mıcɾ
mícrosofť.com.
mì
mì
mì
mì
mī
mí
mí
mí
mị
mî
mı
mı
mï
micɾ
microș
microsó
microsȯ
microsofť.com.
microsofṭ.com.
microsoḟ
microsoƒ
micró

NETFLIX
ñ
www.netflì
ns1.nê
ns2.nê
ww1.ñ
ww35.ñ
ww8.ñ
www.ñ
www.netflì
www.netflí
www.netflí
www.netflį
www.netflï
www.netflı
www.netƒ
www.né
www.nė
www.nê
ñ
netflì
netflí
netflí
netflį
netflï
netflı
netƒ
né
nė
nê

NEW YORK TIMES
nytí
nytî
nytỉ
nytì
ñ

POLONIEX
polonì
ṗoloṇ
ṗ
pôloní
pó
polọ
poló
poloṇ
poloní
polonį
polonī
polonî
polonï
polonı
polonị
poloniẹ
poloniė
poł

TWITTER
www.twittè
www.twittê
www.twittë
www.twí
twî
twı
twìttè
тшіттея.com.

WALMART
wà
walmà
wä
wà
wá

WELLSFARGO
wellsfargơ.com.
wellsfargó.com.
wellsfargọ.com.
wellsfá

YAHOO
news.yahóó.es.
news.yahö
news.yahö
news.yahoö.biz.
news.yahò
news.yahó
news.yahoó.es.
news.yahoó.org.
news.yahöö.biz.
news.yahöö.info.
test.yahö
test.yahö
test.yahoö.biz.
test.yahoö.info.
test.yahó
test.yahoó.com.
test.yahoó.es.
test.yahoó.org.
test.yahoô.com.
test.yahöö.info.
wp.yahóó.org.
wp.yahö
wp.yahö
wp.yahoö.biz.
wp.yahoö.de.
wp.yahoö.info.
wp.yahò
wp.yahoó.org.
wp.yahoô.com.
ww8.yahoô.com.
www.yahóó.es.
www.yahóó.org.
www.yahö
www.yahö
www.yahö
www.yahoö.biz.
www.yahoö.info.
www.yahò
www.yahoơ.com.
www.yahoó.com.
www.yahoó.es.
www.yahȯ
www.yahöö.biz.
www.yahöö.info.
www.yaḣ
www.yaħ
ý
yahóó.es.
yahö
yahö
yahoö.info.
yahò
yahoơ.com.
yahó
yahoó.es.
yahoó.org.
yahȯ
yahoô.com.
yahöö.biz.
yahöö.info.
yà
yä
yähö
yähoö.info.

WIKIPEDIA
wiĸ
wikipè
wikipé
wikipé
wikipé
wikipedì
wikipedí
wikí
wıkipedı
wí
wï
wíkí
wıkı
wìkì

YANDEX
www.yandeẋ.com.
www.yanḋ
www.yɑ
yandeх
yą

YOUTUBE
yoű
youtubê.com.
youtuḇ
ww11.yoù
ww43.yoü
www.yoü
www.youtuḇ
www.youț
ý
www.ỳ
www.ý

MISC: LUXURY BRANDS
www.guccì.com.
guccì.com.
hermè
www.hermè
www.louí

MISC: SOCIAL PLATFORMS
ì
ı
www.ı
iņ
www.imguŕ.com.
imgú
whatsá
whɑtsɑ

Appendix C: Protections

The following sections discuss some measures you can take as an individual or an organization to address the threats posed by IDN homographs.

How Can I Protect Myself?

As with many threats targeting users on the Internet, there is no silver bullet to help protect yourself. Vigilance is key, and all the rules for spotting traditional phishing sites still apply to IDN phishing sites as well.

Some web browsers support add-ons or extensions geared toward flagging or outright blocking IDNs. If the majority of your web browsing keeps you within the realm of traditional LDH ASCII domain names, this may be an acceptable security mechanism for you.

The majority of phishing attempts still reach users via email. Regardless of the apparent sender, be extremely suspicious of any emails that include:

  • Distressing or enticing statements to provoke an immediate reaction, or statements that threaten consequences if you fail to respond.

  • Account login links, especially when combined with requests or demands to update or confirm your information.

Of course many legitimate emails contain links to additional information. Instead of clicking on these links, try copying and pasting them into your browser. This can limit the exposure to embedded links with malicious URLs.

When using a web browser, enable phishing filters or the safe browsing feature if available, and keep an eye on the browser address bar:

  • Any site that requests you enter a password nowadays should utilize encryption. This means the URL should begin with "https://" instead of "http://", and most browsers will display some type of green padlock symbol or green highlighting of the address bar.

  • If the "s" at the end of "https://" is missing, or the address bar shows some type of red or orange warning, do not enter your password; further investigation is needed.

  • Be cautious if the address changes unexpectedly or if after clicking on a link, you are taken to an unfamiliar address.

  • Be familiar with how your browser handles IDNs. Chrome has an official page here, with links to related information for other popular browsers at the bottom of that page.
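
Two of the address-bar checks above (is the scheme "https", and does the host contain IDN labels) can be sketched in a few lines of Python. The function name inspect_url and the shape of its result are our own choices, and the .example hostname is a placeholder. This is an aid for inspection only, not a phishing detector.

```python
from urllib.parse import urlsplit

def inspect_url(url):
    """Report whether a URL uses https and which host labels are IDNs."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    idn_labels = [label for label in host.split(".")
                  if label.startswith("xn--") or not label.isascii()]
    decoded = []
    for label in idn_labels:
        if label.startswith("xn--"):
            try:
                # Show what the Punycode label would display as.
                decoded.append(label[4:].encode("ascii").decode("punycode"))
            except UnicodeError:
                decoded.append(label)  # not valid Punycode, keep as-is
        else:
            decoded.append(label)
    return {"https": parts.scheme == "https",
            "idn_labels": idn_labels,
            "decoded": decoded}

print(inspect_url("http://xn--mxacdefghijklmnopqr0btuvwxy.example/login"))
```

Seeing both the xn-- form and the decoded form side by side makes it easier to notice when a familiar-looking name is not what it appears to be.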

Finally, for all of the websites that support it, make sure you enable two-factor authentication (2FA). If your credentials do get phished, having 2FA enabled can provide an extra layer of security that can both alert you to a compromise of your credentials and prevent an attacker from logging in to your account. Note that even with 2FA enabled your cellphone may become the weakest link in the security chain. If your phone is used as a back-up device for resetting passwords, make sure you protect your cellular account with a strong PIN (and hope the customer service agents are well trained enough to enforce its use for sensitive customer requests).

How Can I Protect My Organization?

If you operate a popular website that allows users to interact with one another, log in, purchase and/or download things, chances are your brand (and therefore your users!) will be on some target list for phishers and other Internet criminals.

You will want to pay attention to the IDN space, and either try to register IDN domain names proactively that could be used to impersonate your brand, or subscribe to a service that allows you to monitor recent IDN homograph registration and use in an attempt to impersonate your brand.
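
As a starting point for such monitoring or defensive registration, the sketch below enumerates single-character lookalike variants of a brand label and prints their xn-- forms. The LOOKALIKES table is a small, hand-picked assumption, and the function names are hypothetical. Whether a given registry would accept any particular variant is outside the scope of this sketch.

```python
# Hand-picked lookalikes per ASCII letter (illustrative, far from complete).
LOOKALIKES = {
    "a": ["\u0430", "\u0251"],  # Cyrillic a, Latin alpha
    "e": ["\u0435", "\u0117"],  # Cyrillic ie, e with dot above
    "i": ["\u0131", "\u0269"],  # dotless i, Latin iota
    "o": ["\u043e"],            # Cyrillic o
}

def single_substitutions(brand):
    """Yield variants of a label with exactly one character swapped."""
    for i, ch in enumerate(brand):
        for alt in LOOKALIKES.get(ch, []):
            yield brand[:i] + alt + brand[i + 1:]

def watchlist(brand):
    """Return the sorted xn-- forms of all single-substitution variants."""
    return sorted("xn--" + variant.encode("punycode").decode("ascii")
                  for variant in single_substitutions(brand))

for ace in watchlist("ebay"):
    print(ace)
```

Real homographs often swap several characters at once, so a production watch list would need a much larger table and multi-character combinations.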

Updated: Thu Mar 15 16:13:12 UTC 2018, Special thanks to John C. Klensin for pointing out my colloquial misnomer about "rendering Unicode" (fonts are rendered, Unicode is displayed) and the clarification that technically any octet is allowed in a DNS label.

Technical credit for the research and development underlying the data referenced in this blog post is also shared with Stephen Watt.

Mike Schiffman is a Board Certified Vim Syntax Highlight Color Specialist for Farsight Security, Inc.