Bulk Converting Internationalized Domain Names to Punycode With Perl and Net::IDN::Encode for Use With DNSDB



I. Introduction

Internationalized Domain Names (IDNs) have generated a lot of excitement among those whose native languages use non-Latin scripts. Unfortunately, cybercriminals have also embraced IDNs, as documented in Farsight Security Global Internationalized Domain Name Homograph Report, the excellent report by Farsight's own Mike Schiffman.

Browsers such as Firefox and Chrome accept domain names either in native IDN characters or in punycoded form, and convert between the two automatically.

DNSDB, Farsight's passive DNS system, however, normally expects to see punycoded domain names.

This means that if you look up a sample domain such as регнум.рф in DNSDB with dnsdbq using its native characters, nothing will be found:

$ dnsdbq -r регнум.рф/A -S -k last
libcurl: 404 (https://api.dnsdb.info/lookup/rrset/name/%D1%80%D0%B5%D0%B3%D0%BD%D1%83%D0%BC.%D1%80%D1%84/A)
please note: 404 usually just means that no records matched the search criteria
API: Error: no results found for query.

However, if you try again with the punycoded version of that name, you will find hits:

$ dnsdbq -r xn--c1adwdmv.xn--p1ai/A -S -k last
;; record times: 2015-09-02 11:49:23 .. 2018-08-03 03:18:58
;; count: 36891; bailiwick: xn--c1adwdmv.xn--p1ai.
xn--c1adwdmv.xn--p1ai.  A

;; record times: 2015-08-20 05:54:09 .. 2015-09-02 10:14:58
;; count: 116; bailiwick: xn--c1adwdmv.xn--p1ai.
xn--c1adwdmv.xn--p1ai.  A

Clearly, if you're querying DNSDB for internationalized domain names, you'll want to be sure you're using the punycoded form of each name you're investigating.

There are many websites that will convert onesie-twosie IDNs to punycode for you (such as Punycoder).

You can also use the idn command (see GNU IDN Library):

$ idn --quiet регнум.рф

However, if you have a large file full of internationalized domain names that you'd like to convert locally to punycode, perl with the Net::IDN::Encode module handles the job easily.

II. Converting Internationalized Domain Names to Punycode With perl and Net::IDN::Encode

Programmatically converting domain names can be done with many different languages and libraries, but for this example, let's use perl with Net::IDN::Encode. If you need to install that module and you have cpanm available, try:

# cpanm Net::IDN::Encode

The required code is quite short:

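$ cat convert-idn-to-punycode.pl
#!/usr/bin/perl
use strict;
use warnings;
use Net::IDN::Encode ':all';
# Decode STDIN and encode STDOUT as UTF-8
use open ':std', ':encoding(UTF-8)';
# Convert one domain name per input line
while ( my $line = <STDIN> ) {
   chomp ( $line );
   my $ascii = domain_to_ascii( $line );
   print "$ascii\n";
}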

Be sure that script has the correct permissions to be executed:

$ chmod a+rx convert-idn-to-punycode.pl

To test the converter, create a file called test.txt containing the internationalized domain names you want to convert, one per line, in UTF-8:

$ cat test.txt

Now run that file through the converter:

$ ./convert-idn-to-punycode.pl < test.txt

We can cross-check the output from that program against what we see from the idn command:

$ idn --quiet < test.txt

The output looks consistent.
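
For a large real-world file, one caveat is worth planning for: as far as I know, domain_to_ascii raises an exception on input it cannot convert, so a single malformed line would end the run of the short script above. The following is a minimal sketch of a more forgiving variant, not a definitive implementation. The helper name convert_name and the script name used in the usage line are hypothetical; only domain_to_ascii comes from Net::IDN::Encode.

```perl
#!/usr/bin/perl
# Sketch: like convert-idn-to-punycode.pl, but a bad line does not stop the run.
use strict;
use warnings;
use Net::IDN::Encode ':all';
use open ':std', ':encoding(UTF-8)';   # treat STDIN/STDOUT/STDERR as UTF-8

# convert_name is a hypothetical helper name, not part of Net::IDN::Encode.
# Returns the punycoded name, or undef if domain_to_ascii raised an exception.
sub convert_name {
    my ($name) = @_;
    my $ascii = eval { domain_to_ascii($name) };
    return $ascii;
}

# Read only when input is piped or redirected, not from a terminal.
unless ( -t STDIN ) {
    while ( my $line = <STDIN> ) {
        chomp $line;
        next if $line =~ /^\s*$/;       # ignore blank lines
        my $ascii = convert_name($line);
        if ( defined $ascii ) {
            print "$ascii\n";
        }
        else {
            # Report the rejected line on STDERR and keep going
            warn "line $.: could not convert '$line'\n";
        }
    }
}
```

Saved under a name of your choosing (convert-idn-safely.pl here is only an example), it could be run as ./convert-idn-safely.pl < test.txt > punycode.txt 2> rejects.txt, which keeps the converted names and the rejected lines in separate files.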

The output from the little perl script (or from the idn command) will give us what we need to search for those domains in DNSDB. For example:

$ dnsdbq -r xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c/A -S -k last
;; record times: 2018-03-27 01:05:53 .. 2018-07-16 23:26:59
;; count: 19; bailiwick: xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.
xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.  A

;; record times: 2017-12-21 12:48:43 .. 2018-01-26 08:49:41
;; count: 7; bailiwick: xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.
xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.  A
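
When the converted list is long, running dnsdbq by hand for each name gets tedious. The sketch below shows one way the lookups might be scripted, under some assumptions: dnsdbq is on your PATH and already configured as in the examples above, and the flags are simply the ones used in this article (-r name/A -S -k last). The helper name dnsdbq_args and the script name in the usage line are hypothetical.

```perl
#!/usr/bin/perl
# Sketch: run dnsdbq once for each punycoded name read from STDIN.
use strict;
use warnings;

$| = 1;   # flush our own output before dnsdbq writes its results

# dnsdbq_args is a hypothetical helper. It builds the same command line used
# in the examples above: dnsdbq -r <name>/<rrtype> -S -k last
sub dnsdbq_args {
    my ( $name, $rrtype ) = @_;
    $rrtype = 'A' unless defined $rrtype;
    return ( 'dnsdbq', '-r', "$name/$rrtype", '-S', '-k', 'last' );
}

# Read only when input is piped or redirected, not from a terminal.
unless ( -t STDIN ) {
    while ( my $name = <STDIN> ) {
        chomp $name;
        next if $name eq '';
        # Assumption: input is already punycoded, so anything outside plain
        # ASCII hostname characters is treated as an upstream error.
        unless ( $name =~ /^[A-Za-z0-9._-]+$/ ) {
            warn "skipping '$name': not a plain ASCII name\n";
            next;
        }
        print ";; ===== $name =====\n";
        # List-form system() bypasses the shell, so the name is passed as-is.
        system( dnsdbq_args($name) ) == 0
            or warn "dnsdbq exited non-zero for $name\n";
    }
}
```

Chained with the converter, that might look like ./convert-idn-to-punycode.pl < test.txt | ./lookup-in-dnsdb.pl (the second script name is just an example). Each name results in a separate DNSDB query.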

III. Conclusion

We hope that you find this little script to be a useful tool as you find yourself working with IDNs in DNSDB.

To get started with Farsight Security's products, please visit here.

Joe St Sauver, Ph.D., is a Distinguished Scientist for Farsight Security, Inc.