Bulk Converting Internationalized Domain Names to Punycode With Perl and Net::IDN::Encode for Use With DNSDB
By Joe St Sauver
There has been a lot of excitement around Internationalized Domain Names (IDNs) among those whose native languages are built on non-Latin character sets. Unfortunately, cybercriminals have also embraced IDNs, as documented in the excellent report by Farsight's own Mike Schiffman, Farsight Security Global Internationalized Domain Name Homograph Report.
However, DNSDB, Farsight's passive DNS system, normally expects to see punycoded domain names.
This means that if you try looking up the sample random domain регнум.рф in DNSDB with dnsdbq, nothing will be found:
$ dnsdbq -r регнум.рф/A -S -k last
libcurl: 404 (https://api.dnsdb.info/lookup/rrset/name/%D1%80%D0%B5%D0%B3%D0%BD%D1%83%D0%BC.%D1%80%D1%84/A)
please note: 404 usually just means that no records matched the search criteria
API: Error: no results found for query.
However, if you try again with the punycoded version of that name, you will find hits:
$ dnsdbq -r xn--c1adwdmv.xn--p1ai/A -S -k last
;; record times: 2015-09-02 11:49:23 .. 2018-08-03 03:18:58
;; count: 36891; bailiwick: xn--c1adwdmv.xn--p1ai.
xn--c1adwdmv.xn--p1ai.  A  126.96.36.199

;; record times: 2015-08-20 05:54:09 .. 2015-09-02 10:14:58
;; count: 116; bailiwick: xn--c1adwdmv.xn--p1ai.
xn--c1adwdmv.xn--p1ai.  A  188.8.131.52
[etc]
Clearly, if you're querying DNSDB for internationalized domain names, you'll want to be sure you're using the punycoded format of any internationalized domain names you're investigating.
There are many websites that will convert onesie-twosie IDNs to punycode for you (such as Punycoder).
You can also use the idn command (see GNU IDN Library):
$ idn --quiet регнум.рф
xn--c1adwdmv.xn--p1ai
However, if you have a large file full of internationalized domain names that you'd like to convert locally to punycode, it can also be interesting to see how easily perl can be used with the Net::IDN::Encode module for this purpose.
II. Converting Internationalized Domain Names to Punycode With perl and Net::IDN::Encode
Programmatically converting domain names can be done with many different languages and libraries, but for this example, let's use perl with Net::IDN::Encode. If you need to install that module and you have cpanm available, try:
# cpanm Net::IDN::Encode
The required code is quite short:
$ cat convert-idn-to-punycode.pl
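The listing itself is not reproduced here, so the following is a minimal sketch of what such a converter could look like rather than the original script. It assumes the input arrives on standard input as UTF-8 text with one domain name per line, and it relies on the domain_to_ascii function exported by Net::IDN::Encode. The whitespace trimming, blank-line skipping, and error handling are illustrative choices.

```perl
#!/usr/bin/perl
# Sketch only: read UTF-8 domain names from STDIN, one per line,
# and print the punycoded (ASCII) form of each name.
use strict;
use warnings;

# Decode STDIN as UTF-8 and encode STDOUT/STDERR as UTF-8, so the
# module receives character strings rather than raw bytes.
use open qw(:std :encoding(UTF-8));

use Net::IDN::Encode qw(domain_to_ascii);

while (my $name = <STDIN>) {
    $name =~ s/^\s+|\s+$//g;    # trim whitespace, including the newline
    next if $name eq '';        # skip blank lines

    # domain_to_ascii() may throw an exception on invalid input,
    # so trap it and keep going with the rest of the file.
    my $ascii = eval { domain_to_ascii($name) };
    if (defined $ascii) {
        print "$ascii\n";
    } else {
        warn "skipping '$name': $@";
    }
}
```

Names that are already plain ASCII should pass through unchanged, so a mixed file of IDNs and traditional domain names can be fed to a script like this as-is.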
Be sure that script has the correct permissions to be executed:
$ chmod a+rx convert-idn-to-punycode.pl
To test the converter, create a file called test.txt containing the internationalized names you want to convert, UTF-8 encoded, one per line:
$ cat test.txt
割草机.企业
แฟชั่น.ไทย
달인식자재마트.닷컴
Now run that file through the converter:
$ ./convert-idn-to-punycode.pl < test.txt
xn--ner991cxpu.xn--vhquv
xn--b3c4aq7eud0b.xn--o3cw4h
xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c
We can cross-check the output from that program against what we see from the idn command:
$ idn --quiet < test.txt
xn--ner991cxpu.xn--vhquv
xn--b3c4aq7eud0b.xn--o3cw4h
xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c
The two outputs match.
The output from the little perl script (or from the idn command) will give us what we need to search for those domains in DNSDB. For example:
$ dnsdbq -r xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c/A -S -k last
;; record times: 2018-03-27 01:05:53 .. 2018-07-16 23:26:59
;; count: 19; bailiwick: xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.
xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.  A  184.108.40.206

;; record times: 2017-12-21 12:48:43 .. 2018-01-26 08:49:41
;; count: 7; bailiwick: xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.
xn--bk1b33kjpiz0eqd4gw50f.xn--mk1bu44c.  A  220.127.116.11
We hope that you find this little script to be a useful tool as you find yourself working with IDNs in DNSDB.
Joe St Sauver Ph.D. is a Distinguished Scientist for Farsight Security, Inc.