Newly Observed Hostnames
By Joe St. Sauver
In an environment as large and complex as the Internet, it's difficult to keep track of things that are new.
Today, we're going to talk about something that might seem almost impossible, and that's keeping track of new Internet-visible hostnames – on a hostname-by-hostname basis – Internet-wide.
Before we do that, however, let's quickly recap an earlier Farsight Security, Inc. (FSI) product, Newly Observed Domains, or "NOD," so you'll have a foundation for understanding both it and our new product.
Newly Observed DOMAINS
A year ago, FSI announced our highly popular Newly Observed Domains (NOD) product. That channel consists of a near-real-time stream of newly observed 2nd-level domain names ("com" is a top-level domain, and "example.com" would be considered a "2nd-level domain name").
By monitoring that channel, subscribers could learn about new 2nd-level domain names virtually as soon as they began to be used. That visibility was (and still is) useful for a variety of purposes including but not limited to:
- automatically blocking spam and potentially-malicious newly-created domains for a number of hours (e.g., until domain name reputation systems have had a chance to catch up),
- brand protection work,
- corporate intelligence,
- detection of DGA-related botnet activity, and
- search engine seeding
Some may have (incorrectly) assumed that NOD was merely an aggregation of publicly-available Zone File Access program (ZFA) data. It's not. The core of NOD consists of data from actual live monitoring performed by FSI's worldwide network of over 500 sensor nodes.
Thus, NOD includes results for newly observed domains from TLDs which don't offer ZFA programs at all, and from 2nd-level domains created above effective-TLDs (as defined by the Mozilla Public Suffix List), as well as newly observed domains from conventional gTLDs and ccTLDs. It's truly a unique resource, and an impressive accomplishment in its own right. That said, we hope you'll agree that our new product is even cooler.
Newly Observed HOSTNAMES
Earlier this month at BlackHat Las Vegas – one of the world's largest and longest-running system and network security conferences – FSI announced our Newly Observed HOSTNAMES (NOH) channel.
Newly Observed Hostnames complements our existing Newly Observed Domains channel by "zooming in" and tracking the creation of individual Internet-visible hostnames (i.e., "fully qualified domain names" or FQDNs) on a hostname-by-hostname basis worldwide.
Conceptually, you can think of NOH as taking a raw stream of DNS hostnames (as seen at Farsight's Security Information Exchange (SIE)), and then screening each of those hostnames against our Passive DNS database (DNSDB), reporting only those that haven't previously been seen by any of our over 500 sensor nodes.
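To make that "screen against what we've already seen" idea concrete, here is a minimal Python sketch. It is only a conceptual illustration, not how Farsight's pipeline is actually built: the in-memory set stands in for the Passive DNS history, and the function names and sample hostnames are assumptions made up for this example.

```python
from typing import Iterable, Iterator, Set


def normalize(hostname: str) -> str:
    """Lower-case a hostname and drop the trailing root dot, so that
    'WWW.Example.com.' and 'www.example.com' compare as equal."""
    return hostname.strip().rstrip(".").lower()


def newly_observed(stream: Iterable[str], seen: Set[str]) -> Iterator[str]:
    """Yield each hostname the first time it appears.

    `seen` is a stand-in for the historical database; it is updated in
    place so that a repeat of the same name is never reported twice.
    """
    for raw in stream:
        name = normalize(raw)
        if name and name not in seen:
            seen.add(name)
            yield name


# Hypothetical history and stream, for illustration only.
history = {"www.example.com"}
observed = [
    "www.example.com.",   # already known, so suppressed
    "new1.example.com.",  # first sighting, so reported
    "NEW1.example.com.",  # same name in different case, so suppressed
    "new2.example.net.",  # first sighting, so reported
]
print(list(newly_observed(observed, history)))
# prints ['new1.example.com', 'new2.example.net']
```

A real deployment would need persistent storage and would have to cope with a far larger history than fits comfortably in a Python set, but the filtering logic is the same.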
A typical Newly Observed Hostname entry (in presentation format) looks like:
domain: mutualofomaha.com.           <-- base domain
time_seen: 2015-08-12 17:10:14
bailiwick: mutualofomaha.com.
rrname: bgzpzftw.mutualofomaha.com.  <-- full hostname
rrclass: IN (1)
rrtype: A (1)
rdata: 220.127.116.11
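If you want to work with entries like this in your own code, a small parser is often the first step. The following Python sketch assumes each field arrives on its own line as "key: value", and that the "<--" notes are annotations added for this article rather than part of the data. The function name is our own invention for illustration.

```python
from typing import Dict

# The sample entry from this article, one field per line.
SAMPLE = """\
domain: mutualofomaha.com.           <-- base domain
time_seen: 2015-08-12 17:10:14
bailiwick: mutualofomaha.com.
rrname: bgzpzftw.mutualofomaha.com.  <-- full hostname
rrclass: IN (1)
rrtype: A (1)
rdata: 220.127.116.11
"""


def parse_record(text: str) -> Dict[str, str]:
    """Turn 'key: value' lines into a dict, ignoring '<--' annotations."""
    record: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("<--", 1)[0].strip()  # drop any trailing note
        if ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so timestamps stay intact.
        key, value = line.split(":", 1)
        record[key.strip()] = value.strip()
    return record


record = parse_record(SAMPLE)
print(record["rrname"])     # prints bgzpzftw.mutualofomaha.com.
print(record["time_seen"])  # prints 2015-08-12 17:10:14
```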
Averaged over the day, typically there will be about ninety Newly Observed Hostnames per second (versus just one or two Newly Observed Domains per second). Newly Observed Hostnames typically consumes about 100 Kbps of network bandwidth. An average newly observed hostname is published about 150 seconds from the time of its first observation.
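For capacity planning, it can help to turn those averages into daily totals. The back-of-the-envelope Python below assumes "Kbps" means 1,000 bits per second and that the averages hold evenly across the day, which real traffic will not do exactly; treat the results as rough estimates only.

```python
# Figures quoted above (daily averages).
HOSTNAMES_PER_SECOND = 90
BANDWIDTH_BITS_PER_SECOND = 100_000  # 100 Kbps, assuming decimal kilobits

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Roughly how many new hostnames to expect per day.
hostnames_per_day = HOSTNAMES_PER_SECOND * SECONDS_PER_DAY

# Roughly how many bytes each published hostname accounts for on the wire.
bytes_per_second = BANDWIDTH_BITS_PER_SECOND / 8
bytes_per_hostname = bytes_per_second / HOSTNAMES_PER_SECOND

print(f"{hostnames_per_day:,} hostnames per day")      # 7,776,000
print(f"about {bytes_per_hostname:.0f} bytes each")    # about 139
```

In other words, on the order of eight million new hostnames a day, at something like 140 bytes apiece.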
So now that you know what NOH is, let's look at some use-cases.
The Need for Newly Observed Hostnames: PHISHING
Some may wonder why anyone might want or need a feed of Newly Observed Hostnames. To understand that, it may help to think about the sort of domain names that routinely show up in conjunction with abuse. For example, consider phishing domains.
PhishTank is one of several sites that lists user-reported phishing URLs. When I checked that site on August 11th, 2015 it included a variety of apparently PayPal-related URLs such as the following (de-fanged here to prevent anyone from accidentally visiting these URLs, and to keep this article from potentially making any domain reputation systems twitch):
Note that those domains are the sort of domains that could be detected from a NOD feed: the potential phishing-related signature content, bolded above, is an integral part of the 2nd-level domain name (which is what you'd see in NOD).
But now consider some other domain names, also apparently PayPal-related, also from PhishTank:
If we reduce those URLs to just their 2nd-level domains (e.g., just the bit that's shown in bold above), there's nothing inherently suspicious about those 2nd-level domain names. They just appear to be regular domains.
It is only when we have the ability to see the full hostname, as you can in Farsight's Newly Observed HOSTNAMES, that we can see potential phishing-related patterns that would likely trigger further review and action by anti-phishing specialists.
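The difference between the two views can be sketched in a few lines of Python. This is a deliberately naive illustration: it treats the last two labels as the 2nd-level domain, which ignores effective TLDs from the Public Suffix List, and the hostnames used are hypothetical names under reserved example domains, not real phishing sites.

```python
from typing import Optional, Tuple


def split_hostname(hostname: str) -> Tuple[str, str]:
    """Return (subdomain labels, 2nd-level domain) for a hostname.

    Naive assumption: the 2nd-level domain is simply the last two labels.
    """
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[:-2]), ".".join(labels[-2:])


def keyword_location(hostname: str, keyword: str) -> Optional[str]:
    """Say where a keyword shows up in a hostname, if anywhere."""
    sub, base = split_hostname(hostname)
    keyword = keyword.lower()
    if keyword in base:
        return "2nd-level domain"       # visible in a NOD-style feed
    if keyword in sub:
        return "subdomain labels only"  # visible only with full hostnames
    return None


# Hypothetical hostnames, for illustration only.
for name in ("paypal-update.example.",
             "paypal.login.example.com.",
             "www.example.org."):
    print(name, "->", keyword_location(name, "paypal"))
```

The second hostname is the interesting case: reduced to its 2nd-level domain it looks unremarkable, and the keyword only surfaces when the full hostname is available.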
Rapid identification of suspicious hostnames, as enabled by Newly Observed Hostnames, translates to quicker takedowns, lower levels of phished accounts and financial losses, and thus happier banks, payment card companies, and other financial businesses – to say nothing of their customers.
The Need for Newly Observed Hostnames: BRAND PROTECTION
Another example of how critical it can be to have visibility into complete hostnames (rather than "merely" 2nd-level domain names) can be seen in the brand enforcement area. Assume, for example, that you're interested in any/all domains that include either the trademarked name "rolex" or the trademarked name "gucci".
One easy way to watch the Newly Observed Hostnames for hosts containing either of those words is to rent a blade server from Farsight at the Security Information Exchange, subscribing to the Newly Observed Hostnames channel. Once you've done that, you can simply say:
$ nmsgtool -C ch213 | grep "rolex\|gucci"
The first part of that command pipeline, nmsgtool -C ch213, will get traffic from the Newly Observed Hostname channel (Channel 213), while the second half of that command, grep "rolex\|gucci", will match and print any records with either of our matching strings of interest, even if those strings appear buried in the middle of a hostname.
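If you would rather do the matching in a script than in grep (for example, so you can later add logging or alerting), the same filter can be sketched in Python. The function name and sample lines below are assumptions for illustration. Note one deliberate difference from the grep command above: this version matches case-insensitively, since DNS names are not case-sensitive.

```python
import re
from typing import Iterable, Iterator, Pattern

# The same two strings of interest as the grep example.
BRAND_PATTERN = re.compile(r"rolex|gucci", re.IGNORECASE)


def matching_lines(lines: Iterable[str],
                   pattern: Pattern[str] = BRAND_PATTERN) -> Iterator[str]:
    """Yield every line containing a match, even mid-hostname."""
    for line in lines:
        if pattern.search(line):
            yield line.rstrip("\n")


# Hypothetical input lines, for illustration only.
sample = [
    "rrname: cheap-rolex-watches.shop.example.\n",
    "rrname: www.example.com.\n",
    "rrname: GUCCI-outlet.example.net.\n",
]
for hit in matching_lines(sample):
    print(hit)
```

To use this in place of grep, you could pass sys.stdin to matching_lines() and pipe the output of nmsgtool -C ch213 into the script.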
Rather not work on a blade server at SIE? You can also use SIE Remote Access ("SRA") to securely tunnel ch213 traffic back to your own location, if that's more convenient for your needs.
We can also deliver rolling hourly snapshots of NOH in CSV format as another option.
These simple approaches to mining Newly Observed Hostnames may be all that customers need or want to process that data stream. Other Newly Observed Hostname customers may want to take advantage of our API to tightly integrate FSI's Newly Observed Hostname data with their own code.
The Need for Newly Observed Hostnames: IDENTIFICATION OF WILDCARDED DOMAIN NAMES
Another phenomenon that you can easily identify in our Newly Observed Hostnames channel is wildcarded domain names. Wildcarded domain names can be used for many different purposes, ranging from marketers attempting to track individual responses to one of their solicitations, to less savory types trying to "stay under the radar" by avoiding any single hostname showing up as running "too hot."
Of course, if you're watching Newly Observed Hostnames, wildcarded domain names represent a phenomenon that's pretty hard to miss.
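One simple way to spot candidates is to count how many distinct new hostnames appear under each base domain within a sample, and flag the base domains that exceed some threshold. The Python sketch below is one possible approach, not a description of any Farsight tooling: the threshold is arbitrary, the "last two labels" rule for the base domain ignores effective TLDs, and the hostnames are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def base_domain(hostname: str) -> str:
    """Naive base domain: the last two labels of the hostname."""
    return ".".join(hostname.rstrip(".").lower().split(".")[-2:])


def wildcard_candidates(hostnames: Iterable[str],
                        threshold: int) -> Dict[str, int]:
    """Map each base domain with at least `threshold` distinct new
    hostnames in the sample to its distinct-hostname count."""
    names_by_base: Dict[str, Set[str]] = defaultdict(set)
    for hostname in hostnames:
        name = hostname.rstrip(".").lower()
        names_by_base[base_domain(name)].add(name)
    return {base: len(names)
            for base, names in names_by_base.items()
            if len(names) >= threshold}


# Hypothetical sample: one noisy domain and one quiet one.
sample = [f"x{i:04d}.wild.example." for i in range(25)]
sample += ["www.example.com.", "mail.example.com."]
print(wildcard_candidates(sample, threshold=10))
# prints {'wild.example': 25}
```

A high count is only a hint. Large, legitimate services can also generate many new hostnames, so any flagged domain still needs a closer look.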
For example, taking a 50,000-observation sample of Newly Observed Hostnames (less than 10 minutes' worth of data), we can easily find hostnames such as the following (note that the hostnames shown here are just a small subset of all matching hostnames, and the names shown have been de-fanged, partially redacted, and are shown here in reversed, easily sorted format):
If we use dig to manually test other arbitrary potential hostnames in the kddflk[dot]com domain, we see that the nameservers for that domain will "answer" for any arbitrary hostname we specify for that domain. This is behavior consistent with a wildcarded domain.
kddflk[dot]com resolves to
Checking Passive DNS for 158[dot]58[dot]173[dot]5, we see that over a kddflk[dot]com wildcard domain names exist on that IP alone.
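The manual dig test described above can also be automated: ask for a few random, almost certainly nonexistent hostnames under a domain, and see whether they all resolve. The following Python sketch is one way to do that under some stated assumptions. The function names are ours, the resolver is passed in as a parameter so the logic can be tried without touching the network, and the demonstration uses a fake resolver and a reserved example domain.

```python
import secrets
import socket
from typing import Callable, Optional


def system_resolver(hostname: str) -> Optional[str]:
    """Resolve via the operating system; return None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # includes socket.gaierror (e.g., NXDOMAIN)
        return None


def looks_wildcarded(domain: str,
                     resolve: Callable[[str], Optional[str]] = system_resolver,
                     probes: int = 3) -> bool:
    """Return True if several random labels under `domain` all resolve."""
    for _ in range(probes):
        label = secrets.token_hex(8)  # 16 random hex characters
        if resolve(f"{label}.{domain}") is None:
            return False  # one miss is enough to rule out a wildcard
    return True


# Demonstration with fake resolvers (no network traffic involved).
def answers_everything(hostname: str) -> Optional[str]:
    return "192.0.2.1"


def answers_nothing(hostname: str) -> Optional[str]:
    return None


print(looks_wildcarded("wild.example", resolve=answers_everything))  # True
print(looks_wildcarded("quiet.example", resolve=answers_nothing))    # False
```

One caveat if you run this against live DNS: some recursive resolvers rewrite "no such name" responses into an answer of their own, which would make every domain look wildcarded. Testing a known non-wildcarded domain first is a sensible sanity check.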
If we further check kddflk[dot]com, we see that:
- The domain name's whois point of contact information is hidden behind a privacy protection service (not a "smoking gun" in and of itself, but an often-relevant indicator)
- It's listed on the SURBL domain block list,
- It has a negative reputation assigned by MyWOT, and
- It's listed on Fortiguard as having had multiple malware infections over a period of years.
Our point in drawing your attention to a wildcarded domain of this sort is that wildcarded domains are often (although certainly not always) an example of a phenomenon of interest to security staff.
With a subscription to Farsight Security's Newly Observed HOSTNAMES channel, you can easily identify wildcarded domain names.
The Need for Newly Observed Hostnames: ENTERPRISE/AGENCY SECURITY AWARENESS
Most of the Newly Observed Hostnames that Farsight sees are fully qualified domain names that are intentionally made available to the Internet.
Occasionally, however, an enterprise or government agency might have internal hosts that have "inadvertently" become Internet visible.
These intranet-only hosts may then end up in FSI's Newly Observed Hostnames channel.
Simply knowing that those hostnames (hostnames which were never meant to "see the light of the Internet") are now being publicly resolved can be a real warning bell/wake up call that a review and corrective reconfiguration may be needed ASAP.
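As a rough illustration of how an organization might watch for this, the Python sketch below checks each newly observed hostname against two simple tests: is it under our own domain but missing from our list of intentionally public names, and does it point at a private (RFC 1918 style) address? Everything here is an assumption made for the example, including the function name, the idea of a maintained "public inventory," and the sample hostnames.

```python
import ipaddress
from typing import List, Set


def exposure_reasons(rrname: str, rdata: str,
                     org_domain: str, public_names: Set[str]) -> List[str]:
    """Return reasons a hostname under `org_domain` deserves review.

    An empty list means the name is outside our domain, or nothing
    about it looks out of place.
    """
    name = rrname.rstrip(".").lower()
    org = org_domain.rstrip(".").lower()
    if name != org and not name.endswith("." + org):
        return []  # not one of ours

    reasons = []
    if name not in public_names:
        reasons.append("not in public inventory")
    try:
        if ipaddress.ip_address(rdata).is_private:
            reasons.append("resolves to private address")
    except ValueError:
        pass  # rdata is not an IP address (e.g., a CNAME target)
    return reasons


# Hypothetical inventory and observation, for illustration only.
inventory = {"www.example.com", "mail.example.com"}
print(exposure_reasons("build01.corp.example.com.", "10.1.2.3",
                       "example.com", inventory))
# prints ['not in public inventory', 'resolves to private address']
```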
The Need for Newly Observed Hostnames: ACADEMIC RESEARCH
Another example of a niche for Newly Observed Hostnames is in the academic research area, particularly labs that may be focused on measuring Internet-related phenomena such as adoption of new gTLDs, uptake of IPv6, Internet mapping, etc.
Getting More Information About Newly Observed Hostnames
Joe St. Sauver, Ph.D. is a Scientist with Farsight Security, Inc.