The Importance of Scale: A Critical Dimension To Keep In Mind When Benchmarking Passive DNS Options
By Joe St. Sauver
How Can I Benchmark Thee? Let Me Count The Ways
When evaluating or benchmarking providers of passive DNS service, we've seen people employ or describe a variety of different measures of interest, including:
- the number of responses received for test queries (more being better)
- the speed with which test queries get answered (faster being better)
- the usability of the passive DNS interface (streamlined, intuitive, adaptive, self-teaching, etc.)
- the breadth of DNS record types (TXT, etc.) that are available
- the ability to time fence, limit by bailiwick, etc.
- and many more factors.
Maximum Number of Returned Results
One factor that does not get the attention it perhaps deserves is the ability of a passive DNS service to deliver results at scale. Scale MATTERS.
That is, if a given query results in a million hits, and you want or need that many, can you get them? Or is your output arbitrarily limited to just some subset of all known results, such as perhaps the first thousand or ten thousand? If you're limited to some low value, you may be facing big problems.
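One way to picture "getting them all" is a client that keeps paging until the service runs out of results. The sketch below is illustrative only: `fetch_page` is a hypothetical callable standing in for whatever offset/limit mechanism a given passive DNS provider may (or may not) offer, and the fake data source exists purely so the example runs.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int, int], List[str]],
              page_size: int = 1000) -> List[str]:
    """Collect every result by paging until a short page signals the end.

    fetch_page(offset, limit) is a hypothetical provider hook, not a real API.
    """
    results: List[str] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        results.extend(page)
        if len(page) < page_size:  # a short (or empty) page means nothing is left
            break
        offset += page_size
    return results


# Stand-in data source so the sketch is runnable: 2,500 made-up hostnames.
_FAKE_RESULTS = [f"host{i}.example.com" for i in range(2500)]


def fake_fetch_page(offset: int, limit: int) -> List[str]:
    return _FAKE_RESULTS[offset:offset + limit]


if __name__ == "__main__":
    everything = fetch_all(fake_fetch_page, page_size=1000)
    print(len(everything))  # 2500
```

If a provider offers no way to go past its cap, no amount of client-side looping helps, which is exactly the limitation described above.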
Incomplete Results? Complete Potential For Big Mistakes
If your passive DNS solution can't give you a complete answer, what little you do receive as an answer may result in serious errors.
For example, assume you only get a hundred results when you search for passive DNS data about an IP address. All 100 of those domains might look unquestionably bad, perhaps showing clear signs of being phishing-related, or all being DGA (domain generation algorithm)-related. Based on that incomplete evidence, you block that IP address.
Once you've done so, however, you learn that while those 100 domains were in fact bad, there were thousands of other innocent domains on that same IP. You just didn't get to see them. Bummer for you, and sorry about the collateral damage.
Naturally, the converse is also a possibility: the first 100 domains you see for an IP might look and be great, but you might have many others, unreported due to low limits on returned results, that get a "free pass" they really don't deserve. Sorry about those false negatives, bub.
Bottom line, you need the ability to get ALL the results needed for you to be able to make a fully informed choice. Partial results are like trying to safely drive a car when the windshield is blacked out and you can only see out the side windows!
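The reasoning above can be turned into a simple guard: if the number of results equals the cap you asked for, treat the sample as possibly truncated and refuse to reach a verdict from it. This is a minimal sketch under stated assumptions; the `decide` function, its labels ("bad"/"ok") and its return values are hypothetical and not part of any Farsight tool.

```python
from typing import List


def decide(labels: List[str], limit: int) -> str:
    """Suggest an action for an IP from per-domain labels ("bad" or "ok").

    limit is the maximum number of results that was requested. Hitting it
    suggests more results exist that we never saw.
    """
    if len(labels) >= limit:
        return "need-more-data"   # sample may be truncated; do not decide yet
    if labels and all(label == "bad" for label in labels):
        return "block"            # complete view, and everything on it is bad
    return "review"               # complete view, but mixed or empty


if __name__ == "__main__":
    print(decide(["bad"] * 100, limit=100))      # need-more-data
    print(decide(["bad"] * 40, limit=100))       # block
    print(decide(["bad", "ok"], limit=100))      # review
```

The point of the design is that the first branch fires before any judgment about the domains themselves: a capped result set is treated as missing evidence, not as evidence.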
Bad Guys Attempting to "Go Stealth" By Overwhelming Fixed Limits
If the number of results your passive DNS solution returns is limited to a low number, some miscreants may be tempted to intentionally leverage this reality in an attempt to hide their bad behaviors from investigation with passive DNS. Specifically, if the bad guys have more base domains or more FQDNs than can be displayed by some passive DNS systems, they've effectively guaranteed that they've just gone at least partially "stealth" – they will have domains that they can use, but which investigators may not be able to see.
Standing Out From The Crowd – When You're Trying To Hide?
This is not a riskless strategy on the bad guys' part. While they may be able to use a plethora of base domains or large numbers of randomized subdomains to overload some passive DNS systems, that strategy is a very "noisy" one, and one that's easily detectable if you've got a non-output-limited view of passive DNS.
In fact, you could even imagine a product that reports the domains with the largest number of unique FQDNs seen per day, hour, or other period of time. If you're a bad guy trying to hide, you wouldn't want to end up on such a list.
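As a rough illustration of such a report, the sketch below counts unique FQDNs per base domain per day and ranks the noisiest. Everything here is an assumption made for the example: the `(day, fqdn)` input format, the function names, and especially the naive "last two labels" notion of a base domain (real code would need the Public Suffix List to handle names like example.co.uk).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def base_domain(fqdn: str) -> str:
    """Naive base domain: the last two labels. A simplification for this sketch."""
    labels = fqdn.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def top_noisy_domains(observations: Iterable[Tuple[str, str]],
                      n: int = 10) -> List[Tuple[str, str, int]]:
    """Rank (day, base domain) pairs by the number of unique FQDNs seen.

    observations: iterable of (day, fqdn) pairs.
    Returns up to n tuples of (day, base_domain, unique_fqdn_count), largest first.
    """
    seen: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for day, fqdn in observations:
        name = fqdn.rstrip(".").lower()          # normalize case and trailing dot
        seen[(day, base_domain(name))].add(name)
    ranked = sorted(
        ((day, dom, len(names)) for (day, dom), names in seen.items()),
        key=lambda row: (-row[2], row[0], row[1]),  # biggest count first, then stable order
    )
    return ranked[:n]


if __name__ == "__main__":
    # Made-up sample data: five random-looking subdomains versus one ordinary host.
    sample = [("2024-01-01", f"x{i}.noisy.example") for i in range(5)]
    sample += [("2024-01-01", "www.quiet.example"),
               ("2024-01-01", "WWW.quiet.example.")]
    for row in top_noisy_domains(sample, n=2):
        print(row)
    # ('2024-01-01', 'noisy.example', 5)
    # ('2024-01-01', 'quiet.example', 1)
```

Note that a report like this only works if the underlying passive DNS view is not itself capped; otherwise the counts would plateau at the cap.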
Not All Output Caps Are Bad: Output Limits As Basic Safety/Sanity Checks
We know that sometimes limits on output are deployed to "protect users from themselves." Without them, it can be easy to DoS oneself.
For example, passive DNS implementations that only have web interfaces often don't cope well with million-result responses: the user's browser or system can easily be overwhelmed and slow down or crash. In that case, limiting queries that might accidentally return an unexpectedly large number of results may actually be a self-defense measure. Farsight, in fact, limits its own DNSDB web interface for precisely that reason.
However, if you offer a command line interface (like Farsight's own dnsdb_query), or an API like Farsight's that can be directly integrated into your own custom code, those are the sort of options that can routinely cope with million-record responses, and that's why Farsight allows users to adjust the number of results that get returned, up to 1,000,000 per query.
And obviously if you have Farsight's DNSDB Export ("on premises") product, the sky's the limit.
Eliminating the Need For Million Record Responses
Farsight also gives you the tools you need to keep your output to a manageable size. At least in some cases, you may not need (or want) to see a million results.
Maybe you only want results for the last 90 days. Maybe you only want records of a particular type. When you're using Farsight's passive DNS system, you have the flexibility to ensure that you don't get responses that you don't need or want.
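To make the idea concrete, here is a hedged sketch of how a client might express "only TXT records, only the last 90 days, at most N results" as a single query URL. It only builds a string and sends nothing. The base URL, the `/lookup/rrset/name/...` path layout, and the `limit` and `time_last_after` parameter names reflect our understanding of the DNSDB API and should be treated as assumptions to verify against the current API documentation; `build_rrset_query` and its arguments are hypothetical names invented for this example.

```python
from typing import Optional
from urllib.parse import quote, urlencode

API_BASE = "https://api.dnsdb.info"  # assumption: check the current API docs


def build_rrset_query(name: str,
                      rrtype: Optional[str] = None,
                      limit: Optional[int] = None,
                      last_seen_within_days: Optional[int] = None,
                      now: int = 0) -> str:
    """Build (but do not send) a time-fenced, type-filtered rrset lookup URL.

    now is the current time in epoch seconds; it is passed in explicitly so
    the result is reproducible.
    """
    path = "/lookup/rrset/name/" + quote(name, safe="*.")
    if rrtype:
        path += "/" + quote(rrtype)          # restrict to one record type
    params = {}
    if limit is not None:
        params["limit"] = limit              # cap on the number of returned results
    if last_seen_within_days is not None:
        # Time fence: only records last seen after this epoch timestamp.
        params["time_last_after"] = now - last_seen_within_days * 86400
    url = API_BASE + path
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(build_rrset_query("*.example.com", rrtype="TXT", limit=1000000,
                            last_seen_within_days=90, now=1700000000))
    # https://api.dnsdb.info/lookup/rrset/name/*.example.com/TXT?limit=1000000&time_last_after=1692224000
```

Filtering on the server side like this keeps the response small without hiding anything you actually asked for, which is different from an arbitrary cap imposed on every query.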
If you find yourself evaluating passive DNS systems, be sure you don't accidentally overlook a potentially critical factor: the ability to get ALL the results you may want and need.
Make sure your passive DNS system is mature enough to deliver ALL the results your queries may generate, even if that's a very large number. Be sure your passive DNS solution will scale to meet your analytical needs.
Joe St. Sauver, Ph.D., is a Scientist with Farsight Security, Inc.