Limiting DNSDB Results: dnsdbq little ell vs big ell
By Joe St Sauver
dnsdbq is Farsight's popular command line client for Farsight's DNSDB and other passive DNS systems. It is distributed as source code that is easy to build, available from https://github.com/dnsdb/dnsdbq.
Some of dnsdbq's features may initially seem complex or confusing. For example, why does dnsdbq have both a little ell option (-l) and a big ell (-L) option for limiting responses? The manual page for dnsdbq describes both:
-l query_limit
    query for that limit's number of responses. If specified as 0 then the DNSDB API server will return the maximum limit of results allowed. If -l is not specified, then the query will not specify a limit, and the DNSDB API server may use its default limit.

[...]

-L output_limit
    clamps the number of objects per response (under -[R|r|N|n|i|f]) or for all responses (under -[fm|ff|ffm]) output to output_limit. If unset, and if batch and merge modes have not been selected with the -f and -m options, then the -L output limit defaults to the -l limit's value. Otherwise the default is no output limit.
Those may superficially seem quite similar (they're both limiting what we end up getting, right?), but, in fact, there are important differences.
II. Experimenting With Little Ell and Big Ell
For example, let's assume we want to look at ALL the results for an RRname query for www.uoregon.edu/CNAME, sorted in descending order by last-seen time (-S -k last). We see results that look like:

$ dnsdbq -r www.uoregon.edu/CNAME -S -k last
;; record times: 2019-02-22T04:08:40Z .. 2021-03-22T14:30:51Z (~2y ~29d)
;; count: 847635; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME drupal-hosting-web-cluster5-prod.uoregon.edu.

;; record times: 2019-02-22T01:10:03Z .. 2019-02-22T04:06:06Z (2h 56m 4s)
;; count: 146; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME drupal-hosting-web-cluster5.uoregon.edu.

;; record times: 2014-12-29T16:09:30Z .. 2019-02-22T01:01:40Z (~4y ~55d)
;; count: 1525954; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME drupal-cluster5.uoregon.edu.

;; record times: 2013-09-12T14:44:09Z .. 2014-12-29T16:14:57Z (~1y ~108d)
;; count: 1002955; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME wc-www.uoregon.edu.

;; record times: 2010-10-19T12:12:39Z .. 2013-09-12T14:43:56Z (~2y ~329d)
;; count: 1924809; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME uowc-www.uoregon.edu.
Now assume that we want to keep just the most recent result, perhaps for use in an example in some documentation. We MISTAKENLY attempt to get that result by adding dash little ell one:
$ dnsdbq -r www.uoregon.edu/CNAME -S -k last -l1
Query limited: Result limit reached
;; record times: 2013-09-12T14:44:09Z .. 2014-12-29T16:14:57Z (~1y ~108d)
;; count: 1002955; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME wc-www.uoregon.edu.
Hmm. That's not the result we expected! We wanted the MOST RECENT result, but actually get the 2nd-to-oldest result instead.
So what do we see if we use big ell instead of little ell?
$ dnsdbq -r www.uoregon.edu/CNAME -S -k last -L1
;; record times: 2019-02-22T04:08:40Z .. 2021-03-22T14:30:51Z (~2y ~29d)
;; count: 847635; bailiwick: uoregon.edu.
www.uoregon.edu. CNAME drupal-hosting-web-cluster5-prod.uoregon.edu.
There we go! That's what we wanted! So what was the difference? Simple:
- The little ell option (-l) limits the number of results the DNSDB API server returns. Only then does dnsdbq do whatever else you asked for "client side" (such as sorting), working with just the results it received. (Little ell avoids retrieving "unwanted" results "up front.")
- The big ell option (-L), on the other hand, applies its limit as the last thing dnsdbq does (after all sorting or other "client side" "magic" is done). This option merely controls what gets output; it does NOT attempt to prevent "unwanted" data from getting retrieved in the first place.
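The ordering difference can be sketched in a few lines of Python. This is only a model of the behavior described above, not dnsdbq's actual code: the function names are made up for illustration, the rows are the last-seen times and CNAME targets from the example output, and the order of the list (standing in for the order the server happens to return rows) is an assumption chosen to match the -l1 result shown earlier.

```python
# (last-seen time, CNAME target) pairs copied from the example output.
# ASSUMPTION: the list order stands in for the server's return order, and
# was chosen so that the first row matches what -l1 returned in the article.
SERVER_ROWS = [
    ("2014-12-29T16:14:57Z", "wc-www.uoregon.edu."),
    ("2021-03-22T14:30:51Z", "drupal-hosting-web-cluster5-prod.uoregon.edu."),
    ("2019-02-22T04:06:06Z", "drupal-hosting-web-cluster5.uoregon.edu."),
    ("2013-09-12T14:43:56Z", "uowc-www.uoregon.edu."),
    ("2019-02-22T01:01:40Z", "drupal-cluster5.uoregon.edu."),
]


def sort_by_last_seen_desc(rows):
    """Model of -S -k last: newest last-seen time first.

    ISO 8601 timestamps in the same format sort correctly as plain strings.
    """
    return sorted(rows, key=lambda row: row[0], reverse=True)


def little_ell(rows, limit):
    """Model of -l: truncate first (server side), then sort what is left."""
    return sort_by_last_seen_desc(rows[:limit])


def big_ell(rows, limit):
    """Model of -L: sort everything, then truncate the output (client side)."""
    return sort_by_last_seen_desc(rows)[:limit]


print(little_ell(SERVER_ROWS, 1)[0][1])  # wc-www.uoregon.edu.
print(big_ell(SERVER_ROWS, 1)[0][1])     # drupal-hosting-web-cluster5-prod.uoregon.edu.
```

In this model, little ell can only ever sort the rows that survived the cut, which is why the "most recent" row it prints is simply whichever row arrived first.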
At this point you might be tempted to (wrongly) say, "Well, I guess all I ever need is 'big ell' then, eh?" No. In point of fact, BOTH little ell and big ell play important roles.
For example, by default the results returned by DNSDB API are limited to 10,000 results. If you want or need more than that number, you need little ell to be able to ask for 100,000 or 500,000 or the maximum (a million) results, instead.
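As a rough illustration, the way the little ell value maps to a server-side cap might be modeled as below. The 10,000 default and one-million maximum are the figures given in the text; the function name is hypothetical, and the handling of a request above the maximum (clamping) is purely an assumption, since the real limits are decided by the DNSDB API server and may differ for a given account.

```python
# Figures taken from the text: a 10,000-result default and a one-million maximum.
DEFAULT_LIMIT = 10_000
MAX_LIMIT = 1_000_000


def effective_query_limit(little_ell=None):
    """Return the number of results a query would be capped at in this model."""
    if little_ell is None:
        # -l not given: the query carries no limit, so the server default applies.
        return DEFAULT_LIMIT
    if little_ell == 0:
        # -l0: ask the server for the maximum it allows.
        return MAX_LIMIT
    # ASSUMPTION: requests above the maximum are clamped rather than rejected.
    return min(little_ell, MAX_LIMIT)


for option in (None, 0, 500_000, 5_000_000):
    print(option, effective_query_limit(option))
```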
Big ell also plays a particularly important role when it comes to batched queries.
III. Little Ell, Big Ell, And Batched Queries
Normally, dnsdbq runs one query at a time. However, dnsdbq can also process a "batch" of queries using the -f option. In fact, we'll often use -f with -m to run up to 10 queries in parallel.
For example, perhaps you have a file of queries called ous.txt that you want to run, containing the lines:
$OPTIONS -l0 -L5000000
rrset/name/*.uoregon.edu
rrset/name/*.oregonstate.edu
rrset/name/*.pdx.edu
rrset/name/*.eou.edu
rrset/name/*.oit.edu
rrset/name/*.sou.edu
rrset/name/*.wou.edu
You could run those in "batch mode" by saying:
$ dnsdbq -fm < ous.txt > ous.output
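If the list of domains lives somewhere else (a spreadsheet, say), a batch file in the same shape as ous.txt can be generated rather than typed. The sketch below assumes only the layout shown in this article, an $OPTIONS line followed by one rrset/name/ wildcard query per line; the helper name build_batch_file is hypothetical.

```python
def build_batch_file(domains, query_limit, output_limit):
    """Return batch-file text: one $OPTIONS line, then one wildcard query per domain."""
    lines = [f"$OPTIONS -l{query_limit} -L{output_limit}"]
    lines += [f"rrset/name/*.{domain}" for domain in domains]
    return "\n".join(lines) + "\n"


# The seven domains used in the ous.txt example.
OREGON_DOMAINS = [
    "uoregon.edu",
    "oregonstate.edu",
    "pdx.edu",
    "eou.edu",
    "oit.edu",
    "sou.edu",
    "wou.edu",
]

batch_text = build_batch_file(OREGON_DOMAINS, 0, 5_000_000)
print(batch_text, end="")
```

Writing batch_text to a file named ous.txt would then give the input used in the command above.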
Note that because we're using -fm mode, the queries will be run concurrently ("in parallel") with the output from all the queries interleaved. We can set our options on the command line, or in the batch file itself, as we've done in this example. In "batch mode":
- Little ell establishes limits that you want to apply to EACH query in that batch.
- Big ell establishes limits that pertain to the COMBINED OUTPUT from all the queries in the batch run.
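To make the per-query versus combined distinction concrete, here is a small Python model. It is a sketch under simplifying assumptions, not a description of dnsdbq internals: the result sets are invented placeholders, the combined output is built by simple concatenation rather than true interleaving, and the special meaning of -l0 (ask for the maximum) is not modeled, so None is used to mean "no limit given."

```python
def run_batch(result_sets, query_limit=None, output_limit=None):
    """Model -l as a per-query cap and -L as a cap on the combined output."""
    combined = []
    for rows in result_sets:
        # -l: each query's answer is capped individually (server side).
        capped = rows if query_limit is None else rows[:query_limit]
        combined.extend(capped)
    # -L: the merged output is capped once, after all queries are combined.
    return combined if output_limit is None else combined[:output_limit]


# Hypothetical result sets for three queries; the row labels are placeholders.
batch = [
    [f"q1-row{i}" for i in range(8)],
    [f"q2-row{i}" for i in range(3)],
    [f"q3-row{i}" for i in range(6)],
]

print(len(run_batch(batch)))                                  # 17
print(len(run_batch(batch, query_limit=5)))                   # 13 (5 + 3 + 5)
print(len(run_batch(batch, query_limit=5, output_limit=10)))  # 10
```

The per-query cap shrinks each answer on its own, so the total still grows with the number of queries; only the combined cap puts a ceiling on the whole run.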
Because batch mode is so powerful, big ell can serve as a nice "safety switch" protecting you against accidentally doing something crazy, like asking for queries that can return (literally!) billions of results in aggregate!
A nice summary of little ell vs. big ell was provided by one of the authors of these features:
-l will be applied to each query in a batch. -L will be applied to the combined output of all those queries.
-l is processed server side, -L is processed client side.
-l happens before sorting, -L happens after sorting.
-l limits the size of (each) answer and prevents useless data from being sent. -L doesn't stop the full answer from being transmitted; dnsdbq just stops printing after the limit is reached.
We hope you find these features a useful addition to your dnsdbq analytic "arsenal."
Joe St Sauver is a Distinguished Scientist and Director of Research for Farsight Security, Inc.