Using 0mq to Plumb a Simple Intermediate Layer for a DNSDB Client/Server Application

I. Introduction

While it is easy enough to write a client application that directly accesses DNSDB API via HTTPS, imagine that you want to write an intermediate layer that will do some sort of pre-processing of DNSDB requests (or post-processing of DNSDB results) "in between" the client and DNSDB API. Conceptually:


Figure 1. Going Direct


Figure 2. Interposing An Intermediate Layer

What might an intermediate layer do? You might write code to pre-process queries going into DNSDB API, perhaps electing to:

  • Set (and then routinely apply) default options ("I always only want to see at most 2500 results from the last month")
  • Normalize internationalized names, changing Unicode names into the Punycode format that DNSDB needs
  • Build a Flexible Search "pipeline:" Given a supplied keyword, run Flexible Search to first find hits, then automatically "chase" those hits in DNSDB API Standard Search
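The first two pre-processing ideas above can be sketched in a few lines. This is a minimal sketch, not a definitive implementation: the helper names (`to_punycode`, `build_url`) and the default values are hypothetical, Python's built-in "idna" codec implements the older IDNA 2003 rules, and the `limit` and `time_last_after` query parameters should be checked against the DNSDB API documentation before you rely on them.

```python
import time
from urllib.parse import quote, urlencode

# Hypothetical local defaults; adjust to taste.
DEFAULT_LIMIT = 2500
DEFAULT_WINDOW_SECONDS = 30 * 24 * 60 * 60  # roughly "the last month"

BASE_URL = "https://api.dnsdb.info/dnsdb/v2/lookup/rrset/name/"

def to_punycode(name):
    """Convert a possibly-Unicode hostname to its ASCII (Punycode) form."""
    # The built-in "idna" codec follows IDNA 2003; names that are already
    # ASCII pass through unchanged.
    return name.encode("idna").decode("ascii")

def build_url(query, now=None):
    """Build a lookup URL from "name" or "name/RRTYPE", applying defaults."""
    name, _, rrtype = query.partition("/")
    path = quote(to_punycode(name.strip().lower()))
    if rrtype:
        path += "/" + quote(rrtype.upper())
    # Assumption: time fencing is expressed as an absolute Unix timestamp.
    now = int(time.time()) if now is None else now
    params = urlencode({
        "limit": DEFAULT_LIMIT,
        "time_last_after": now - DEFAULT_WINDOW_SECONDS,
    })
    return BASE_URL + path + "?" + params

print(build_url("bücher.example/a"))
```

Passing `now` explicitly is only there to make the function easy to test; in normal use you would leave it out.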

Similarly, you might use an intermediate layer to post-process output from DNSDB API, too. For example, you could write code to:

  • Change the output's default time zone: Don't like UTC? You could code an intermediate layer to convert reported times from UTC to some preferred local time zone (such as US Eastern or US Pacific)
  • Produce summaries (How many "A" records did I find? How many AAAA records? How many CNAMEs? etc.)
  • Condense full hostnames down to just effective 2nd-level domains
  • Enrich results with ASN information, geolocation data, information from your favorite blocklists, etc.
  • Filter record types that you're sick of seeing in your output, or apply a "kill file" systematically suppressing specific domains you've already evaluated and written off as irrelevant to your work.
  • Automatically execute comprehensive "offset queries:" run an initial query, then automatically run additional "offset" queries (if needed) to retrieve additional results, eventually returning up to four million consolidated and deduplicated results as a single "answer."
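Several of the post-processing ideas above (summaries, a "kill file," and friendlier timestamps) might look something like the following. Treat it as a sketch under stated assumptions: the function names are hypothetical, the kill file is modeled as a simple list of domains, and the sample input is just the JSON Lines output shown in Section VII.

```python
import json
from collections import Counter
from datetime import datetime, timezone

def parse_results(jsonl_text):
    """Return the "obj" payloads, skipping {"cond": ...} framing lines."""
    records = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "obj" in record:
            records.append(record["obj"])
    return records

def is_killed(rrname, kill_list):
    """True if rrname equals, or is a subdomain of, any kill-file entry."""
    name = rrname.rstrip(".").lower()
    return any(name == d or name.endswith("." + d) for d in kill_list)

def format_time(epoch_seconds, tz=timezone.utc):
    """Render a Unix timestamp as ISO 8601 text in the given time zone."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz).isoformat()

def postprocess(jsonl_text, kill_list=()):
    """Drop kill-file matches, then count the surviving record types."""
    kept = [r for r in parse_results(jsonl_text)
            if not is_killed(r["rrname"], kill_list)]
    summary = Counter(r["rrtype"] for r in kept)
    return kept, summary

SAMPLE = "\n".join([
    '{"cond":"begin"}',
    '{"obj":{"count":1299072,"time_first":1395869220,"time_last":1603327211,'
    '"rrname":"www.whitman.edu.","rrtype":"A","bailiwick":"whitman.edu.",'
    '"rdata":["199.89.174.11"]}}',
    '{"obj":{"count":940920,"time_first":1277387381,"time_last":1395938425,'
    '"rrname":"www.whitman.edu.","rrtype":"A","bailiwick":"whitman.edu.",'
    '"rdata":["199.89.174.13"]}}',
    '{"cond":"succeeded"}',
])

kept, summary = postprocess(SAMPLE)
for record in kept:
    print(record["rdata"], format_time(record["time_first"]),
          "to", format_time(record["time_last"]))
print(dict(summary))
```

To report times in a local zone instead of UTC, a different `tzinfo` object can be passed as `tz`.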

You could simply do many of these things in a single monolithic client application, but there may be advantages to decoupling some functions into a separate intermediate layer instead (even if that layer is just running on your own workstation):

  • Consistency: An intermediate layer can impose discipline on your analyses. Depending on what you implement, you'll never forget to apply local standards to the queries you make as part of an investigation: no more forgetting to check for additional results, no more forgetting to time fence appropriately, etc.
  • Seamlessly Support Multiple Client Interfaces: Maybe you use a locally written GUI client for some runs, but a locally written command line interface for other runs. If you handle everything "client-side," any local improvements need to be implemented repeatedly in each of the client applications you're supporting. Wouldn't you rather abstract that functionality and just do that work once, in a common intermediate layer?
  • Modular Development: By breaking your processing into modular "chunks," you can develop and test new features on a chunk-by-chunk basis, reducing opportunities for complex (and potentially poorly-understood) interactions.

II. Access Control Considerations

If you do choose to create and run an intermediate layer, it's critically important that you implement strong access controls to prevent your intermediate layer from inadvertently acting as a "backdoor" or "proxy" that enables unauthorized access to DNSDB.

If you're just running your intermediate layer on your own dedicated workstation, a simple solution may be to bind your application to localhost (e.g., IPv4 127.0.0.1) rather than to a routable IP address. That's the approach we'll illustrate in the code shown below.

If you need to run a more generally-available intermediate layer, you'll want to use strong authentication (perhaps something like PKI client certificate authentication) over a carefully-configured encrypted connection. The mechanics and complexity of appropriately doing that are beyond the scope of this article.

III. The Client-To-Intermediate-Layer Connection

There are many ways that one could create the necessary client-to-intermediate-layer connection, including simply by using standard Un*x sockets, but we wanted to use 0mq as implemented by pyzmq.

You may wonder "Why 0mq?" Well, 0mq is an intriguing alternative worthy of investigation for many reasons, including:

  • It's fast
  • It significantly reduces code complexity
  • It's broadly available (there are seemingly 0mq bindings for any language you might reasonably want to use)
  • It's lighter-weight than alternatives such as RabbitMQ
  • Recent versions of 0mq support appropriate cryptographic security (we won't be using crypto for this on-machine example, but check out this example)
  • 0mq licensing is appropriate for a broad range of applications

There are also some limitations associated with 0mq that you should be thinking about, including:

  • The messages you pass via 0mq are built in memory. If you were to create multi-GB messages to pass via 0mq, you'd want to make sure you have sufficient memory.
  • 0mq doesn't leverage crypto by default
  • 0mq represents another dependency, and we could (with more work) do without it for this example

On balance, however, we think 0mq makes sense for our example.

IV. The Intermediate-Layer to DNSDB API Connection

For the intermediate-layer-to-DNSDB-API connection, we're simply going to use pycurl, the Python interface to libcurl.

V. The Client

Because this is just a demonstration application, we're going to keep this bare bones. Here's our sample minimalist client:

$ cat simple-client.py
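#!/usr/local/bin/python3
import sys
import zmq

# The query (for example, www.whitman.edu/A) comes from the command line
if len(sys.argv) != 2:
    sys.exit("Usage: simple-client.py <name>[/<rrtype>]")
myarg = sys.argv[1]

# Connect a REQ (request) socket to the intermediate layer on localhost
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5556")

# Send the query, then block until the reply arrives
socket.send_string(myarg)
result = socket.recv_string()
print(result)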

Pretty short, isn't it? It just picks up the domain name to investigate from the command line, and then uses 0mq to send the query to the server we'll create next.
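One thing the client above doesn't handle: if the server isn't running, the receive call will simply wait forever. A hedged variant with a timeout might look like the sketch below. The function name `ask` and the five-second default are our own choices for illustration; `RCVTIMEO`, `SNDTIMEO`, and `LINGER` are standard 0mq socket options, and pyzmq raises `zmq.Again` when a timeout expires.

```python
import zmq

def ask(endpoint, query, timeout_ms=5000):
    """Send one query; return the reply text, or None if nothing arrives in time."""
    context = zmq.Context.instance()
    socket = context.socket(zmq.REQ)
    # Don't let unsent messages keep the socket alive after close()
    socket.setsockopt(zmq.LINGER, 0)
    # Give up on send/receive after timeout_ms milliseconds
    socket.setsockopt(zmq.SNDTIMEO, timeout_ms)
    socket.setsockopt(zmq.RCVTIMEO, timeout_ms)
    try:
        socket.connect(endpoint)
        socket.send_string(query)
        return socket.recv_string()
    except zmq.Again:
        # Timed out waiting on the intermediate layer
        return None
    finally:
        socket.close()

if __name__ == "__main__":
    reply = ask("tcp://127.0.0.1:5556", "www.whitman.edu/A")
    print(reply if reply is not None else "No reply (is the server running?)")
```

Because a REQ socket that has timed out is left mid-conversation, this sketch creates a fresh socket for every query rather than reusing one.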

VI. The "Server"

Our "server" (actually, our intermediate layer application) is longer (because it uses libcurl to connect to DNSDB API, and makes at least a rudimentary pass at one error condition), but it's still less than a page of code:

$ cat simple-server.py
#!/usr/local/bin/python3
import zmq
from pathlib import Path
from io import BytesIO
import pycurl

def make_query(fqdn):
    filepath = str(Path.home()) + "/.dnsdb-apikey.txt"
    with open(filepath) as stream:
        myapikey = stream.read().rstrip()
    
    url = "https://api.dnsdb.info/dnsdb/v2/lookup/rrset/name/" + str(fqdn)

    requestHeader = []
    requestHeader.append('X-API-Key: ' + myapikey)
    requestHeader.append('Accept: application/jsonl')

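    # Collect the response body in memory
    buffer = BytesIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.HTTPHEADER, requestHeader)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.perform()

    # Grab the HTTP status, then release the curl handle
    rc = c.getinfo(c.RESPONSE_CODE)
    c.close()
    body = buffer.getvalue()
    content = body.decode('iso-8859-1')
    # On success, return the JSON Lines body; otherwise return the HTTP
    # status code as a string so the caller can detect the failure
    if rc == 200:
        return content
    else:
        return str(rc)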

def main():
    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind("tcp://127.0.0.1:5556")

    while True:
        fqdn = socket.recv()
        fqdn2 = fqdn.decode("utf-8")
        # preprocess fqdn2 here if desired

        content = make_query(fqdn2)
        if content.isdigit():
            socket.send_string("Error making query! Return code = " + content)
        else:
            # postprocess content here if desired
            socket.send_string(content)

if __name__ == "__main__":
    main()
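A REP socket has to send a reply for every request it receives, so if make_query() raises an exception (a network failure inside pycurl, say, or a missing API key file) the server above will exit and the waiting client will never hear back. One possible way to guard against that is to funnel every request through a small wrapper that always returns some text. The wrapper name `handle_request` is hypothetical, and catching every exception is a deliberate simplification for a prototype.

```python
def handle_request(raw_request, query_fn):
    """Decode one request and run it, always returning text to send back."""
    try:
        query = raw_request.decode("utf-8")
        result = query_fn(query)
    except Exception as exc:  # deliberately broad: the client must get a reply
        return "Error making query! " + type(exc).__name__ + ": " + str(exc)
    # Same convention as the server above: an all-digit result is an HTTP status
    if result.isdigit():
        return "Error making query! Return code = " + result
    return result

# Inside the server's "while True:" loop, the body could then shrink to:
#     socket.send_string(handle_request(socket.recv(), make_query))

def always_fails(query):
    """Stand-in for make_query() that simulates a network failure."""
    raise RuntimeError("network down")

print(handle_request(b"www.whitman.edu/A", always_fails))
```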

VII. Running The Example Application

To try the above demonstration application:

  • Make both files executable:

    $ chmod a+rx simple-client.py
    $ chmod a+rx simple-server.py

  • Using your favorite editor, save your DNSDB API key in ~/.dnsdb-apikey.txt

  • Make sure it's not readable by others:

    $ chmod go-rwx ~/.dnsdb-apikey.txt

  • In one terminal window, run the server by saying:

    $ python3 simple-server.py

  • In another terminal window, try making a sample query (output shown below is manually wrapped for ease of reading):

    $ python3 simple-client.py www.whitman.edu/A
    {"cond":"begin"}
    {"obj":{"count":1299072,"time_first":1395869220,"time_last":1603327211,
    "rrname":"www.whitman.edu.","rrtype":"A","bailiwick":"whitman.edu.",
    "rdata":["199.89.174.11"]}}
    {"obj":{"count":940920,"time_first":1277387381,"time_last":1395938425,
    "rrname":"www.whitman.edu.","rrtype":"A","bailiwick":"whitman.edu.",
    "rdata":["199.89.174.13"]}}
    {"cond":"succeeded"}

  • If you want to make another query, just run simple-client.py again – the server will continue to run until you kill it by hitting control-C in the window where the server's running.

VIII. Limitations And Obvious Next Steps

The sample application shown above truly is just a skeleton, particularly when it comes to handling exceptions and being hardened against various attacks – it's NOT meant to be "sailor proof," it's just a proof of concept.
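As one small example of the hardening that's missing: the server pastes whatever the client sends straight onto the end of a DNSDB API URL. A first step might be to reject anything that doesn't look like "name" or "name/RRTYPE" before making the request. The pattern below is only a sketch with assumptions of our own (the `is_valid_query` name, the allowed character set, and the length cap are all illustrative); it expects names that are already in ASCII/Punycode form and does not cover every query form DNSDB accepts.

```python
import re

# One DNS label: a lone wildcard "*", or letters, digits, hyphen, underscore
_LABEL = r"(?:\*|[A-Za-z0-9_-]{1,63})"

_QUERY = re.compile(
    r"(?:" + _LABEL + r"\.)*" + _LABEL + r"\.?"   # name, optional trailing dot
    + r"(?:/[A-Za-z0-9-]{1,16})?"                 # optional /RRTYPE
)

MAX_QUERY_LENGTH = 270  # illustrative cap: 253-character name plus "/RRTYPE"

def is_valid_query(query):
    """True if query looks like "name" or "name/RRTYPE" and nothing else."""
    if len(query) > MAX_QUERY_LENGTH:
        return False
    return _QUERY.fullmatch(query) is not None

for candidate in ["www.whitman.edu/A", "../../etc", "example.com/A?limit=0"]:
    print(repr(candidate), is_valid_query(candidate))
```

In the server, a check like this would sit at the "preprocess fqdn2 here" comment, with an error string sent back to the client when it fails.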

That said, primitive as this brief example may be, it's still quite a powerful one – you now have a working framework that you can easily enhance to prototype the sort of things we discussed in Section I.

For part two of this series, let's try building a DNSDB Flexible Search regular expression enrichment pipeline and a global domain "kill file" that we can use to do filtering of our DNSDB output prior to delivery. We'll also handle displaying our output as something easier-to-read than just a blob of raw JSON Lines with times expressed in Un*x ticks.

Joe St Sauver is a Distinguished Scientist and Director of Research with Farsight Security, Inc.