Farsight's Network Message, Volume 2: Introduction to nmsgtool

This article is the second in a multi-part blog series introducing Farsight Security's NMSG suite. It is an introduction to nmsgtool and provides several useful recipes and examples.

Before reading this article, it is recommended that you read Farsight's Network Message, Volume 1: Introduction to NMSG and have a local installation of nmsgtool. To get the most from this article and be able to run all of the examples, an account on Farsight Security's Security Information Exchange (SIE) is recommended. If you don't have SIE, you can order it here. This article covers nmsgtool version 0.9.1.

What is nmsgtool?

To paraphrase the Unix manpage, nmsgtool is the command line interface to libnmsg and is a thin wrapper around libnmsg's I/O engine. It controls the transmission, storage, creation, and conversion of NMSG payloads.

NMSG Inputs and Outputs

The nmsgtool program is a single tool that takes input from a variety of sources, such as data streams from the network, packets captured from network interfaces, files, or even standard input, and makes the resulting NMSG payloads available to one or more outputs. The outputs are files in binary or human-readable (ASCII presentation) form, or binary payloads written to network sockets for transport. Without requiring a separate program for each function, nmsgtool handles many kinds of data processing, including serialization, fragmentation, compression, striping or mirroring, rolling file outputs, and running data processing programs on file outputs.

nmsgtool inputs can take the following forms:

  • A file containing binary NMSG data (e.g., the output of a previous nmsgtool command)
  • A socket that is plumbed to contain binary NMSG data
  • Reassembled IP datagrams from a pcap file
  • Reassembled IP datagrams from a network interface
  • A file containing ASCII presentation data

nmsgtool outputs can take the following forms:

  • Binary NMSG data to a file
  • Binary NMSG data to a network socket
  • ASCII presentation form data to a file (including stdout)

You can specify more than one of each.

nmsgtool Recipes

The following are a handful of useful nmsgtool recipes intended to showcase its functionality and demonstrate different ways to use the tool.

Read Data from SIE

A common use case for SIE customers is to use nmsgtool to read live SIE data directly from the wire and write the output to the screen.

    $ nmsgtool -C ch212 -c 1 -o -
    [72] [2015-02-03 13:35:29.678474903] [2:5 SIE newdomain] [a1ba02cf] [] []
    domain: s47rbh.xyz.
    time_seen: 2015-02-03 13:33:20
    rrname: s47rbh.xyz.
    rrclass: IN (1)
    rrtype: NS (2)
    rdata: ns1.51dns.com.
    rdata: ns2.51dns.com.

The above invocation reads a single NMSG payload [-c 1] from SIE Channel 212 (Newly Observed Domains) [-C ch212] and emits it to stdout as ASCII presentation data [-o -].

Note that if no outputs are specified, ASCII presentation to stdout [-o -] is the default behavior of nmsgtool. In future examples, it will be omitted.

Behind the scenes nmsgtool uses a configuration file called nmsgtool.chalias that contains channel number to IP address/UDP port mappings. When a channel number is specified on the command line, nmsgtool looks it up in the nmsgtool.chalias file and listens on the specified network socket.

The captured NMSG header is emitted first. Breaking this down, we have the following individual fields:

  • [72]: The message size in bytes
  • [2015-02-03 13:35:29.678474903]: A UTC timestamp with nanosecond resolution
  • [2:5 SIE newdomain]: The vendor ID and message type ID (2:5), followed by the vendor name (SIE) and message type name (newdomain)
  • [a1ba02cf]: The source identifier (optional)
  • []: The operator code (optional)
  • []: The group code (optional)
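
The field breakdown above can be sketched as a small shell helper that splits a presentation header line into its six bracketed fields. This is only an illustration: parse_nmsg_header is a hypothetical name, not part of nmsgtool, and it assumes the header always has exactly the six fields shown in this article.

```shell
# parse_nmsg_header LINE: split a presentation-form header into named fields.
# Hypothetical helper; assumes the six-field layout shown in this article.
parse_nmsg_header() {
    printf '%s\n' "$1" | awk '
        {
            sub(/^\[/, "")              # drop the leading "["
            sub(/\] *$/, "")            # drop the final "]" and trailing spaces
            n = split($0, f, /\] \[/)   # fields are separated by "] ["
            if (n != 6) exit 1          # not a header line we understand
            printf "size=%s\ntime=%s\ntype=%s\nsource=%s\noperator=%s\ngroup=%s\n",
                   f[1], f[2], f[3], f[4], f[5], f[6]
        }'
}

parse_nmsg_header '[72] [2015-02-03 13:35:29.678474903] [2:5 SIE newdomain] [a1ba02cf] [] []'
```

The optional operator and group fields come back empty for this sample, matching the empty brackets in the header.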

The message payload is a combination of key-value pairs. They follow a schema defined by the vendor and message type (in the above example, the vendor is SIE and the message type is newdomain). nmsgtool includes dynamically loadable modules that enable it to present this data as you see above and also enable NMSG-based programs or scripts to load the key-value pairs into structures. These concepts will be explained more rigorously in future NMSG articles.
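
Until those APIs are covered, the key-value layout of the presentation form is simple enough to pick apart with standard text tools. The following is a minimal sketch, assuming each payload line has the form "key: value" as in the output above; nmsg_field is a hypothetical helper name, not something nmsgtool provides.

```shell
# nmsg_field KEY: print every value of KEY found in presentation text on stdin.
# Hypothetical helper; assumes payload lines look like "key: value".
nmsg_field() {
    awk -v key="$1" '
        {
            i = index($0, ": ")                    # first "key: value" separator
            if (i > 0 && substr($0, 1, i - 1) == key)
                print substr($0, i + 2)            # everything after ": "
        }'
}

# Possible use against live data (requires SIE access):
#   nmsgtool -C ch212 -c 1 | nmsg_field rdata
```

Repeated keys, such as the two rdata lines in the newdomain example, are printed one value per line.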

Read and Write Binary NMSG Files

Another common use case for SIE customers is to read messages from SIE into a local binary file for later analysis.

    $ nmsgtool -C ch208 -c 100000 -w ch208.nmsg
    $ stat -c "%n %s" ch208.nmsg
    ch208.nmsg 15246581
    $ nmsgtool -r ch208.nmsg -c 1
    [72] [2015-02-01 00:07:53.596907788] [2:1 SIE dnsdedupe] [a1ba02cf] [] [] 
    type: EXPIRATION
    count: 2
    time_first: 2015-01-31 07:29:37
    time_last: 2015-01-31 07:29:37
    bailiwick: <redacted>
    rrname: <redacted>
    rrclass: IN (1)
    rrtype: A (1)
    rrttl: 43200
    rdata: <redacted>

The above invocation reads 100,000 NMSG payloads [-c 100000] from SIE Channel 208 (Passive DNS, deduplicated, verified, in-bailiwick) [-C ch208] and emits them as binary NMSGs to a file [-w ch208.nmsg]. The resulting file is about 15 megabytes.

That binary file is then read [-r ch208.nmsg] and a single NMSG payload [-c 1] containing a dnsdedupe message is emitted to stdout as ASCII presentation data.
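
Because every payload in presentation form begins with a header line, one rough way to confirm how many payloads a capture holds is to count those lines. The sketch below assumes the header layout shown earlier; count_nmsg_payloads is a hypothetical helper name, not an nmsgtool feature.

```shell
# count_nmsg_payloads: count header lines in presentation text on stdin.
# Hypothetical helper; assumes headers start with "[<size>] [" as shown above.
count_nmsg_payloads() {
    # -c prints the number of matching lines; the brackets are escaped so
    # grep matches them literally instead of as a bracket expression.
    grep -c -E '^\[[0-9]+\] \['
}

# Possible use against the capture from this recipe:
#   nmsgtool -r ch208.nmsg | count_nmsg_payloads
```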

Output Compression

nmsgtool can compress binary payload output (to a file or for emission across the network) using zlib compression (the same algorithm used by the ubiquitous gzip tool). To see an example of the on-disk storage benefit compression can offer, we compress the data captured in the previous recipe.

    $ nmsgtool -r ch208.nmsg -w ch208z.nmsg -z
    $ stat -c "%n %s" ch208z.nmsg
    ch208z.nmsg 6428829

The above invocation reads the binary file from the previous example [-r ch208.nmsg] and writes a new file [-w ch208z.nmsg], compressing each payload [-z]. The resulting file is just over six megabytes, a 58% decrease in file size. It is important to note that the compression is performed per payload, not across the entire file.
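
The 58% figure follows directly from the two file sizes reported by stat. As a quick check, here is a small sketch of the arithmetic; size_reduction is a hypothetical helper name used only for this illustration.

```shell
# size_reduction BEFORE AFTER: percentage decrease from BEFORE to AFTER bytes.
# Hypothetical helper for illustration only.
size_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

# Sizes taken from the stat output in this article; prints 57.8
size_reduction 15246581 6428829
```

With GNU stat, as used above, the sizes could also be supplied as "$(stat -c %s ch208.nmsg)" and "$(stat -c %s ch208z.nmsg)".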

Kicker Scripts and Output File Rolling

Another useful feature nmsgtool offers is the ability to perform automatic file rolling (rotation) based on timer expiry or payload count. Additionally, the user can specify a kicker command to run on output files after rotation.

Consider the following simple shell script:

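    echo "$1: $(nmsgtool -r "$1" | grep -cF "[2:1 SIE dnsdedupe]")"

The above script, count.sh, counts the number of dnsdedupe payloads in a binary NMSG file by counting matching header lines. The -F option makes grep treat the bracketed pattern as a literal string; without it, the square brackets would be read as a bracket expression matching any single one of the enclosed characters. The script is invoked in the following example: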

    $ nmsgtool -C ch202 -w ch202 -t 2 -z -k count.sh 
    ./ch202.20150202.0110.1422839406.364843292.nmsg: 3099450
    ./ch202.20150202.0110.1422839408.013136741.nmsg: 3384114
    ./ch202.20150202.0110.1422839410.024261700.nmsg: 3090827
    ./ch202.20150202.0110.1422839412.024284315.nmsg: 3100505
    ./ch202.20150202.0110.1422839414.033887391.nmsg: 3026627
    ./ch202.20150202.0110.1422839416.014162500.nmsg: 3208181

The above invocation reads payloads from SIE Channel 202 (Raw Passive DNS) [-C ch202] and writes compressed payloads [-z] to a binary file [-w ch202]. Every two seconds [-t 2] the file is closed, rotated, and the kicker script is run on the output file [-k count.sh]. The output from each count.sh invocation is the filename followed by the number of matching lines.
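
The rotated file names in the output above appear to follow the pattern prefix.YYYYMMDD.HHMM.seconds.nanoseconds.nmsg, where the seconds field is a Unix epoch time (1422839406 is 2015-02-02 01:10:06 UTC, which agrees with the 20150202.0110 portion). That layout is inferred from the sample output, not taken from the nmsgtool documentation. Under that assumption, a kicker script could recover the rotation timestamp as sketched below; rolled_file_epoch is a hypothetical helper name.

```shell
# rolled_file_epoch PATH: print "<seconds>.<nanoseconds>" from a rotated
# output file name. Hypothetical helper; the name layout is an assumption
# inferred from the sample output in this article.
rolled_file_epoch() (
    name=${1##*/}        # strip any leading directory such as "./"
    name=${name%.nmsg}   # strip the extension
    nsec=${name##*.}     # last dotted field: nanoseconds
    name=${name%.*}
    sec=${name##*.}      # next field: seconds since the Unix epoch
    printf '%s.%s\n' "$sec" "$nsec"
)

# Prints 1422839406.364843292
rolled_file_epoch ./ch202.20150202.0110.1422839406.364843292.nmsg
```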

Transfer NMSGs across the network

nmsgtool can be used to transfer NMSGs across an IPv4 or IPv6 network to either a unicast or broadcast address.

For this example, we instantiate two nmsgtool sessions on separate hosts. On the receiving host, we run nmsgtool as follows:

    $ nmsgtool -l

The above invocation listens for NMSGs on a network socket bound to UDP port 9430 [-l]. When NMSGs arrive, they are emitted as ASCII presentation data to stdout.

On the sending host, we run nmsgtool as follows:

    $ nmsgtool -r ch202.20150202.0110.1422839406.364843292.nmsg -c 2 -s

The above invocation reads two payloads [-c 2] from the binary NMSG file created in the previous example [-r ch202...]. They are written to the network, destined for UDP port 9430 on the receiving host [-s].

On the receiving host we see the following output:

    [293] [2015-02-02 01:08:21.902736000] [1:9 base dnsqr] [e9b019b8] [] [] 
    query_ip: <redacted>
    response_ip: <redacted>
    proto: UDP (17)
    query_port: 31211
    response_port: 53
    id: 7644
    qname: <redacted>
    qclass: IN (1)
    qtype: AAAA (28)
    rcode: NOERROR (0)
    delay: 0.182413
    udp_checksum: ABSENT
    [352] [2015-02-02 01:08:22.095911000] [1:9 base dnsqr] [e9b019b8] [] [] 

Two presentation format dnsqr NMSGs (redacted and cropped for publication) are emitted to stdout.

As a side note, the sender has options to tune network performance, including setting the NMSG container maximum transmission unit size (distinct from the IP MTU), buffering, and rate limiting.

Payload Striping vs Mirroring

When multiple outputs are specified, nmsgtool defaults to striping payloads across each output. However, nmsgtool can also be configured to mirror payloads to each output.

    $ nmsgtool -C ch211 -c 100 -o - -s --mirror
    [94] [2015-02-03 08:50:19.277158975] [2:5 SIE newdomain] [a1ba02cf] [] []

The above invocation reads 100 payloads [-c 100] from SIE Channel 211 (Newly Active Domains) [-C ch211] and mirrors them [--mirror] across two outputs: ASCII presentation to stdout [-o -] and a network socket destined for UDP port 9430 [-s].
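
To make the difference between the two modes concrete, here is a toy model in shell. It is not how nmsgtool is implemented: it assumes striping hands payloads to the outputs in simple round-robin order, which is an assumption made for illustration, and distribute is a hypothetical helper name.

```shell
# distribute MODE N: read one payload id per line on stdin and print
# "output<k> <id>" for each delivery across N outputs.
# Toy model only; round-robin striping is an assumption, not nmsgtool's code.
distribute() {
    awk -v mode="$1" -v n="$2" '
        mode == "mirror" {
            # mirroring: every output receives every payload
            for (k = 0; k < n; k++) print "output" k, $0
            next
        }
        {
            # striping: each payload goes to exactly one output
            print "output" ((NR - 1) % n), $0
        }'
}

printf '1\n2\n3\n4\n' | distribute stripe 2
printf '1\n2\n' | distribute mirror 2
```

In the striped run each output sees half of the payloads, while in the mirrored run both outputs see all of them.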

Input from a Network Interface or Pcap File with BPF filtering

Perhaps you'd like to create your own NMSG stream sourced from live network traffic. More specifically, you want only DNS traffic. To this end, you can configure nmsgtool to read IP datagrams directly from a network interface or a pcap file. Additionally, a BPF (Berkeley Packet Filter) expression can be specified to winnow packets. When receiving input from a network interface or a pcap file, nmsgtool requires the user to set the vendor and message type so it knows how to encode each payload.

    $ nmsgtool -i eth1 -V base -T dnsqr -b "udp port 53"
    [220] [2010-05-09 05:08:54.951124000] [1:9 base dnsqr] [00000000] [] []

The above invocation reads data from a network interface [-i eth1], winnows those packets to just UDP port 53 [-b "udp port 53"], encodes the data as base/dnsqr [-V base -T dnsqr], and emits it as ASCII presentation data to stdout.

Input from a pcap file is syntactically similar: just substitute [-p example.pcap] for [-i eth1].

Coming up

The next article in the NMSG series will examine low-level NMSG implementation details such as header composition and data encoding. Future articles will introduce the programming APIs.

Mike Schiffman is a Protocol Legerdemainist for Farsight Security, Inc.

Read the next part in this series: Network Message, Volume 3: Headers and Encoding