What's SIE Batch? Why Might I Be Interested In It?
By Joe St Sauver
Some people think Farsight's Security Information Exchange (SIE) is like "radar for the Internet," a way for you to watch what's happening online in near-real time. Others think of SIE as being like "having your own Internet sensor network," without having to go through the hassle of actually arranging to collect and process data from all over the Internet yourself.
Because SIE offers diverse types of Internet security data, data streams at SIE are divided into channels. Subscribers pick the SIE channel that's of interest to them. Some may be interested in DNS-related data, while others may be interested in spam samples or newly observed domain names. You can pick the channel (or channels) you're interested in, and ignore the rest.
This article describes the latest way to access SIE channels – SIE Batch. SIE Batch will debut at the RSA Conference next week on February 24th.
2. Understanding SIE Access Options
Once you subscribe to an SIE channel, you need a way to access that data. With the introduction of SIE Batch, you now have four options to choose from:
SIE LAN: In this option, a customer normally leases a blade server from Farsight. That blade server physically connects to SIE via one of its Ethernet interfaces. If you're planning on doing extensive data reduction at SIE (rather than at your own location), or you want to routinely work with some of our highest volume channels, SIE LAN is an excellent option.
SIE Remote Access (aka "SRA"): In the SRA model, an encrypted tunnel is used to bring SIE data to a remote user. This is convenient if you don't need continuous access, if you're working with lower-volume channels, or if you want to feed data directly to data reduction systems that already exist at your own site. Once the tunnel has been instantiated, you can work with the data on the tunnel the same way you'd work with SIE data from a directly connected blade.
AXAMD: AXAMD adds a RESTful interface to SRA.
SIE Batch: SIE Batch is the new option to access data on an SIE channel. In a nutshell, subscribers can just visit a web page, click on a button, and download a recent sample of data for a given channel they subscribe to (where that sample might be anywhere from a few minutes to 12 hours or more in length). Users can also directly download data for an arbitrary period of time from the data that's available. There's also an SIE Batch API that will let you easily access SIE data (in file format) from a program or script.
This article focuses on the fourth of those options, SIE Batch.
3. A "Quick Download" Example of Using The SIE Batch GUI
Sometimes you just want to "dunk a bucket into the river of data" that's flowing past. SIE Batch is great for that sort of "pull me a sample" process. To try it:
a. Contact Farsight Security Sales to arrange for access to one or more SIE channels of interest. The Farsight Security Sales Team can be reached by email at email@example.com or by phone at +1-650-489-7919.
Be sure to mention that you want to access that channel via SIE Batch. Access to the SIE Batch GUI, API, and related API documentation is available only to current subscribers.
b. Once you've received an SIE Batch API key from Farsight, visit the SIE Batch login page in Firefox or Chrome.
You should see:
Cut and paste your SIE Batch API key into the box and click LOGIN. (Note that this API key is NOT the same as the DNSDB API key you may already have.)
c. You should then see a screen that looks like:
d. If you want to see the size of the file that would be downloaded for any given time duration, hover over the relevant time button:
e. If the reported length looks okay, click the button to download the file. In Firefox, you'll see a popup that looks like:
Hit return (or click OK) to save that file.
The resulting file (in this case) is newline-delimited JSON, and can be processed with jq or another JSON manipulation tool of your choice. You may notice that the filename, while "complex looking," is actually built from the channel name plus the starting datetime and the ending datetime.
Anyhow, this is a great way to grab a quick sample of what's currently going on in a channel.
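For readers who'd rather inspect a downloaded newline-delimited JSON file from Python than from jq, here is a minimal sketch. It assumes only that each non-blank line is one JSON object; the function names are hypothetical, and no particular field names are assumed because record layout varies by channel.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_top_level_keys(lines: Iterable[str]) -> Counter:
    """Count how often each top-level key appears across all records."""
    counts = Counter()
    for record in iter_records(lines):
        # Assumption: every record is a JSON object (a dict), not a list.
        counts.update(record.keys())
    return counts


# Hypothetical usage with a file you downloaded (the path is a placeholder):
#
#     with open("your_downloaded_file.json", encoding="utf-8") as handle:
#         for key, n in count_top_level_keys(handle).most_common():
#             print(n, key)
```

Tallying the top-level keys is a quick way to see what a channel's records look like before deciding which fields to extract.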
4. A "Direct Download" Example
Other times you may be routinely downloading data from a channel via one of the other mechanisms SIE uses (such as SRA), but find that you need to re-download data for a specific period, perhaps because the system that you normally use to download data unexpectedly ran out of disk space (oops!). If that's the case, the "Direct Download" option can potentially "save your bacon."
SIE Batch's "Direct Download" option will let you specify a starting and ending time of interest within the data that's currently cached. For example, perhaps you want to download data for channel 212 for the period from 15:35 on February 18th, 2020 to 10:57 on February 19th, 2020 (both local time). You'd specify that channel and those times, click START to set up the download, and then click DOWNLOAD to actually perform it.
It's really just that easy!
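Before requesting a window like the one in the example, it can be handy to sanity-check how much time it actually spans. The following is a small sketch of that arithmetic using the example's start and end times; the function name is hypothetical, the times are treated as naive local times, and nothing here talks to SIE Batch itself.

```python
from datetime import datetime, timedelta


def window_length(start: datetime, end: datetime) -> timedelta:
    """Return the length of a download window, rejecting reversed ranges."""
    if end <= start:
        raise ValueError("end time must be after start time")
    return end - start


# The example window from the text (naive local times).
start = datetime(2020, 2, 18, 15, 35)
end = datetime(2020, 2, 19, 10, 57)
length = window_length(start, end)
print(length)  # 19:22:00
```

So the example window covers 19 hours and 22 minutes of channel data.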
5. The File I Downloaded Is An NMSG Binary File!
While the sample file shown in sections 3 and 4 was in newline-delimited JSON format, some datasets may be distributed in the highly efficient NMSG binary format instead. In those cases, nmsgtool can be used to convert the downloaded files. If you're using Debian Linux, you can install nmsgtool as a package. Source code is also available for those who prefer to build from source, or for use on systems other than Debian Linux.
To see data decoded from NMSG format into normal presentation format:
$ nmsgtool -r yourNMSGfile.nmsg
To convert the file to JSON lines format:
$ nmsgtool -r yourNMSGfile.nmsg -J -
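If you'd like to do that conversion from inside a script, one possible approach is to run the same nmsgtool command shown above as a subprocess and parse each JSON line it writes to standard output. This is only a sketch: it assumes nmsgtool is installed and on your PATH, and the Python function names are hypothetical.

```python
import json
import shutil
import subprocess
from typing import Iterator, List


def nmsgtool_json_command(nmsg_path: str) -> List[str]:
    """Build the argument list for: nmsgtool -r <file> -J -"""
    return ["nmsgtool", "-r", nmsg_path, "-J", "-"]


def iter_nmsg_as_json(nmsg_path: str) -> Iterator[dict]:
    """Run nmsgtool and yield each JSON line it emits as a dict."""
    if shutil.which("nmsgtool") is None:
        raise FileNotFoundError("nmsgtool was not found on PATH")
    with subprocess.Popen(
        nmsgtool_json_command(nmsg_path),
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        for line in proc.stdout:
            if line.strip():
                yield json.loads(line)
    # Leaving the "with" block waits for nmsgtool to exit.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, proc.args)


# Hypothetical usage (requires nmsgtool and a real NMSG file):
#
#     for record in iter_nmsg_as_json("yourNMSGfile.nmsg"):
#         print(record)
```

Streaming the output line by line avoids holding an entire converted file in memory, which matters for larger downloads.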
6. What About The SIE Batch API?
Developers interested in integrating SIE Batch in a script or program should see the detailed information available online. We will also show examples of working with the SIE Batch API in future blog posts.
We hope you've enjoyed learning a little about SIE Batch, and we'd love to hear what you think if you give SIE Batch a try! Please feel free to contact Farsight at firstname.lastname@example.org.
I want to extend a special thank you to my colleague, Tyler Wood, for his assistance in reviewing and helping to finalize this article.
Joe St Sauver Ph.D. is a Distinguished Scientist with Farsight Security®, Inc.
Read the next part in this series: SIE Batch API: A libcurl example in C ("sie_get")