Creating Pandas Dataframes and CSV Files from DNSDB API Output Using Python3

I. Introduction

Many DNSDB API users rely on Python3 and Pandas for their DNS data science. The basic input to many analyses in that framework is the dataframe.

Some may wonder how they can retrieve data from DNSDB API and turn it into a Pandas dataframe. If you've got that question, this article is for you.

There's one other audience that may also be interested in this article: folks who still use general-purpose tools such as spreadsheets to do their runs. Those users often find that comma-separated values ("CSV") files are the easiest way to move data around. Given that reality, we'll also show you how to make a CSV file from DNSDB API data. (Of course, you could also just use dnsdbq -p csv to get CSV output directly from DNSDB API.)

The sample code we show below:

  • Retrieves some data from DNSDB API using Requests III

  • Processes the DNSDB API JSON Lines output with the Python JSON library, printing each record and appending it to a Python3 list/array

  • Creates a Pandas dataframe from the list/array

  • Finally, writes a CSV file from the Pandas dataframe.

We assume you already have the DNSDB API key you'll need to do runs in DNSDB API. If you need an API key, see https://www.farsightsecurity.com/get-started/

II. The Code

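In order to run the following code, you'll need to have Python3 installed, along with the Requests and Pandas libraries that the code imports. (The json module it also imports is part of the Python3 standard library.)

Once you've installed those dependencies (if they aren't already installed), use your favorite editor to create a file that looks like: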

$ cat read-dnsdb-api-to-pandas-df-and-csv.py
#!/usr/local/bin/python3 

import requests
import json
import pandas as pd

# select JSON Lines output, and supply a valid API key
myheaders = {'Accept': 'application/json', 'X-API-Key': 'yourAPIkeyGoesHere'}

# API endpoint per https://api.dnsdb.info   
url = 'https://api.dnsdb.info/lookup/rrset/name/' + 'www.reed.edu/A'

# optional parameters
myparams = {'limit': '1000', 'humantime': 'true'}

# make the request
r = requests.get(url, headers=myheaders, params=myparams, stream=True)

# how did it go?
print ("status code=",r.status_code, "\n")

# print what we got, and save it as an array (aka a "list")
myarray = []
for line in r.iter_lines(decode_unicode=True):
   if line: 
      myline=json.loads(line)
      print(myline)
      myarray.append(myline)

# prefer a Pandas dataframe to a list?
mydf = pd.DataFrame(myarray)

# want to save a copy as a CSV file?
mydf.to_csv("mysheet.csv", header=True, index=False)

Note that for the purpose of this quick example, we simply hardcoded a domain for our query, and we made no attempt to "handle" any error codes beyond displaying the status code. (You should see a status code of 200, meaning "success," when you run this with your own key.)

We also ran only a very straightforward query; we did not attempt to demo all the bells and whistles. We just wanted to get most people rolling.
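If you would like a little more safety than the quick example provides, the following is one possible sketch of the error handling it skips. The helper names check_status and parse_json_lines are hypothetical (they are not part of Requests, Pandas, or DNSDB API), and the sketch treats only a 200 status as success, since that is the only code discussed here.

```python
import json


def check_status(status_code):
    """Raise an error unless the request reported 200 ("success").

    Assumption: any other status code is treated as a failure.
    """
    if status_code != 200:
        raise RuntimeError(
            f"DNSDB API request failed with status code {status_code}"
        )


def parse_json_lines(lines):
    """Parse an iterable of JSON Lines strings.

    Returns (records, bad_line_numbers). Blank lines are skipped, and
    lines that are not valid JSON are reported by their 1-based position
    instead of stopping the whole run.
    """
    records = []
    bad_line_numbers = []
    for number, line in enumerate(lines, start=1):
        if not line or not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad_line_numbers.append(number)
    return records, bad_line_numbers
```

In the sample script, you might call check_status(r.status_code) right after making the request, and then replace the for loop with myarray, bad = parse_json_lines(r.iter_lines(decode_unicode=True)).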

III. Running The Sample and Checking Out the Output

Let's make sure you really are "rolling." After creating the file shown above, run it with python3:

$ python3 read-dnsdb-api-to-pandas-df-and-csv.py
status code= 200 

{'count': 971009, 'time_first': '2010-06-24T17:12:52Z', 'time_last': '2019-05-29T16:37:26Z', 'rrname': 'www.reed.edu.', 'rrtype': 'A', 'bailiwick': 'reed.edu.', 'rdata': ['134.10.2.252']}
{'count': 13115, 'time_first': '2019-05-29T16:39:54Z', 'time_last': '2019-07-12T18:09:35Z', 'rrname': 'www.reed.edu.', 'rrtype': 'A', 'bailiwick': 'reed.edu.', 'rdata': ['134.10.50.30']}

The CSV file looks like:

$ cat mysheet.csv
bailiwick,count,rdata,rrname,rrtype,time_first,time_last
reed.edu.,971009,['134.10.2.252'],www.reed.edu.,A,2010-06-24T17:12:52Z,2019-05-29T16:37:26Z
reed.edu.,13115,['134.10.50.30'],www.reed.edu.,A,2019-05-29T16:39:54Z,2019-07-12T18:09:35Z
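Notice that the rdata column in the CSV file still carries Python list notation (the square brackets and quotes), because each record's rdata value is a list. If your spreadsheet would be happier with plain text in that column, here is a minimal sketch of one way to flatten it before building the dataframe. The function name flatten_rdata and the choice of a space as the separator are our own assumptions for illustration.

```python
def flatten_rdata(records, separator=" "):
    """Return copies of the records with list-valued rdata joined into one string.

    The original records are left untouched.
    """
    flattened = []
    for record in records:
        row = dict(record)  # shallow copy so the caller's dict is not modified
        rdata = row.get("rdata")
        if isinstance(rdata, list):
            row["rdata"] = separator.join(str(item) for item in rdata)
        flattened.append(row)
    return flattened


# Two records shaped like the output shown above (trimmed to a few fields)
sample = [
    {"count": 971009, "rrname": "www.reed.edu.", "rrtype": "A",
     "rdata": ["134.10.2.252"]},
    {"count": 13115, "rrname": "www.reed.edu.", "rrtype": "A",
     "rdata": ["134.10.50.30"]},
]

for row in flatten_rdata(sample):
    print(row["rdata"])
```

With this in place, the sample script could build its dataframe with pd.DataFrame(flatten_rdata(myarray)) rather than pd.DataFrame(myarray), and the rdata column would then hold 134.10.2.252 instead of ['134.10.2.252'].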

IV. Conclusion

The skeletal example shown above should be enough to show you how to retrieve results from DNSDB API using Python3 and Requests III.

We've shown you how to read that data into a Python list/array, how to turn that list/array into a Pandas dataframe, and how to write a copy of that data to a CSV file, perfect for those of you who may still prefer to work with spreadsheets.

For more information on how to subscribe to DNSDB API, please contact Farsight Security Sales at sales@farsightsecurity.com or give them a call at +1-650-489-7919.

Joe St Sauver Ph.D. is a Distinguished Scientist with Farsight Security®, Inc.
