Thursday, February 7, 2019

Splunk and ELK – Impartial Comparison Part II - Differences


By Tony Lee

In the first part of our series (http://securitysynapse.blogspot.com/2019/02/splunk-and-elk-impartial-comparison-part-i.html), we discussed the similarities between Splunk and the ELK stack.  Part II discusses some of the differences, framed as limitations of each platform.  Not all of these are deal breakers, and they cannot be scored one for one in terms of importance, but it is good to know the differences before implementing one platform vs. the other.  We welcome readers to chime in with their own limitations (or corrections) as well.  We will start with the Splunk limitations and then follow up with the ELK limitations.  Remember, these are not weighted equally in importance (that is determined by the end user), so we are not declaring a winner.

Splunk Limitations

- ELK can easily create dynamically named indexes and keys; Splunk cannot.
- ELK can search on a wildcarded key, for example, searching host.*=foo (see the hedged sketch after this list).
- ELK provides Dev Tools → Console, a useful way to run commands against the Elasticsearch instance from the Kibana GUI.
- Splunk does not provide relevance weighting such as ELK's _score field.
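
As a quick illustration of the wildcarded-key and Dev Tools points above, here is a hedged sketch of what such a search might look like from Kibana's Dev Tools → Console. The logs-* index pattern and host.* field names are hypothetical placeholders, not something from a real deployment:

GET logs-*/_search
{
  "query": {
    "query_string": {
      "fields": ["host.*"],
      "query": "foo"
    }
  }
}

Splunk has no direct equivalent for matching against a wildcarded field name in this way.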

ELK limitations

- ELK does not allow piping of search commands to create more complex commands. This is one of the most difficult differences to overcome when transitioning from Splunk to ELK (see the hedged sketch after this list).
- Splunk is considered "schema on read," which means you can throw pretty much anything at it and it may auto-parse or can be parsed later. ELK requires more upfront parsing to make use of the data.
- There is no central manager for Beats agents; Splunk includes a deployment server for free, which manages Universal Forwarders.
- discuss.elastic.co closes threads after 60 days of inactivity, whereas Splunk Answers never closes a thread, so users can contribute at any time. This helps prevent duplicate entries and stale, worthless data.
- Installation of Splunk can be completed in minutes; ELK takes much more time and is more dependent on the versions of each component since there is no unified installer.
- Kibana can only sort on numeric fields, not alphabetical fields.
- Splunk appears to have more mathematical/statistical functions out of the box.
- ELK has a separate Beat for collecting different sources/components of a system; Splunk has a single Universal Forwarder that can collect different data sources through a flexible configuration file.
- The ELK time range selector is missing a Quick → All time option.
- ELK may introduce significant "breaking changes" in new version releases, which can leave some customers stuck on a particular version of the platform. Splunk seems to be very careful not to do this; it is rare and often not as limiting when it does occur.
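
To illustrate the piping difference, consider counting events by status and sorting the result. The sketch below is hedged: the index names, sourcetype, and field are hypothetical examples.

Splunk (SPL), with commands piped together:

index=web sourcetype=access_combined | stats count by status | sort -count

Elasticsearch (via Dev Tools → Console), where the same question is expressed as a single aggregation request rather than a pipeline:

GET web-*/_search
{
  "size": 0,
  "aggs": {
    "count_by_status": {
      "terms": { "field": "status" }
    }
  }
}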


Conclusion

This should serve as an initial list of limitations for both platforms.  Again, we will not declare a winner because some of these limitations may not matter to a given end user; however, it is good to get the list out in the open for discussion.  Both platforms are always looking for ways to innovate and improve the customer experience.  Lists like these are often a good starting point, and competition is definitely a good thing. If you have a correction, please keep it constructive and it will get posted in the comments section below.  Thanks for reading. 😉

Tuesday, February 5, 2019

Splunk and ELK – Impartial Comparison Part I - Similarities


By Tony Lee

This series is not intended to start a “Big Data” holy war, but instead hopefully offer some unbiased insight for those looking to implement Splunk, ELK, or even both platforms.  After all, both platforms are highly regarded for their abilities to collect, parse, analyze, and display log data.  In fact, the first article in this series will show how the two competing technologies are similar in the following areas:
  • Purpose
  • Architecture
  • Cost

Caveat

Most articles on this subject seem to have some sort of agenda to push folks in one direction or another—so we will do our absolute best to keep it unbiased. We admit that we know Splunk better than we know the ELK stack, so we are banking on ELK (and even Splunk) colleagues and readers to help keep us honest. Lastly, our hope is to update this article as we learn or receive more information and the two products continue to mature.

Similar Purpose

Both Splunk and the ELK stack are designed to be highly efficient at log collection and search while allowing users to create visualizations and dashboards.  The similar goal and purpose of the two platforms naturally means that many of the concepts are also similar.  One minor annoyance is that the concepts are referred to by different names.  Thus, the table below should help those who are familiar with one platform map ideas and concepts to the other.


Splunk                              ELK Stack
Search Head                         Kibana
Indexer                             Elasticsearch
Forwarder                           Logstash
Universal Forwarder                 Beats (Filebeat, Metricbeat, Packetbeat, Winlogbeat, Auditbeat, Heartbeat, etc.)
Search Processing Language (SPL)    Lucene query syntax
Panel                               Panel
Index                               Index
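
As a small illustration of the SPL-to-Lucene mapping, the same basic filter might look like this in each platform (the index, sourcetype, and field names below are hypothetical examples):

Splunk (SPL):      index=firewall sourcetype=cisco:ios action=blocked src_ip=10.0.0.5
Kibana (Lucene):   action:blocked AND src_ip:"10.0.0.5"    (with a firewall-* index pattern selected)

Note that SPL bundles index selection into the search itself, while Kibana scopes the search through the selected index pattern.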


Similar Architecture

In many ways, even the architecture of Splunk and ELK is very similar.  The diagram below highlights the key components along with the names of each component in both platforms.

Figure 1:  Architectural similarities

Cost

This is also an area where there are more similarities than most would imagine due to a misconception that ELK (with comparable features to Splunk) is free.  While the core components may be free, the extensions that make ELK an enterprise-scalable log collection platform are not free—and this is by design.  According to Shay Banon, Founder, CEO and Director of Elasticsearch:

“We are a business. And part of being a business is the belief that those businesses who can pay us, should. And those who cannot, should not be paying us. In return, our responsibility is to ensure that we continue to add features valuable to all our users and ensure a commercial relationship with us is beneficial to our customers. This is the balance required to be a healthy company.”

Elastic does this by identifying “high-value features and to offer them as commercial extensions to the core software. This model, sometimes called ‘open core’, is what culminated in our creation of X-Pack. To build and integrate features and capabilities that we maintain the Intellectual Property (IP) of and offer either on a subscription or a free basis. Maintaining this control of our IP has been what has allowed us to invest the vast majority of our engineering time and resources in continuing to improve our core, open source offerings.”


That said, which enterprise-critical features are not included in the open source or even basic free license?  The subscription comparison screenshot below shows that one extension not included for free is Security (formerly Shield).  This includes encrypted communications, Role-Based Access Control (RBAC), and even authentication.  Most would argue that an enterprise needs a login page and the ability to control who can edit vs. view searches, visualizations, and dashboards; thus, it is not a fair comparison to say that Splunk costs money while ELK is free.  There are alternatives to X-Pack, but we will leave those to another article since they are not officially developed and maintained as part of the ELK stack.

Figure 2:  Encryption, RBAC, and even authentication are not free

In terms of how much Splunk costs vs. ELK, there are also many arguments, some of which involve the cost of build time, maintenance, etc.  It also depends largely on your ability to negotiate with each vendor.

Conclusion

Splunk and the ELK stack are similar in many ways.  In fact, knowing one platform can help a security practitioner learn the other because many of the concepts are close enough to transfer.  The reduction in the learning curve is a huge advantage for those who need to convert from one platform to the other.  That said, there are differences; we will discuss those in the next article.  In the meantime, we hope that this article was useful for you. We are open to feedback and corrections, so feel free to leave your comments below.  Please note that any inappropriate comments will not be posted—thanks in advance.  😊

Wednesday, January 30, 2019

rsyslog fun - Basic Splunk Log Collection and Forwarding - Part II

By Tony Lee

Welcome to part II in our series covering how to use rsyslog to route and forward logs to Splunk. Please see Part I of the series (http://securitysynapse.blogspot.com/2019/01/rsyslog-fun-basic-splunk-log-collection-part-i.html) for the basics in opening ports, routing traffic by IP address or hostname, and monitoring files to send the data on to Splunk Indexers. As a reminder, choosing between rsyslog, syslog-ng, or other software is entirely up to the reader and may depend on their environment and approved/available software. We also realize that this is not the only option for log architecture or collection, but it may help those faced with this task—especially if rsyslog is the standard in their environment. That said, let's look at some more advanced scenarios concerning file permissions, routing logs via regex, and routing logs via ports. We will wrap up with some helpful hints on a possible method to synchronize the rsyslog and Splunk configuration files.

File Permissions

There are times when you may need to adjust the file permissions for the files that rsyslog writes to disk. For example, if you follow best practice and run the Splunk Universal Forwarder as a lower-privileged account, that account will need access to the log files.  Placing the following rsyslog.conf directives at the top of the configuration file will change the permissions on the directories and files created.  The following example creates directories with permissions of 755 and files with permissions of 644:



# Relax the process umask so the create-mode directives below apply as written
$umask 0000
# Directories created by rsyslog get 755; files get 644
$DirCreateMode 0755
$FileCreateMode 0644
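
A quick way to confirm the permissions took effect is to test access as the forwarder's service account. This is only a hedged sketch: it assumes the UF runs as a (hypothetical) local user named splunk and that rsyslog has already written files under /rsyslog/ as described in Part I. Replace the placeholders with a real host directory and file:

sudo -u splunk ls -l /rsyslog/cisco/<host>/
sudo -u splunk head -1 /rsyslog/cisco/<host>/<date>.log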



Routing logs via Regex

Another more advanced rsyslog option is the ability to drop or route data at the event level via regex. For example, maybe you want to drop certain events, such as teardown messages generated by Cisco ASAs. Note: this rsyslog capability is useful because we are using Splunk Universal Forwarders in our example rather than Splunk Heavy Forwarders.

Or maybe you have thousands of hosts and do not want to maintain a giant list of IP addresses in an if-statement. For example, maybe you want to route thousands of Cisco Meraki host events to a particular file via a pattern match.

Possibly even more challenging would be devices in a particular CIDR range that end in a specific octet.

These three examples are covered in the rsyslog.conf snippet below:



#Drop Cisco ASA teardown packets (the "~" action discards the matching message)
:msg, contains, ": Teardown " ~
& stop

#Route Cisco Meraki hosts to a specific directory (the ciscoMerakiFile template is defined elsewhere)
if ($msg contains ' events type=') then ?ciscoMerakiFile
& stop

#ICS devices in 10.160.0.0/11 with a last octet of .150
:fromhost-ip, regex, "10\.\\(1[6-8][0-9]\\|19[0-1]\\)\..*\.150" -?icsDevices
& stop



Routing logs via Port

We just showed how to route events via regex; however, that can be inefficient, especially at high events per second. If you are fortunate enough that the source sending the data can send to a different port, it may be worth routing data to different files based on port.  The example configuration below uses ports 6517 and 6518.



#Dynamic template names
template(name="file6517" type="string" string="/rsyslog/port6517/%FROMHOST%/%$YEAR%-%$MONTH%-%$DAY%.log")
template(name="file6518" type="string" string="/rsyslog/port6518/%FROMHOST%/%$YEAR%-%$MONTH%-%$DAY%.log")

#Rulesets
ruleset(name="port6517"){
    action(type="omfile" dynafile="file6517")
}

ruleset(name="port6518"){
    action(type="omfile" dynafile="file6518")
}

#Listeners bound to each ruleset
input(type="imtcp" port="6517" ruleset="port6517")
input(type="imtcp" port="6518" ruleset="port6518")
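
To confirm the port-based routing works, one hedged option is to send a test message to each port with the util-linux logger utility (flags vary by logger version; <rsyslog_server> is a placeholder) and then verify that a file appears under the matching /rsyslog/port6517/ or /rsyslog/port6518/ directory:

logger -n <rsyslog_server> -T -P 6517 "test message for port 6517"
logger -n <rsyslog_server> -T -P 6518 "test message for port 6518"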



Synchronizing Multiple Rsyslog Servers

Since our architecture in part I outlined using a load balancer and multiple rsyslog servers, we will eventually need a way to synchronize the configuration files across the multiple rsyslog servers.  The example below provides two bash shell scripts to perform just that task. The first one will synchronize the rsyslog configuration and the second will synchronize the Splunk configuration--both scripts restart the respective service. Note: This is not the only method available for synchronization, but it is one possible method. Remember to replace <other_server> with the actual IP or FQDN of that server.

On the rsyslog server where you make the changes, create these two bash scripts and modify the <other_server> placeholder. Once you make a change to the rsyslog or Splunk UF configuration, run the appropriate script.

sync-rsyslog.sh



#!/bin/bash
# Push the local rsyslog config to the peer and restart its rsyslog service
scp /etc/rsyslog.conf <other_server>:/etc/rsyslog.conf
ssh <other_server> service rsyslog restart



sync-splunk.sh


#!/bin/bash
# Push the local UF inputs.conf to the peer and restart its forwarder
scp /opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/local/inputs.conf <other_server>:/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/local/inputs.conf
ssh <other_server> /opt/splunkforwarder/bin/splunk restart



Conclusion

In this article, we outlined key advanced features within rsyslog that may not be immediately evident. Hopefully this article will save you some Googling time when trying to operationalize log collection and forwarding using rsyslog in your environment. After all, eventually you will probably need to deal with file permissions, routing logs via regex and/or port, and configuration synchronization. We hope you enjoyed the article and found it useful.  Feel free to post your favorite tips and tricks in the comments section below. Happy Splunking!




Sunday, January 27, 2019

rsyslog fun - Basic Splunk Log Collection and Forwarding - Part I


By Tony Lee

We found it a bit surprising that there are so few articles on how to use an rsyslog server to forward logs to Splunk. This provided the motivation to write this article and hopefully save others some Googling. Choosing between rsyslog, syslog-ng, or other software is entirely up to the reader and may depend on their environment and approved/available software. We realize that this is not the only option for log architecture or collection, but it may help those faced with this task—especially if rsyslog is the standard in their environment.

Warnings

Before we jump in, we wanted to remind you of three potential gotchas that may thwart your success and give you a troubleshooting migraine.
  1. Network firewalls – You may not own this, but make sure the network path is clear
  2. iptables – Complex rule sets can throw you for a loop
  3. SELinux – Believe it or not, a locked-down SELinux policy can prevent the log files from being written

If something is not working the way you expect it to work, it is most likely due to one of the three items mentioned above. It could be worth temporarily disabling them until you get everything working. Just don’t forget to go back and lock it down.

Note:  We will also be using Splunk Universal Forwarders (UFs) in this article.  Universal Forwarders have very little pre-processing or filtering capability compared to Heavy Forwarders.  If significant filtering is necessary, consider using a Splunk Heavy Forwarder in the same fashion as we use the UFs below.

Architecture

Whether your Splunk instance is on-prem or it is in the cloud, you will most likely need syslog collectors and forwarders at some point. The architecture diagram below shows one potential configuration. The number of each component is configurable and dependent upon the volume of traffic.



Figure 1:  Architecture diagram illustrating traffic flow from data sources to the Index Cluster

Rsyslog configuration

Rsyslog is a flexible service, but in this case rsyslog’s primary role will be to:

  • Open the sockets to accept data from the sources
  • Properly route traffic to local temporary files that Splunk will forward on to the indexers

If you are fortunate enough to be able to route network traffic to different ports, you may be able to reduce the if-then logic shown below for routing the events to separate files. In this case, we were not able to open separate ports from the load balancer, thus we needed to do the routing on our end. In the next article we will cover more advanced routing to include regex and traffic coming in on different ports.

Note:  Modern rsyslog is designed to include extra config files that exist in the /etc/rsyslog.d/ directory. If that directory exists, place the following 15-splunk-rsyslog.conf file in that directory. Otherwise, the /etc/rsyslog.conf file is interpreted from top to bottom, so make a copy of your current config file (cp /etc/rsyslog.conf /etc/rsyslog.bak) and selectively add the following at the top of the new active rsyslog.conf file. This addition to the rsyslog configuration will do the following (assuming the date is 2018-06-01):
  • Open TCP and UDP 514
  • Write all data from 192.168.1.1 to:  /rsyslog/cisco/192.168.1.1/2018-06-01.log
  • Write all data from 192.168.1.2 to:  /rsyslog/cisco/192.168.1.2/2018-06-01.log
  • Write all data from 10.1.1.* to:  /rsyslog/pan/10.1.1.*/2018-06-01.log (where * is the last octet of the source IP)
  • Write all remaining data to /rsyslog/unclaimed/<host>/2018-06-01.log (where <host> is the source IP or hostname of the sender)
Note:  If the rsyslog server sees the hosts by their hostname instead of IP address, feel free to use $fromhost == '<hostname>' in the configuration file below.

/etc/rsyslog.d/15-splunk-rsyslog.conf


$ModLoad imtcp
$ModLoad imudp
$UDPServerRun 514
$InputTCPServerRun 514

# do this in FRONT of the local/regular rules

$template ciscoFile,"/rsyslog/cisco/%fromhost%/%$YEAR%-%$MONTH%-%$DAY%.log"
$template PANFile,"/rsyslog/pan/%fromhost%/%$YEAR%-%$MONTH%-%$DAY%.log"
$template unclaimedFile,"/rsyslog/unclaimed/%fromhost%/%$YEAR%-%$MONTH%-%$DAY%.log"

if ($fromhost-ip == '192.168.1.1' or $fromhost-ip == '192.168.1.2') then ?ciscoFile
& stop

if $fromhost-ip startswith '10.1.1' then ?PANFile
& stop

else ?unclaimedFile
& stop

# local/regular rules, like
*.* /var/log/syslog.log



Note:  Rsyslog should create directories that don't already exist, but just in case it doesn't, you need to create the directories and make them writable.  For example:


mkdir -p /rsyslog/cisco/
mkdir -p /rsyslog/pan/
mkdir -p /rsyslog/unclaimed/



Pro tip:  After making changes to the rsyslog config file, you can verify that there are no syntax errors BEFORE you restart the rsyslog daemon.  For a simple rsyslog config validation, try the following command:

rsyslogd -N 1

If there are no errors, then you should be good to restart the rsyslog service so your changes take effect:

service rsyslog restart

Log cleanup

The rsyslog servers in our setup are not intended to store the data permanently. They act as caching servers for temporary storage before shipping the logs off to the Splunk Indexers for proper long-term storage. Since disk space is not unlimited on these caching servers, we need to implement log rotation and deletion so we do not fill up the hard disk. Our rsyslog config file already takes care of log rotation by naming each file "%$YEAR%-%$MONTH%-%$DAY%.log"; however, we still need to clean up old files so they do not sit there indefinitely. One possible solution is a daily cron job that removes files in the /rsyslog/ directory that are more than x days old (where x is defined by the organization). Once you have some files in the /rsyslog/ directory, try the following command to see what would potentially be deleted. The command below lists files in the rsyslog directory that are older than two days.

find /rsyslog/ -type f -mtime +1 -exec ls -l "{}" \;

If you are happy with a two-day cache period, add it to a daily cron job (as shown below).  Otherwise feel free to play with the +1 until you are comfortable with what it will delete and use that for your cron job.

/etc/cron.daily/logdelete.sh


#!/bin/sh
find /rsyslog/ -type f -mtime +1 -delete



Splunk Universal Forwarder (UF) Configuration

Splunk Forwarders are very flexible in terms of data ingest. For example, they can create listening ports, monitor directories, run scripts, etc. In this case, since rsyslog is writing the information to a directory, we will use a Splunk UF to monitor those directories and send them to the appropriate indexes and label them with the appropriate sourcetypes.  See our example configuration below.

Note:  Make sure the indexes mentioned below exist prior to trying to send data there; these will need to be created within Splunk.  Also ensure that the UF is configured to forward data to the indexers (full output configuration is out of scope for this write-up, but a hedged outputs.conf sketch is provided below for reference).
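
For reference only, a minimal outputs.conf sketch is shown below. The indexer names and port are placeholders; adjust them (or manage the UF with a deployment server) to match your environment:

# /opt/splunkforwarder/etc/system/local/outputs.conf (hypothetical indexers)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer1.example.com:9997, indexer2.example.com:9997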

/opt/splunkforwarder/etc/apps/SplunkForwarder/local/inputs.conf 


[monitor:///rsyslog/cisco/]
whitelist = \.log$
host_segment=3
sourcetype = cisco:ios
index = cisco

[monitor:///rsyslog/pan/]
whitelist = \.log$
host_segment=3
sourcetype = pan:traffic
index = pan_logs

[monitor:///rsyslog/unclaimed/]
whitelist = \.log$
host_segment=3
sourcetype = syslog
index = lastchanceindex



Pro tip:  Remember to restart the Splunk UF after modifying files.  

/opt/splunkforwarder/bin/splunk restart

Conclusion

A simple Splunk search of index=cisco, index=pan_logs, or index=lastchanceindex should be able to confirm that you are now receiving data in Splunk. Keep monitoring the lastchanceindex to move hosts to where they need to go as they come on-line. Moving the hosts is accomplished by editing the rsyslog.conf file and possibly adding another monitor stanza within the Splunk UF config. This process can be challenging to create, but once it is going, it just needs a little care from time to time to make sure that all is well.  We hope you found this article helpful.  Happy Splunking!


Wednesday, January 9, 2019

Parsing and Displaying Infoblox DHCP Data in Splunk

By Tony Lee

This article builds on our Infoblox DNS article available at:  http://securitysynapse.com/2019/01/parsing-and-displaying-infoblox-dns-in-splunk.html

If you are reading this page, chances are good that you have both Splunk and Infoblox DHCP. While there is a pre-built TA (https://splunkbase.splunk.com/app/2934/) to help with the parsing, we needed some visualizations, so we wrote them and figured we would share what we created.


Figure 1:  At the time of writing this article, only a TA existed for Infoblox DHCP.

If you have this same situation, hopefully we can help you too. As a bonus, we will include the dashboard code at the end of the article.

Figure 2:  Dashboard that we include at the end of the article

Raw Log

This is what an Infoblox raw log might look like:

Sep 4 09:23:44 10.34.6.28 dhcpd[20310]: DHCPACK on 70.1.20.250 to fc:5c:fc:5f:10:85 via eth1 relay 10.120.20.66 lease-duration 600

Source:  https://docs.infoblox.com/display/NAG8/Using+a+Syslog+Server


Fields to Parse

Fortunately, our job is taken care of by the Infoblox TA (https://splunkbase.splunk.com/app/2934/)!  Just use the sourcetype of infoblox:dhcp to ensure it is properly parsed.
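
For example, if the DHCP syslog data is being written to disk by an rsyslog tier (as in our earlier rsyslog series), a hedged Universal Forwarder monitor stanza might assign the TA's sourcetype like this. The monitor path and index name are placeholders; adjust them to your environment:

[monitor:///rsyslog/infoblox-dhcp/]
whitelist = \.log$
host_segment = 3
sourcetype = infoblox:dhcp
index = infoblox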

Search String

Now that the data is parsed, we can use the following to table the data:

index=infoblox sourcetype="infoblox:dhcp" | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay
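
If you prefer a trend view over a table, a hedged variation of the same search also works well as a timechart panel:

index=infoblox sourcetype="infoblox:dhcp" | timechart count by action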

Combine a few panels together and we will have a dashboard similar to the one in the dashboard code section at the bottom of the article.

Conclusion

Even though we only had a Splunk TA (and not an app to go with it), we used the flexibility provided within Splunk to gain insight into Infoblox DHCP logs. We hope this article helps others save time. Feel free to leave comments in the section below. Happy Splunking!

Dashboard Code

The following dashboard assumes that the appropriate logs are being collected and sent to Splunk. Additionally, the dashboard code assumes an index of infoblox. Feel free to adjust as necessary. Splunk dashboard code provided below:


<form>
  <label>Infoblox DHCP</label>
  <description>This is a high volume data feed - Be mindful of your time range</description>
  <fieldset submitButton="true">
    <input type="time" token="time" searchWhenChanged="true">
      <label>Time Range</label>
      <default>
        <earliest>-4h@h</earliest>
        <latest>now</latest>
      </default>
    </input>
    <input type="text" token="wild" searchWhenChanged="true">
      <label>Wildcard Search</label>
      <default>*</default>
    </input>
  </fieldset>
  <row>
    <panel>
      <table>
        <title>Total DHCP Traffic by Infoblox Host</title>
        <search>
          <query>| tstats count where index=infoblox, sourcetype="infoblox:dhcp" by host</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top Action</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 action</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top signature</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 signature</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top Servicing Host</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 src_hostname</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top src_ip</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 src_ip</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top dest_ip</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ |  table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 dest_ip</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <table>
        <title>Raw Logs</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="refresh.display">progressbar</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
  </row>
</form>




Friday, January 4, 2019

Parsing and Displaying Infoblox DNS Data in Splunk

By Tony Lee

If you are reading this page, chances are good that you have both Splunk and Infoblox DNS. While there is a pre-built TA (https://splunkbase.splunk.com/app/2934/) to help with the parsing, we needed some visualizations, so we wrote them and figured we would share what we created.


Figure 1:  At the time of writing this article, only a TA existed for Infoblox DNS.

If you have this same situation, hopefully we can help you too. As a bonus, we will include the dashboard code at the end of the article.

Figure 2:  Dashboard that we include at the end of the article

Raw Logs


DNS Query

This is what an Infoblox query might look like:

30-Apr-2013 13:35:02.187 client 10.120.20.32#42386: query: foo.com IN A + (100.90.80.102)


The fields are the following:

<dd-mmm-YYYY HH:MM:SS.uuu> <client IP>#<port> query: <query_Domain name> <class name> <type name> <- or +>[SETDC] <(name server ip)>

where
+ = recursion 
- = no recursion 
S = TSIG 
E = EDNS option set 
T = TCP query 
D = EDNS ‘DO’ flag set 
C = ‘CD’ message flag set



DNS Response

This is what an Infoblox response might look like for an A record query:

07-Apr-2013 20:16:49.083 client 10.120.20.198#57398 UDP: query: a2.foo.com IN A response: NOERROR +AED a2.foo.com. 28800 IN A 1.1.1.2;

Where the fields are the following:

<dd-mmm-YYYY HH:MM:SS.uuu> client <client ip>#port <UDP or TCP>: [view: DNS view] query: <queried domain name> <class name> <type name> response: <rcode> <flags> [<RR in text format>; [<RR in text format>;] ...]

Flags = <- or +>[ATEDVL]

where

- = recursion not available
+ = recursion available (from DNS message header)
A = authoritative answer (from DNS message header)
t = truncated response (from DNS message header)
E = EDNS OPT record present (from DNS message header)
D = DNSSEC OK (from EDNS OPT RR)
V = responding server has validated DNSSEC records
L = response contains DTC synthetic record 

Source:  https://docs.infoblox.com/display/NAG8/Capturing+DNS+Queries+and+Responses


Fields to Parse

Unfortunately, the Infoblox TA (https://splunkbase.splunk.com/app/2934/) does not seem to parse all the fields, but it might get you relatively close.  Just use the sourcetype of infoblox:dns.
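
If you need a field the TA does not extract, a search-time rex can fill the gap. The sketch below is hedged: it is written against the sample query line shown earlier and uses our own hypothetical field names, not the TA's:

index=infoblox sourcetype="infoblox:dns" "query:"
| rex "client (?<client_ip>\d+\.\d+\.\d+\.\d+)#(?<client_port>\d+):? query: (?<queried_domain>\S+) IN (?<rr_type>\S+) (?<query_flags>\S+) \((?<name_server_ip>[^)]+)\)"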

Search String

Now that the data is somewhat parsed, we can use the following to table the data:

index=infoblox sourcetype="infoblox:dns" | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port, dns_request_name_serverIP, named_message
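
For a quick trend view, a hedged variation by record type also makes a useful panel:

index=infoblox sourcetype="infoblox:dns" | timechart count by record_type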

Combine a few panels together and we will have a dashboard similar to the one in the dashboard code section at the bottom of the article.

Conclusion

Even though we only had a Splunk TA (and not an app to go with it), we used the flexibility provided within Splunk to gain insight into Infoblox DNS logs. We hope this article helps others save time. Feel free to leave comments in the section below. Happy Splunking!

Dashboard Code

The following dashboard assumes that the appropriate logs are being collected and sent to Splunk. Additionally, the dashboard code assumes an index of infoblox and a sourcetype of infoblox:dns. Feel free to adjust as necessary. Splunk dashboard code provided below:


<form>
  <label>Infoblox DNS</label>
  <description>This is a high volume data feed - Be mindful of your time range</description>
  <fieldset submitButton="true">
    <input type="time" token="time" searchWhenChanged="true">
      <label>Time Range</label>
      <default>
        <earliest>-15m</earliest>
        <latest>now</latest>
      </default>
    </input>
    <input type="text" token="wild" searchWhenChanged="true">
      <label>Wildcard Search</label>
      <default>*</default>
    </input>
  </fieldset>
  <row>
    <panel>
      <table>
        <title>Total DNS Traffic by Infoblox Host</title>
        <search>
          <query>| tstats count where index=infoblox, sourcetype="infoblox:dns" by host</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="drilldown">none</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top dns_request_client_ip</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 dns_request_client_ip</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top message_type</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 message_type</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top record_type</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 record_type</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top query</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 query</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <table>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="refresh.display">progressbar</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
  </row>
</form>