Showing posts with label Splunk App. Show all posts

Monday, July 1, 2019

Quick and Flexible IOC Hunting in Splunk

By Tony Lee and Arjun Mathew

Imagine that you are battling a known threat actor.  You have gathered indicators of compromise (IOCs) from reversing malware as well as helpful contributions from the rest of the security community.  But how can you quickly search your existing data for those IOCs in order to track attacker movement in real time?  Here is one possible solution:
  1. Use a lookup file
  2. Clever Splunk search
  3. Even more clever dashboard
This article will outline the process and even share an example dashboard (shown in the screenshot below).

Figure 1:  Known IOC Dashboard provided at the end of the article

Lookup File

We used the following process to create a lookup file and definition.  Create a file in Excel and save it as a CSV called known_iocs.csv (similar to the file below).

Figure 2:  CSV that we initially populated with our IOCs
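As a rough sketch, the same starter file could also be created from a shell instead of Excel. The column names (Hash, FileName, Domain, IP) are the ones referenced by the searches later in this article; the rows are placeholder values rather than real IOCs, and the output path is an assumption.

```shell
#!/bin/sh
# Sketch: build a starter known_iocs.csv without Excel.
# Column names come from the searches in this article; every row below is a
# placeholder value for illustration only, not a real indicator.
IOC_FILE="${IOC_FILE:-./known_iocs.csv}"   # hypothetical output path

cat > "$IOC_FILE" <<'EOF'
Hash,FileName,Domain,IP
d41d8cd98f00b204e9800998ecf8427e,placeholder.exe,bad.example.com,192.0.2.10
0cc175b9c0f1b6a831c399e269772661,dropper.dll,c2.example.net,198.51.100.7
EOF

# Sanity check: every row should have the same number of columns as the header.
awk -F',' 'NR == 1 { n = NF } NF != n { bad++ } END { exit (bad > 0) }' "$IOC_FILE" \
  && echo "wrote $IOC_FILE"
```

The resulting file can then be uploaded through the Settings menus described next.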

Then within Splunk, navigate to the following to create the lookup and the definition:

Settings > Lookups > Lookup table files > Add new

  • Destination app:  Select the app
  • Upload a lookup file:  known_iocs.csv
  • Destination filename:  known_iocs.csv


Settings > Lookups > Lookup definitions > Add new

  • Destination app:  Select the app
  • Name:  known_iocs.csv
  • Type:  File-based
  • Lookup file:  known_iocs.csv


Now, here is the problem.  How do you scale this solution to a group effort to update a lookup table with IOCs?  It does not work well to pass the CSV around and then constantly upload.  Enter another graceful solution from Luke Murphey -- The Lookup File Editor Splunk App (https://splunkbase.splunk.com/app/1724/).

Figure 3:  Lookup File Editor App from Luke Murphey

Once the Lookup File Editor Splunk App is installed, navigate to it and search for your known_iocs.csv file.  Open it, right click on the bottom line, and select "Insert a new row".  You can edit the lookup file right in Splunk.  Once it is saved, the correlation searches will automatically run with the new IOC data.


Figure 4:  Inserting a new line to our known_iocs.csv file

Clever Search

Now that we have a lookup table that has our IOCs in it and a convenient way to edit it, we just need a search that will apply the IOCs to our data.  The example below applies the IOCs to the cylance_protect index, but feel free to change the index name as needed.  Additionally, we show how to search just one column of the IOC data as well as multiple columns.


One type of IOC (Hash):

index=cylance_protect [|inputlookup known_iocs.csv | rename Hash as query | table query] | stats count


Two types of IOCs (Hash & FileName)

index=cylance_protect [|inputlookup known_iocs.csv | rename Hash as query | table query] OR [|inputlookup known_iocs.csv | rename FileName as query | table query] | stats count

Note the OR statement between the two inputlookups -- needed when querying multiple columns.
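To see why the rename to query matters: as far as we understand Splunk's subsearch handling, a field named query is handed back to the outer search as bare search terms rather than as field=value pairs, so each IOC is matched anywhere in the event. The shell sketch below only approximates that expansion outside of Splunk (the exact string Splunk generates may differ). It uses a small inline stand-in for known_iocs.csv with placeholder values, and expand_column is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: approximate the search terms the subsearch contributes for one column.
# This imitates the idea outside Splunk; it is not Splunk's own expansion logic.

# expand_column FILE COLUMN -> prints ( "v1" OR "v2" ... ) for that CSV column
expand_column() {
  awk -F',' -v col="$2" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    c && $c != "" {
      sep = (terms == "") ? "" : " OR "
      terms = terms sep "\"" $c "\""
    }
    END { if (terms != "") print "( " terms " )" }
  ' "$1"
}

# Inline stand-in for the lookup file (placeholder values only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Hash,FileName
0cc175b9c0f1b6a831c399e269772661,placeholder.exe
d41d8cd98f00b204e9800998ecf8427e,dropper.dll
EOF

hash_terms=$(expand_column "$tmp" Hash)
file_terms=$(expand_column "$tmp" FileName)
rm -f "$tmp"

# Two columns are joined with OR, mirroring the two-subsearch query above.
echo "index=cylance_protect $hash_terms OR $file_terms | stats count"
```

Without the OR, the two groups of terms would be ANDed together, and an event would need to contain both a known hash and a known file name to count as a hit.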


Figure 5:  What will be our top panels showing a count of the hits


Even More Clever Dashboard

Now that we have functional searches, we need a dashboard to monitor our different data feeds such as:
  • Proxy
  • Firewalls
  • DNS
  • Antivirus Hits
  • Email Protection
  • Windows Event Logs

You can see in the screenshot below that we use Single Value panels on the top row.  Each of these panels contains a dynamic drilldown to populate the panel below it with the contents of the Single Value panel when clicked.


Figure 6:  Dashboard displayed at the start of the article and in the Sample Dashboard section below

The drilldown for each Single Value panel sets a token which is essentially the search, but without the stats count (feel free to table the data as needed):

        <drilldown>
          <set token="alert">index=proxy $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | table _raw</set>
        </drilldown>



Then the bottom panel is just a search of the token set in the drilldown above.

        <search>
          <query>| search $alert$</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>

Conclusion

Using a clever combination of features that already exist within Splunk (for the most part), we were able to create a quick method to update an IOC list and apply it against existing data within Splunk. Simply monitor these dashboards and use them to track the attacker's activities in real time.


Sample Dashboard

The sample dashboard below uses a number of indexes to search over different data feeds.  Just change these indexes to the ones you are interested in monitoring.


<form>
  <label>Known IOC Hits</label>
  <description>Threat Actor</description>
  <fieldset submitButton="true">
    <input type="time" searchWhenChanged="true" token="time">
      <label>Time Range</label>
      <default>
        <earliest>-24h@h</earliest>
        <latest>now</latest>
      </default>
    </input>
    <input type="text" searchWhenChanged="true" token="wild">
      <label>Wildcard Search</label>
      <default>*</default>
    </input>
  </fieldset>
  <row>
    <panel>
      <single>
        <title>Proxy</title>
        <search>
          <query>index=proxy $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | stats count</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="colorMode">none</option>
        <option name="drilldown">all</option>
        <option name="rangeColors">["0x65a637","0xd93f3c"]</option>
        <option name="rangeValues">[0]</option>
        <option name="useColors">1</option>
        <drilldown>
          <set token="alert">index=proxy $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | table _raw</set>
        </drilldown>
      </single>
    </panel>
    <panel>
      <single>
        <title>Firewalls</title>
        <search>
          <query>index=firewalls $wild$ [|inputlookup known_iocs.csv | rename IP as query | table query] | stats count</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">all</option>
        <option name="rangeColors">["0x65a637","0xd93f3c"]</option>
        <option name="rangeValues">[0]</option>
        <option name="refresh.display">progressbar</option>
        <option name="useColors">1</option>
        <drilldown>
          <set token="alert">index=firewalls $wild$ [|inputlookup known_iocs.csv | rename IP as query | table query] | table _raw</set>
        </drilldown>
      </single>
    </panel>
    <panel>
      <single>
        <title>DNS</title>
        <search>
          <query>index=dns $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | stats count</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">all</option>
        <option name="rangeColors">["0x65a637","0xd93f3c"]</option>
        <option name="rangeValues">[0]</option>
        <option name="useColors">1</option>
        <drilldown>
          <set token="alert">index=dns $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | table _raw</set>
        </drilldown>
      </single>
    </panel>
    <panel>
      <single>
        <title>Antivirus Hits</title>
        <search>
          <query>index=av $wild$ [|inputlookup known_iocs.csv | rename Hash as query | table query] | stats count</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">all</option>
        <option name="rangeColors">["0x65a637","0xd93f3c"]</option>
        <option name="rangeValues">[0]</option>
        <option name="useColors">1</option>
        <drilldown>
          <set token="alert">index=av $wild$ [|inputlookup known_iocs.csv | rename Hash as query | table query] | table _raw</set>
        </drilldown>
      </single>
    </panel>
    <panel>
      <single>
        <title>Email Protection</title>
        <search>
          <query>index=mail_protection $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | stats count</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">all</option>
        <option name="rangeColors">["0x65a637","0xd93f3c"]</option>
        <option name="rangeValues">[0]</option>
        <option name="useColors">1</option>
        <drilldown>
          <set token="alert">index=mail_protection $wild$ [|inputlookup known_iocs.csv | rename Domain as query | table query] | table _raw</set>
        </drilldown>
      </single>
    </panel>
  </row>
  <row>
    <panel>
      <title>Information Table (Click one of the numbers above to populate this table with Details)</title>
      <table>
        <search>
          <query>| search $alert$</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
      </table>
    </panel>
  </row>
</form>

Wednesday, January 9, 2019

Parsing and Displaying Infoblox DHCP Data in Splunk

By Tony Lee

This article builds on our Infoblox DNS article available at:  http://securitysynapse.com/2019/01/parsing-and-displaying-infoblox-dns-in-splunk.html

If you are reading this page, chances are good that you have both Splunk and Infoblox DHCP. While there is a pre-built TA (https://splunkbase.splunk.com/app/2934/) to help with the parsing, we needed some visualizations, so we wrote them and figured we would share what we created.


Figure 1:  At the time of writing this article, only a TA existed for Infoblox DHCP.

If you have this same situation, hopefully we can help you too. As a bonus, we will include the dashboard code at the end of the article.

Figure 2:  Dashboard that we include at the end of the article

Raw Log

This is what an Infoblox raw log might look like:

Sep 4 09:23:44 10.34.6.28 dhcpd[20310]: DHCPACK on 70.1.20.250 to fc:5c:fc:5f:10:85 via eth1 relay 10.120.20.66 lease-duration 600

Source:  https://docs.infoblox.com/display/NAG8/Using+a+Syslog+Server
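To make the layout of that sample line concrete, here is a minimal shell sketch that splits it into labelled pieces by position. The labels are illustrative only and are not necessarily the field names the TA produces; parse_dhcpack is a hypothetical helper, and other DHCP message types use different layouts that this sketch does not handle.

```shell
#!/bin/sh
# Sketch: split the sample DHCPACK line into labelled pieces by position.
# The labels are illustrative; the TA's own field names may differ.
parse_dhcpack() {
  printf '%s\n' "$1" | awk '
    $6 == "DHCPACK" && $7 == "on" && $9 == "to" {
      printf "host=%s action=%s ip=%s mac=%s", $4, $6, $8, $10
      # Remaining tokens come in "name value" pairs (via, relay, lease-duration).
      for (i = 11; i < NF; i += 2) printf " %s=%s", $i, $(i + 1)
      printf "\n"
    }'
}

parse_dhcpack 'Sep 4 09:23:44 10.34.6.28 dhcpd[20310]: DHCPACK on 70.1.20.250 to fc:5c:fc:5f:10:85 via eth1 relay 10.120.20.66 lease-duration 600'
```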


Fields to Parse

Fortunately, our job is taken care of by the Infoblox TA (https://splunkbase.splunk.com/app/2934/)!  Just use the sourcetype of infoblox:dhcp to ensure it is properly parsed.

Search String

Now that the data is parsed, we can use the following to table the data:

index=infoblox sourcetype="infoblox:dhcp" | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay

Combine a few panels together and we will have a dashboard similar to the one in the dashboard code section at the bottom of the article.

Conclusion

Even though we only had a Splunk TA (and not an app to go with it), we used the flexibility provided within Splunk to gain insight into Infoblox DHCP logs. We hope this article helps others save time. Feel free to leave comments in the section below. Happy Splunking!

Dashboard Code

The following dashboard assumes that the appropriate logs are being collected and sent to Splunk. Additionally, the dashboard code assumes an index of infoblox. Feel free to adjust as necessary. Splunk dashboard code provided below:


<form>
  <label>Infoblox DHCP</label>
  <description>This is a high volume data feed - Be mindful of your time range</description>
  <fieldset submitButton="true">
    <input type="time" token="time" searchWhenChanged="true">
      <label>Time Range</label>
      <default>
        <earliest>-4h@h</earliest>
        <latest>now</latest>
      </default>
    </input>
    <input type="text" token="wild" searchWhenChanged="true">
      <label>Wildcard Search</label>
      <default>*</default>
    </input>
  </fieldset>
  <row>
    <panel>
      <table>
        <title>Total DHCP Traffic by Infoblox Host</title>
        <search>
          <query>| tstats count where index=infoblox, sourcetype="infoblox:dhcp" by host</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top Action</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 action</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top signature</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 signature</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top Servicing Host</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 src_hostname</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top src_ip</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 src_ip</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top dest_ip</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ |  table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay | top limit=0 dest_ip</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <table>
        <title>Raw Logs</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dhcp" $wild$ | table _time, host, action, signature, src_category, src_hostname, src_ip, src_mac, dest_category, dest_hostname, dest_ip, relay</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="refresh.display">progressbar</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
  </row>
</form>




Friday, January 4, 2019

Parsing and Displaying Infoblox DNS Data in Splunk

By Tony Lee

If you are reading this page, chances are good that you have both Splunk and Infoblox DNS. While there is a pre-built TA (https://splunkbase.splunk.com/app/2934/) to help with the parsing, we needed some visualizations, so we wrote them and figured we would share what we created.


Figure 1:  At the time of writing this article, only a TA existed for Infoblox DNS.

If you have this same situation, hopefully we can help you too. As a bonus, we will include the dashboard code at the end of the article.

Figure 2:  Dashboard that we include at the end of the article

Raw Logs


DNS Query

This is what an Infoblox query might look like:

30-Apr-2013 13:35:02.187 client 10.120.20.32#42386: query: foo.com IN A + (100.90.80.102)


The fields are the following:

<dd-mmm-YYYY HH:MM:SS.uuu> <client IP>#<port> query: <query_Domain name> <class name> <type name> <- or +>[SETDC] <(name server ip)>

where
+ = recursion 
- = no recursion 
S = TSIG 
E = EDNS option set 
T = TCP query 
D = EDNS ‘DO’ flag set 
C = ‘CD’ message flag set
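As an illustration of the query format above, the following shell sketch pulls the pieces out of the sample line. The output labels are made up for readability and are not the TA's field names; parse_dns_query is a hypothetical helper that assumes the exact token order shown in the format line.

```shell
#!/bin/sh
# Sketch: break the sample Infoblox DNS query line into its documented parts.
# Labels are illustrative only and do not reflect the TA's field names.
parse_dns_query() {
  printf '%s\n' "$1" | awk '
    $3 == "client" && $5 == "query:" {
      split($4, cp, "#")          # "<client IP>#<port>:" -> IP and port
      sub(/:$/, "", cp[2])        # drop the trailing colon after the port
      ns = $10
      gsub(/[()]/, "", ns)        # name server IP is wrapped in parentheses
      printf "client_ip=%s client_port=%s query=%s class=%s type=%s flags=%s name_server=%s\n",
             cp[1], cp[2], $6, $7, $8, $9, ns
    }'
}

parse_dns_query '30-Apr-2013 13:35:02.187 client 10.120.20.32#42386: query: foo.com IN A + (100.90.80.102)'
```

For the sample line the flags value is just +, meaning recursion was requested with none of the optional S, E, T, D, or C flags set.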



DNS Response

This is what an Infoblox response might look like for an A record query:

07-Apr-2013 20:16:49.083 client 10.120.20.198#57398 UDP: query: a2.foo.com IN A response: NOERROR +AED a2.foo.com. 28800 IN A 1.1.1.2;

Where the fields are the following:

<dd-mmm-YYYY HH:MM:SS.uuu> client <client ip>#port <UDP or TCP>: [view: DNS view] query: <queried domain name> <class name> <type name> response: <rcode> <flags> [<RR in text format>; [<RR in text format>;] ...]

Flags = <- or +>[ATEDVL]

where

- = recursion not available
+ = recursion available (from DNS message header)
A = authoritative answer (from DNS message header)
t = truncated response (from DNS message header)
E = EDNS OPT record present (from DNS message header)
D = DNSSEC OK (from EDNS OPT RR)
V = responding server has validated DNSSEC records
L = response contains DTC synthetic record 

Source:  https://docs.infoblox.com/display/NAG8/Capturing+DNS+Queries+and+Responses


Fields to Parse

Unfortunately, the Infoblox TA (https://splunkbase.splunk.com/app/2934/) does not seem to parse all the fields, but it might get you relatively close.  Just use the sourcetype of infoblox:dns.

Search String

Now that the data is somewhat parsed, we can use the following to table the data:

index=infoblox sourcetype="infoblox:dns" | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 dns_request_client_ip

Combine a few panels together and we will have a dashboard similar to the one in the dashboard code section at the bottom of the article.

Conclusion

Even though we only had a Splunk TA (and not an app to go with it), we used the flexibility provided within Splunk to gain insight into Infoblox DNS logs. We hope this article helps others save time. Feel free to leave comments in the section below. Happy Splunking!

Dashboard Code

The following dashboard assumes that the appropriate logs are being collected and sent to Splunk. Additionally, the dashboard code assumes an index of infoblox and a sourcetype of infoblox:dns. Feel free to adjust as necessary. Splunk dashboard code provided below:


<form>
  <label>Infoblox DNS</label>
  <description>This is a high volume data feed - Be mindful of your time range</description>
  <fieldset submitButton="true">
    <input type="time" token="time" searchWhenChanged="true">
      <label>Time Range</label>
      <default>
        <earliest>-15m</earliest>
        <latest>now</latest>
      </default>
    </input>
    <input type="text" token="wild" searchWhenChanged="true">
      <label>Wildcard Search</label>
      <default>*</default>
    </input>
  </fieldset>
  <row>
    <panel>
      <table>
        <title>Total DNS Traffic by Infoblox Host</title>
        <search>
          <query>| tstats count where index=infoblox, sourcetype="infoblox:dns" by host</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="drilldown">none</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top dns_request_client_ip</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 dns_request_client_ip</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top message_type</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 message_type</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top record_type</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 record_type</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Top query</title>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message | top limit=0 query</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="drilldown">cell</option>
        <option name="refresh.display">progressbar</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <table>
        <search>
          <query>index=infoblox sourcetype="infoblox:dns" $wild$ | table _time, host, message_type, record_type, query, dns_request_client_ip, dns_request_client_port,  dns_request_name_serverIP, named_message</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="refresh.display">progressbar</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
  </row>
</form>



Wednesday, September 13, 2017

Splunk Technology Add-on (TA) Creation Script

By Tony Lee


Introduction

If you develop a Splunk application, at some point you may find yourself needing a Technology Add-on (TA) to accompany the app. Essentially, the TA reuses many of the app's files, except for the user interface (UI/views). TAs are typically installed on indexers and heavy forwarders to process incoming data. Splunk briefly covers the difference between an app and an add-on in the link below:

https://docs.splunk.com/Documentation/Splunk/6.6.3/Admin/Whatsanapp

Maintaining two codebases can be time consuming though. Instead, it is possible to develop one application and extract the necessary components to build a TA. There may be other solutions such as the Splunk Add-on Builder (https://splunkbase.splunk.com/app/2962/), but I found the script below to be one of the easiest methods.

Approach

This could be written in any language; however, my development environment is Linux-based, so the quickest and easiest solution was to write the script using bash. Feel free to translate it to another language if needed though.

Usage

Usage is simple.  Just supply the name of the application and it will create the TA from the existing app.

The app should be located here (if not, change the APP_HOME variable in the script):

/opt/splunk/etc/apps/<AppName>

Copy and paste the bash shell script (Create-TA.sh) below to the /tmp directory and make it executable:

chmod +x /tmp/Create-TA.sh

Then run the script from the /tmp directory and supply the application name:

./Create-TA.sh <AppName>

Ex:  ./Create-TA.sh cylance_protect

Once complete, the TA will be located here:  /tmp/TA-<AppName>.spl
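Because the script packages the TA with tar -zcf, the resulting .spl is a gzip-compressed tarball and could be inspected before it is installed anywhere. A minimal sketch, using a throwaway directory and a stand-in app name (demo_app is hypothetical) so that it runs without a Splunk install:

```shell
#!/bin/sh
# Sketch: build a tiny stand-in TA archive and list its contents, the same way
# a real /tmp/TA-<AppName>.spl could be checked. demo_app is a made-up name.
work=$(mktemp -d)
mkdir -p "$work/TA-demo_app/default" "$work/TA-demo_app/metadata"
printf '[ui]\nis_visible = false\n' > "$work/TA-demo_app/default/app.conf"
: > "$work/TA-demo_app/metadata/default.meta"

# Same packaging step as the script: gzip-compressed tar with a .spl extension.
(cd "$work" && tar -zcf TA-demo_app.spl TA-demo_app)

# List the archive without extracting it.
listing=$(tar -tzf "$work/TA-demo_app.spl" | sort)
echo "$listing"
rm -rf "$work"
```

Against a real package, the equivalent check would be tar -tzf /tmp/TA-<AppName>.spl.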

Code

#!/bin/bash
# Create-TA
# anlee2 - at - vt.edu
# TA Creation tool written in bash
# Input:  App name   (ex: cylance_protect)
# Output: /tmp/TA-<app name>.spl

# Path to the Splunk app home.  Change if this is not accurate.
APP_HOME="/opt/splunk/etc/apps"


##### Function Usage #####
# Prints usage statement
##########################
Usage()
{
echo "Create-TA v1.0
Usage:  Create-TA.sh <App name>

  -h = help menu

Please report bugs to anlee2@vt.edu"
}


# Detect the absence of command line parameters.  If the user did not specify any, print usage statement
[[ $# -eq 0 || $1 == "-h" ]] && { Usage; exit 0; }

# Set the app name and TA name based on user input
APP_NAME=$1
TA_NAME="TA-$1"

echo -e "\nApp name is:  $APP_NAME\n"


echo -e "Creating directory structure under /tmp/$TA_NAME\n"
mkdir -p /tmp/$TA_NAME/default /tmp/$TA_NAME/metadata /tmp/$TA_NAME/lookups /tmp/$TA_NAME/static /tmp/$TA_NAME/appserver/static


echo -e "Copying files...\n"
cp $APP_HOME/$APP_NAME/default/eventtypes.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/app.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/props.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/tags.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/transforms.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/static/appIcon.png  /tmp/$TA_NAME/static/appicon.png 2>/dev/null
cp $APP_HOME/$APP_NAME/static/appIcon.png  /tmp/$TA_NAME/appserver/static/appicon.png 2>/dev/null
cp $APP_HOME/$APP_NAME/README /tmp/$TA_NAME/ 2>/dev/null
cp $APP_HOME/$APP_NAME/lookups/* /tmp/$TA_NAME/lookups/ 2>/dev/null

echo -e "Modifying app.conf...\n"
sed -i "s/$APP_NAME/$TA_NAME/g" /tmp/$TA_NAME/default/app.conf
sed -i "s/is_visible = .*/is_visible = false/g" /tmp/$TA_NAME/default/app.conf
sed -i "s/description = .*/description = TA for $APP_NAME./g" /tmp/$TA_NAME/default/app.conf
sed -i "s/label = .*/label = TA for $APP_NAME./g" /tmp/$TA_NAME/default/app.conf


echo -e "Creating default.meta...\n"
cat >/tmp/$TA_NAME/metadata/default.meta <<EOL
# Application-level permissions
[]
access = read : [ * ], write : [ admin, power ]
export = system

### EVENT TYPES
[eventtypes]
export = system

### PROPS
[props]
export = system

### TRANSFORMS
[transforms]
export = system

### LOOKUPS
[lookups]
export = system

### VIEWSTATES: even normal users should be able to create shared viewstates
[viewstates]
access = read : [ * ], write : [ * ]
export = system
EOL

cd /tmp; tar -zcf TA-$APP_NAME.spl $TA_NAME


echo -e "Finished.\n\nPlease check for your file here:  /tmp/$TA_NAME.spl"

Conclusion

Hopefully this helps others save some time by maintaining one application and extracting the necessary data to create the technology add-on.

Props

Huge thanks to Mike McGinnis for testing and feedback.  :-)

Monday, April 24, 2017

Efficient Blue Coat (and other) Splunk Log Parsing

By Tony Lee

Special Notes

1)  This blog post does not only pertain to Blue Coat logs, but possibly other data sources as well.
2)  This is not a knock on Blue Coat, the app, the TA, or any of that; it is just one example of many where we might want to change the way we send data to Splunk.  Fortunately Blue Coat provides the means to do so.  (hat tip)

Background info

A little while back, we were working on a custom Splunk app that included ingesting Blue Coat logs into a SOC's single pane of glass, but we were getting an error message of:

Field extractor name=custom_client_events is unusually slow (max single event time=1146ms)

The Splunk architecture was more than sufficient.  The Blue Coat TA worked great on small instances, but we found that it did not scale to a Blue Coat deployment of this magnitude.  The main reason for this error was the parsing in transforms.conf looked like this:

[custom_client_events]
REGEX = (?<date>[^\s]+)\s+(?<time>[^\s]+)\s+(?<duration>[^\s]+)\s+(?<src_ip>[^\s]+)\s+(?<user>[^\s]+)\s+(?<cs_auth_group>[^\s]+)\s+(?<x_exception_id>[^\s]+)\s+(?<filter_result>[^\s]+)\s+\"(?<category>[^\"]+)\"\s+(?<http_referrer>[^\s]+)\s+(?<status>[^\s]+)\s+(?<action>[^\s]+)\s+(?<http_method>[^\s]+)\s+(?<http_content_type>[^\s]+)\s+(?<cs_uri_scheme>[^\s]+)\s+(?<dest>[^\s]+)\s+(?<uri_port>[^\s]+)\s+(?<uri_path>[^\s]+)\s+(?<uri_query>[^\s]+)\s+(?<uri_extension>[^\s]+)\s+\"(?<http_user_agent>[^\"]+)\"\s+(?<dest_ip>[^\s]+)\s+(?<bytes_in>[^\s]+)\s+(?<bytes_out>[^\s]+)\s+\"*(?<x_virus_id>[^\"]+)\"*\s+\"*(?<x_bluecoat_application_name>[^\"]+)\"*\s+\"*(?<x_bluecoat_application_operation>[^\"]+)

The richness and volume of the data were simply too much for this type of extraction, which runs one long regex with 27 capture groups against every event.

Solution

The solution is not to make Splunk adapt, but instead to change the way data is sent to it. The Blue Coat app and TA require sending data in the bcreportermain_v1 format, which is an ELFF format. The Blue Coat app and TA then try to parse this space-separated data using the complex regex seen above. Fortunately, you can instead instruct Blue Coat to send the data in a different format, such as key=value pairs, which Splunk parses natively.

In this case, have the Blue Coat admins define a custom log format with the following fields:

Bluecoat|date=$(date)|time=$(time)|duration=$(time-taken)|src_ip=$(c-ip)|user=$(cs-username)|cs_auth_group=$(cs-auth-group)|x_exception_id=$(x-exception-id)|filter_result=$(sc-filter-result)|category=$(cs-categories)|http_referrer=$(cs(Referer))|status=$(sc-status)|action=$(s-action)|http_method=$(cs-method)|http_content_type=$(rs(Content-Type))|cs_uri_scheme=$(cs-uri-scheme)|dest=$(cs-host)|uri_port=$(cs-uri-port)|uri_path=$(cs-uri-path)|uri_query=$(cs-uri-query)|uri_extension=$(cs-uri-extension)|http_user_agent=$(cs(User-Agent))|dest_ip=$(s-ip)|bytes_in=$(sc-bytes)|bytes_out=$(cs-bytes)|x_virus_id=$(x-virus-id)|x_bluecoat_application_name=$(x-bluecoat-application-name)|x_bluecoat_application_operation=$(x-bluecoat-application-operation)|target_ip=$(cs-ip)|proxy_name=$(x-bluecoat-appliance-name)|proxy_ip=$(x-bluecoat-proxy-primary-address)|$(x-bluecoat-special-crlf)

Since this data now comes into Splunk as key=value pairs, Splunk parses it natively.
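To illustrate why this layout is so much cheaper to handle than the positional regex, here is a small shell sketch that splits one such event into its pairs. It only imitates the idea of key=value extraction and is not Splunk's parser; the sample event is invented, uses just a few of the fields from the format above, and kv_split is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: split a pipe-delimited key=value event into one pair per line.
# Only imitates the idea of key=value extraction; it is not Splunk's parser.
kv_split() {
  printf '%s\n' "$1" | tr '|' '\n' | awk -F'=' 'NF >= 2 {
    key = $1
    val = substr($0, length($1) + 2)   # keep any "=" that appears in the value
    printf "%s -> %s\n", key, val
  }'
}

# Invented sample event with a handful of the fields from the custom format.
event='Bluecoat|date=2017-04-24|src_ip=192.0.2.15|status=200|dest=www.example.com|uri_query=?a=1'
pairs=$(kv_split "$event")
echo "$pairs"
```

Each field is located by its own name rather than by its position in the line, so no single long pattern has to match the whole event.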

We just removed the TAs from the indexer and replaced them with a simpler props.conf file containing this:

[bluecoat:proxysg:customclient]
SHOULD_LINEMERGE = false

This just turns off line merging, which is on by default, and makes the parsing even faster. Also remember to rename the props.conf and transforms.conf (ex: to .bak files) included in the app if you have it installed on your search head, because they contain the same complicated regex, which will slow down data ingestion. Lastly, by defining your own format, you can add other fields you care about, such as the target IP (cs-ip), which is not included in the default bcreportermain_v1 format for some reason. We hope this helps others that run into this situation.

Conclusion

Again, this issue is not isolated to Blue Coat, but to any data source that has the ability to change the way it sends data. We were quite happy to find that Blue Coat provides that ability and it certainly reduced the load on the entire system and gave back those resources for adding other data.  Hat tip to Blue Coat for providing the flexibility of custom log formats.  Happy Splunking!