Sunday, October 15, 2017

Spelunking your Splunk – Part I

By Tony Lee


Have you ever inherited a Splunk instance that you did not build?  This means that you probably have no idea what data sources are being sent into Splunk.  You probably don’t know much about where the data is being stored.  And you certainly do not know who the highest volume hosts are within the environment.

As a consultant, this is reality for nearly every engagement we encounter: we did not build the environment, and documentation is sparse or inaccurate if we are lucky enough to have it at all.  So, what do we do?  We could run some fairly complex queries to figure this out, but many of those queries are not efficient enough to search over vast amounts of data or long periods of time, even on highly optimized environments.  All is not lost, though: we have some tricks (and a handy dashboard) that we would like to share.

Note:  Maybe you did build the environment, but you need a sanity check to make sure you don’t have any misconfigured or runaway hosts.  You will also find value here.

tstats to the rescue!

If you have not discovered or used the tstats command, we recommend that you become familiar with it, even if only at a very high level.  In a nutshell, tstats performs statistical queries on indexed fields, and it does so very quickly.  By default, these indexed fields are index, source, sourcetype, and host.  It just so happens that these are the fields we need to understand the environment.  Best of all, even on an underpowered environment or one with lots of data ingested per day, these commands will still outperform the rest of your typical searches, even over long periods of time.  OK, time to answer some questions!

Common questions

These are common questions we ask during consulting engagements, and this is how we get answers FAST.  Most of the time, 7 days’ worth of data is enough to give us a good understanding of the environment and weed out anomalies.

How many events are we ingesting per day?
| tstats count where index=* by _time span=1d

Figure 1:  Events per day

What are my most active indexes (events per day)?
| tstats prestats=t count where index=* by index, _time span=1d | timechart span=1d count by index

Figure 2:  Most active indexes

What are my most active sourcetypes (events per day)?
| tstats prestats=t count where index=* by sourcetype, _time span=1d | timechart span=1d count by sourcetype

Figure 3:  Most active sourcetypes

What are my most active sources (events per day)?
| tstats prestats=t count where index=* by source, _time span=1d | timechart span=1d count by source

Figure 4:  Most active sources

What is the noisiest host (events per day)?
| tstats prestats=t count where index=* by host, _time span=1d | timechart span=1d count by host

Figure 5:  Most active hosts
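
The four per-field searches above differ only in the split-by field, so they are easy to generate when you want to run them through a script or the REST API. A minimal Python sketch follows; the function name and its defaults are our own illustration and are not part of Splunk.

```python
def tstats_timechart(field, index="*", span="1d"):
    """Build the tstats/timechart search that splits event counts by `field`."""
    return (
        f"| tstats prestats=t count where index={index} "
        f"by {field}, _time span={span} "
        f"| timechart span={span} count by {field}"
    )


# Print one search per default indexed field
for field in ("index", "sourcetype", "source", "host"):
    print(tstats_timechart(field))
```

Passing a specific index (for example index="main") narrows the search in the same way the dashboard filter does.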

Dashboard Code

To make things even easier for you, try this dashboard out (code at the bottom) that combines the searches we provided above and as a bonus adds a filter to specify the index and time range.

Figure 6:  Data Explorer dashboard


Splunk is a very powerful search platform but it can grow to be a complicated beast--especially over time.  Feel free to use the searches and dashboard provided to regain control and really understand your environment.  This will allow you to trim the waste and regain efficiency.  Happy Splunking.

Dashboard XML code is below:

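<form>
  <label>Data Explorer</label>
  <fieldset submitButton="true" autoRun="true">
    <input type="time" token="time">
      <label>Time Range Selector</label>
    </input>
    <input type="text" token="index">
      <label>Index</label>
    </input>
  </fieldset>
  <row>
    <panel>
      <chart>
        <title>Most Active Indexes</title>
        <search>
          <query>| tstats prestats=t count where index=$index$ by index, _time span=1d | timechart span=1d count by index</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="charting.chart">column</option>
        <option name="charting.drilldown">none</option>
      </chart>
    </panel>
  </row>
  <row>
    <panel>
      <chart>
        <title>Most Active Sourcetypes</title>
        <search>
          <query>| tstats prestats=t count where index=$index$ by sourcetype, _time span=1d | timechart span=1d count by sourcetype</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="charting.chart">column</option>
        <option name="charting.drilldown">none</option>
      </chart>
    </panel>
  </row>
  <row>
    <panel>
      <chart>
        <title>Most Active Sources</title>
        <search>
          <query>| tstats prestats=t count where index=$index$ by source, _time span=1d | timechart span=1d count by source</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="charting.chart">column</option>
        <option name="charting.drilldown">none</option>
      </chart>
    </panel>
  </row>
  <row>
    <panel>
      <chart>
        <title>Most Active Hosts</title>
        <search>
          <query>| tstats prestats=t count where index=$index$ by host, _time span=1d | timechart span=1d count by host</query>
          <earliest>$time.earliest$</earliest>
          <latest>$time.latest$</latest>
        </search>
        <option name="charting.chart">column</option>
        <option name="charting.drilldown">none</option>
      </chart>
    </panel>
  </row>
</form>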

Wednesday, September 13, 2017

Splunk Technology Add-on (TA) Creation Script

By Tony Lee


If you develop a Splunk application, at some point you may find yourself needing a Technology Add-on (TA) to accompany the app. Essentially, the TA uses many of the app's files, except for the user interface (UI/views). TAs are typically installed on indexers and heavy forwarders to process incoming data. Splunk briefly covers the difference between an app and an add-on in the link below:

Maintaining two codebases can be time consuming, though. Instead, it is possible to develop one application and extract the necessary components to build a TA. There may be other solutions, such as the Splunk Add-on Builder, but I found the script below to be one of the easiest methods.


This could be written in any language; however, my development environment is Linux-based, so the quickest and easiest solution was to write the script in bash. Feel free to translate it to another language if needed.


Usage is simple.  Just supply the name of the application and it will create the TA from the existing app.

The app should be located here (if not, change the APP_HOME variable in the script):


Copy and paste the bash shell script below to the /tmp directory and make it executable:

chmod +x /tmp/

Then run the script from the tmp directory and supply the application name: <AppName>

Ex: cylance_protect

Once complete, the TA will be located here:  /tmp/TA-<AppName>.spl


#!/bin/bash
# Create-TA
# anlee2 - at -
# TA Creation tool written in bash
# Input:  App name   (ex: cylance_protect)
# Output: /tmp/TA-<app name>.spl

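# Path to the Splunk app home.  Change if this is not accurate.
# Assumes a default Linux install, where apps live under $SPLUNK_HOME/etc/apps.
APP_HOME="${SPLUNK_HOME:-/opt/splunk}/etc/apps"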

##### Function Usage #####
# Prints usage statement
Usage()
{
echo "TA-Create v1.0
Usage: <App name>

  -h = help menu

Please report bugs to"
}

# Detect the absence of command line parameters.  If the user did not specify any, print usage statement
[[ $# -eq 0 || $1 == "-h" ]] && { Usage; exit 0; }

# Set the app name and TA name based on user input
APP_NAME="$1"
TA_NAME="TA-$APP_NAME"

echo -e "\nApp name is:  $APP_NAME\n"

echo -e "Creating directory structure under /tmp/$TA_NAME\n"
mkdir -p /tmp/$TA_NAME/default /tmp/$TA_NAME/metadata /tmp/$TA_NAME/lookups /tmp/$TA_NAME/static /tmp/$TA_NAME/appserver/static

echo -e "Copying files...\n"
cp $APP_HOME/$APP_NAME/default/eventtypes.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/app.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/props.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/tags.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/default/transforms.conf /tmp/$TA_NAME/default/ 2>/dev/null
cp $APP_HOME/$APP_NAME/static/appIcon.png  /tmp/$TA_NAME/static/appicon.png 2>/dev/null
cp $APP_HOME/$APP_NAME/static/appIcon.png  /tmp/$TA_NAME/appserver/static/appicon.png 2>/dev/null
cp $APP_HOME/$APP_NAME/README /tmp/$TA_NAME/ 2>/dev/null
cp $APP_HOME/$APP_NAME/lookups/* /tmp/$TA_NAME/lookups/ 2>/dev/null

echo -e "Modifying app.conf...\n"
sed -i s/$APP_NAME/$TA_NAME/g /tmp/$TA_NAME/default/app.conf
sed -i "s/is_visible = .*/is_visible = false/g" /tmp/$TA_NAME/default/app.conf
sed -i "s/description = .*/description = TA for $APP_NAME./g" /tmp/$TA_NAME/default/app.conf
sed -i "s/label = .*/label = TA for $APP_NAME./g" /tmp/$TA_NAME/default/app.conf

echo -e "Creating default.meta...\n"
cat >/tmp/$TA_NAME/metadata/default.meta <<EOL
# Application-level permissions
[]
access = read : [ * ], write : [ admin, power ]
export = system

[eventtypes]
export = system

[props]
export = system

[transforms]
export = system

[lookups]
export = system

### VIEWSTATES: even normal users should be able to create shared viewstates
[viewstates]
access = read : [ * ], write : [ * ]
export = system
EOL
cd /tmp; tar -zcf TA-$APP_NAME.spl $TA_NAME

echo -e "Finished.\n\nPlease check for your file here:  /tmp/$TA_NAME.spl"
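
For anyone translating the script to another language, the app.conf step is the part most worth getting right. Below is a minimal Python sketch of the four sed substitutions above. The function name is our own, and it treats the app name as plain text rather than a regular expression, which gives the same result for simple names such as cylance_protect.

```python
import re


def rewrite_app_conf(text, app_name, ta_name):
    """Apply the same four substitutions the bash script makes with sed."""
    # Rename every occurrence of the app to the TA name
    text = text.replace(app_name, ta_name)
    # Hide the TA from the Splunk UI
    text = re.sub(r"is_visible = .*", "is_visible = false", text)
    # Lambdas keep the replacement text literal, whatever the app name contains
    text = re.sub(r"description = .*", lambda m: f"description = TA for {app_name}.", text)
    text = re.sub(r"label = .*", lambda m: f"label = TA for {app_name}.", text)
    return text
```

The order matters in the same way it does in the script: the global rename runs first, and the description and label are then overwritten with text that refers to the original app name.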


Hopefully this helps others save some time by maintaining one application and extracting the necessary data to create the technology add-on.


Huge thanks to Mike McGinnis for testing and feedback.  :-)

Sunday, August 27, 2017

Splunk: The unsung hero of creative mainframe logging

By Tony Lee

The situation

Have you ever, in your life, heard a good sentence that started with: “So, we have this mainframe... that has logging and compliance requirements…” Yeah, me neither. But this was a unique situation that required a quick and creative solution--and it needed to be done yesterday.  Cue the horror music.

In summary:  We needed to quickly log and make sense of mainframe data for reporting and compliance reasons. The mainframe did not support external logging such as syslog. However, the mainframe could produce a CSV file and that file could be scheduled to upload to an FTP server (Not SFTP, FTPS, or SCP).  Yikes!

Possible solutions

We could stand up an FTP server and use the Splunk Universal Forwarder to monitor the FTP upload directory, but we did not have extra hardware or virtual capacity readily available. After a quick Google search, we ran across a little gem of an app called the FTP Receiver app (written by Luke Murphey). This app cleverly creates a Python FTP server using Splunk. Best of all, it leverages Splunk’s user accounts and role-based access controls.

How it worked

At a high level, here are the steps involved:
  1. Install the FTP Receiver app:
  2. Create an index for the mainframe data (Settings -> Indexes -> New -> Name: mainframe)
  3. Create an FTP directory for the uploaded files (mkdir /opt/splunk/ftp)
  4. Create FTP Data input (Settings -> Data Inputs -> Local Inputs -> FTP -> New -> name: mainframe, port: 2121, path: ftp, sourcetype: csv, index: mainframe)
  5. Create a role with the ftp_write privileges (Settings -> Access Controls -> Roles: Add new -> Name: ftp_write, Capabilities: ftp_write)
  6. Create a Splunk user for the FTP Receiver app (Settings -> Access Controls -> Users: Add new -> Name: mainframe, Assign to roles: ftp_write)
  7. Configure the mainframe to send to the FTP Receiver app port (on your own for that one)
  8. Create a local data input to monitor the FTP upload directory and ingest as CSV (Settings -> Data inputs -> Local inputs -> Files and Directories -> New -> Browse to /opt/splunk/ftp -> Continuously monitor -> Sourcetype: csv, index: mainframe)

Illustrated, the solution looks like this:

Figure 1:  Diagram of functional components

If you run into any issues, troubleshoot and confirm that the FTP server is working via a common web browser.

Figure 2:  Troubleshooting with the web browser
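
The same check can be scripted before the mainframe is involved. Here is a minimal sketch using Python's standard ftplib, assuming the input from step 4 (port 2121) and the user from step 6. The function name, host name, password, and file name are placeholders of our own, and the ftp_factory parameter exists only so the function can be exercised without a live server.

```python
import ftplib
import io


def upload_csv(host, user, password, remote_name, data, port=2121,
               ftp_factory=ftplib.FTP):
    """Log in to the FTP input and upload CSV bytes; returns the server reply."""
    ftp = ftp_factory()
    ftp.connect(host, port, timeout=10)
    try:
        ftp.login(user, password)
        # STOR writes the bytes to remote_name in the FTP directory
        return ftp.storbinary(f"STOR {remote_name}", io.BytesIO(data))
    finally:
        ftp.close()


# Example call (placeholder values):
# upload_csv("splunk.example.com", "mainframe", "changeme",
#            "test.csv", b"user,action\njdoe,login\n")
```

If the upload succeeds, the test file should appear under /opt/splunk/ftp and, shortly after, in the mainframe index. Keep in mind that the credentials travel in clear text, just as they will from the mainframe.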


Putting aside concerns that the mainframe may be older than most of the IT staff and the fact that FTP is still a clear-text protocol, this was an interesting solution that was created using the flexibility of Splunk. Add some mitigating controls and a little bit of SPL + dashboard design and it may be the easiest and most powerful mainframe reporter in existence.

Figure 3:  Splunk rocks, the process works

Thursday, August 17, 2017

CyBot – Threat Intelligence Chat Bot

By Tony Lee


For those who could not attend, this year’s Black Hat security conference did not disappoint.  It was an awesome time to collaborate and share with the security community.  In doing so, we open sourced a new tool at Black Hat Arsenal aimed at assisting Security Operations Centers (SOCs) and digital first responders.  We affectionately call it:  CyBot – Threat Intelligence Chat Bot.

We understand that typical SOC environments face a number of challenges:
  • Many SOCs are overwhelmed with the number of incoming alerts
  • Service Level Agreements (SLAs) often define a maximum time to investigate and contain an incident
  • Security tools may be plentiful but they are often not centralized
  • Collaboration on a large investigation may be challenging

We have even seen cases where the SOC receives so many alerts that all of them may not be properly investigated.

To combat this, CyBot can be your threat intelligence chat bot waiting to do research for you.  For example, instead of going to various websites or dashboards to perform research, you could just ask CyBot simple questions and even share results with other investigators.  All from within one chat window you can do the following and more:
  • Ask about the threat reputation of URLs and hashes
  • Perform WHOIS, nslookup, and geoip lookups
  • Unshorten potentially malicious shortened URLs
  • Extract links from a potentially malicious website
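
To give a feel for how small each of these lookups can be, here is a minimal sketch of the link-extraction idea using only the Python standard library. It is our own illustration and not CyBot's plugin code; the class and function names are hypothetical. It also works on HTML you have already retrieved, so it never touches the suspicious site itself.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every anchor tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the anchor hrefs in `html`, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```

In a chat bot, a function like this would sit behind a command that takes a URL, fetches the page safely, and replies with the list of links.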

CyBot Menu

Best of all, this capability is now free and being actively developed.  All documentation, slides, and plugins have been made publicly available via GitHub:


Very few tasks are ever accomplished in complete isolation.  Tools, services, and ideas were combined from awesome places such as:
  • Errbot developers for the fantastic tool and customer service
  • VirusTotal
  • geoip -
  • Google Safebrowsing - and Jun C. Valdez
  • Hashid - C0re
  • Unshorten -
  • Codename -
  • Black Hat Arsenal team for the amazing support and tool release venue

  • Non-bots: Bill Hau, Corey White, Dennis Hanzlik, Ian Ahl, Dave Pany, Dan Dumond, Kyle Champlin, Kierian Evans, Andrew Callow, Mark Stevens