Welcome to the fifth article of the Spelunking your Splunk series, all designed to help you understand your Splunk environment at a quick glance. Here is a quick recap of the previous articles:
- Spelunking your Splunk Part I (Exploring Your Data) - A clever dashboard that can be used to quickly understand the indexes, sources, sourcetypes, and hosts in any Splunk environment.
- Spelunking your Splunk – Part II (Disk Usage) - A dashboard that can be used to monitor data distribution across multiple indexers.
- Spelunking your Splunk – Part III (License Usage) - A dashboard to understand license usage over time.
- Spelunking your Splunk – Part IV (User Metrics) - A dashboard to provide insight into user activity.
This article focuses on understanding your Splunk environment at a high level. Have you ever wondered about the following?
- How many events were ingested over a user-defined time period
- How that equates to events per second (EPS)
- The distinct host count
- Number of indexes with data
- Number of sourcetypes
- Number of sources
- What the data ingest looks like visually, by total event count and by index
This dashboard will give you all of that, and fast! As a bonus, we will provide the dashboard code at the end of the article.
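The EPS figure is simply the event count divided by the number of seconds in the selected time range. As a rough sketch of that arithmetic, the search below uses makeresults with a made-up event count over a seven-day window. The value 60480000 is purely hypothetical, and the field names count, diff, and EPS are chosen to mirror the dashboard query at the end of the article:

```spl
| makeresults
| eval count = 60480000
| eval diff = 7 * 24 * 60 * 60
| eval EPS = count / diff
| table count diff EPS
```

Seven days is 604,800 seconds, so this hypothetical count works out to 100 events per second. In the dashboard itself, count comes from tstats and diff comes from the time range picker rather than from hard-coded values.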
Figure 1: Splunk Stats dashboard
Finding detailed index information quickly
There are at least two places within Splunk to discover index information. The first is the rest command, which calls a REST endpoint and returns detailed information about indexes. The second is the dbinspect command, which reports on individual buckets, so it requires more calculation and is less efficient. For this exercise, let's try copying and pasting the following REST search into your Splunk search bar to see what data is returned:
| rest /services/data/indexes-extended
Figure 2: Results of the REST search (remember to scroll right)
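Because the REST results are so wide, it can help to trim them down to the handful of columns this article relies on. The following is a minimal sketch, assuming you only care about each index's name, its earliest and latest event times, and its retention setting. The field name retention_days is illustrative and not something Splunk returns:

```spl
| rest /services/data/indexes-extended
| table title maxTime minTime frozenTimePeriodInSecs
| eval retention_days = round(frozenTimePeriodInSecs / 86400, 0)
```

Dividing frozenTimePeriodInSecs by 86400 converts the retention period from seconds to days, which is usually easier to read at a glance.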
Next, try the dbinspect search:
| dbinspect index=*
Figure 3: Column headers from dbinspect (remember to scroll right)
Now try the following, which combines both (thank you, Splunk!):
| dbinspect index=* cached=t
| where NOT match(index, "^_")
| stats max(rawSize) AS raw_size max(eventCount) AS event_count BY bucketId, index
| stats sum(raw_size) AS raw_size sum(event_count) AS event_count dc(bucketId) AS buckets BY index
| eval raw_size_gb = round(raw_size / 1024 / 1024 / 1024 , 2) | fields index raw_size_gb event_count buckets
| join type=outer index [| rest /services/data/indexes-extended
| table title maxTime minTime frozenTimePeriodInSecs
| eval minTime = case(minTime >= "0", minTime)
| stats max(maxTime) AS maxTime min(minTime) AS minTime max(frozenTimePeriodInSecs) AS retention BY title
| eval maxTime = replace(maxTime, "T", " "), maxTime = replace(maxTime, "\+0000", ""), minTime = replace(minTime, "T", " "), minTime = replace(minTime, "\+0000", ""), retention = round(retention / 86400, 0)." Days"
| rename title AS index] | fields index raw_size_gb event_count buckets minTime maxTime retention
| rename raw_size_gb AS "Index Size (GB)" event_count AS "Total Event Count" buckets AS "Total Bucket Count" minTime AS "Earliest Event" maxTime AS "Latest Event" retention AS Retention
Now that you understand the basics, the sky is the limit. :-)
Finding source, sourcetype, and host data quickly
You may remember a command from the first article of this series, Spelunking your Splunk Part I (Exploring Your Data), called tstats. In a nutshell, tstats performs statistical queries on indexed fields very quickly. By default, these indexed fields are index, source, sourcetype, and host. It just so happens that these are the fields we need to understand the environment. Best of all, even in an underpowered environment or one with lots of data ingested per day, these commands will still outperform typical searches, even over long periods of time. This works great for our dashboard!
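To get a feel for tstats before looking at the dashboard, here is a minimal sketch that summarizes those default indexed fields per index in a single pass. The output field names (events, hosts, sources, sourcetypes) are arbitrary labels chosen for this example:

```spl
| tstats count AS events dc(host) AS hosts dc(source) AS sources dc(sourcetype) AS sourcetypes where index=* by index
| sort - events
```

Run it over whatever time range you like using the time range picker. The dashboard panels use the same idea, but split each statistic into its own single-value panel.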
Conclusion
Splunk provides decent visibility into various features within the Monitoring Console / DMC (Distributed Management Console), but we found this flexible and customizable dashboard to be quite helpful for gaining additional insight. We hope this helps you too. Enjoy!
Dashboard XML code
Below is the dashboard code needed to enumerate your Splunk stats. Feel free to modify the dashboard as needed:
<form>
<label>Splunk Stats</label>
<fieldset submitButton="true" autoRun="true">
<input type="time" token="time">
<label>Time Range Selector</label>
<default>
<earliest>-7d@h</earliest>
<latest>now</latest>
</default>
</input>
</fieldset>
<row>
<panel>
<single>
<title>Total Events</title>
<search>
<query>| tstats count where index=*</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
<sampleRatio>1</sampleRatio>
</search>
<option name="drilldown">none</option>
<option name="refresh.display">progressbar</option>
</single>
</panel>
<panel>
<single>
<title>Events Per Second (EPS)</title>
<search>
<query>| tstats count where index=* | addinfo | eval diff = info_max_time - info_min_time | eval EPS = count / diff | table EPS</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
<option name="drilldown">none</option>
</single>
</panel>
<panel>
<single>
<title>Distinct Hosts</title>
<search>
<query>| tstats dc(host) where index=*</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
<sampleRatio>1</sampleRatio>
</search>
<option name="drilldown">none</option>
<option name="refresh.display">progressbar</option>
</single>
</panel>
<panel>
<single>
<title>Distinct Indexes with Data</title>
<search>
<query>| dbinspect index=* cached=t
| where NOT match(index, "^_")
| stats max(rawSize) AS raw_size max(eventCount) AS event_count BY bucketId, index
| stats sum(raw_size) AS raw_size sum(event_count) AS event_count dc(bucketId) AS buckets BY index
| eval raw_size_gb = round(raw_size / 1024 / 1024 / 1024 , 2) | fields index raw_size_gb event_count buckets
| join type=outer index [| rest /services/data/indexes-extended
| table title maxTime minTime frozenTimePeriodInSecs
| eval minTime = case(minTime >= "0", minTime)
| stats max(maxTime) AS maxTime min(minTime) AS minTime max(frozenTimePeriodInSecs) AS retention BY title
| eval maxTime = replace(maxTime, "T", " "), maxTime = replace(maxTime, "\+0000", ""), minTime = replace(minTime, "T", " "), minTime = replace(minTime, "\+0000", ""), retention = round(retention / 86400, 0)." Days"
| rename title AS index] | fields index raw_size_gb event_count buckets minTime maxTime retention
| rename raw_size_gb AS "Index Size (GB)" event_count AS "Total Event Count" buckets AS "Total Bucket Count" minTime AS "Earliest Event" maxTime AS "Latest Event" retention AS Retention | stats count</query>
<earliest>0</earliest>
<latest></latest>
<sampleRatio>1</sampleRatio>
</search>
<option name="drilldown">none</option>
<option name="refresh.display">progressbar</option>
</single>
</panel>
<panel>
<single>
<title>Distinct Sourcetypes</title>
<search>
<query>| tstats dc(sourcetype) where index=*</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
<sampleRatio>1</sampleRatio>
</search>
<option name="drilldown">none</option>
<option name="refresh.display">progressbar</option>
</single>
</panel>
<panel>
<single>
<title>Distinct Sources</title>
<search>
<query>| tstats dc(source) where index=*</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
<sampleRatio>1</sampleRatio>
</search>
<option name="drilldown">none</option>
<option name="refresh.display">progressbar</option>
</single>
</panel>
</row>
<row>
<panel>
<chart>
<title>Total Event Count Over Time</title>
<search>
<query>| tstats prestats=t count where index=* by _time | timechart count</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
<option name="charting.chart">area</option>
<option name="charting.drilldown">none</option>
<option name="refresh.display">progressbar</option>
</chart>
</panel>
<panel>
<chart>
<title>Event Count by Index Over Time</title>
<search>
<query>| tstats prestats=t count where index=* by index, _time | timechart count by index</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
<option name="charting.chart">area</option>
<option name="charting.drilldown">none</option>
<option name="refresh.display">progressbar</option>
</chart>
</panel>
</row>
<row>
<panel>
<table>
<title>Indexes with Data</title>
<search>
<query>| dbinspect index=* cached=t
| where NOT match(index, "^_")
| stats max(rawSize) AS raw_size max(eventCount) AS event_count BY bucketId, index
| stats sum(raw_size) AS raw_size sum(event_count) AS event_count dc(bucketId) AS buckets BY index
| eval raw_size_gb = round(raw_size / 1024 / 1024 / 1024 , 2) | fields index raw_size_gb event_count buckets
| join type=outer index [| rest /services/data/indexes-extended
| table title maxTime minTime frozenTimePeriodInSecs
| eval minTime = case(minTime >= "0", minTime)
| stats max(maxTime) AS maxTime min(minTime) AS minTime max(frozenTimePeriodInSecs) AS retention BY title
| eval maxTime = replace(maxTime, "T", " "), maxTime = replace(maxTime, "\+0000", ""), minTime = replace(minTime, "T", " "), minTime = replace(minTime, "\+0000", ""), retention = round(retention / 86400, 0)." Days"
| rename title AS index] | fields index raw_size_gb event_count buckets minTime maxTime retention
| rename raw_size_gb AS "Index Size (GB)" event_count AS "Total Event Count" buckets AS "Total Bucket Count" minTime AS "Earliest Event" maxTime AS "Latest Event" retention AS Retention</query>
<earliest>0</earliest>
<latest></latest>
</search>
<option name="drilldown">none</option>
<option name="refresh.display">progressbar</option>
</table>
</panel>
</row>
</form>