Sunday, July 8, 2018

Spelunking your Splunk – Part IV (User Metrics)

By Tony Lee

Welcome to the fourth article of the Spelunking your Splunk series, all designed to help you understand your Splunk environment at a quick glance.  Here is a quick recap of the previous articles:



This article will focus on understanding the users within the environment--even when spread over a search head cluster. We will show you that it is possible to check the amount of concurrent Splunk users, how much they are searching, successful and failed logins and aged accounts. This information is useful not only from an accountability perspective, but also from a resource perspective. When a search head (or cluster) becomes overloaded with users, it may be a good time to consider horizontal scaling.

Finding and understanding user information

There are at least two places within Splunk to discover user information. The first requires a RESTful call and provides information about authenticated users. The second is a search against the _audit index filtering on user activity. Try copying and pasting the following two searches into your Splunk search bar one at a time to see what data is returned:

| rest /services/authentication/httpauth-tokens splunk_server=*

Figure 1:  Current authenticated users via httpauth-tokens


index=_audit user=*

Figure 2:  _audit index with a focus on user activity

Now that you understand the basics, the sky is the limit. You can audit each user or display the statistics for all users. Take a look at our dashboard below to see what is possible. If you find it useful, we provide the code for it at the bottom of this article. Give it a try and let us know what you think.

Figure 3:  User Metrics dashboard with all panels



Conclusion

Splunk provides decent visibility into various features within Monitoring Console / DMC (Distributed management console), but we found this flexible and customizable dashboard to be quite helpful for monitoring gaining additional insight.  We hope this helps you too.  Enjoy!


Dashboard XML code


Below is the dashboard code needed to enumerate your user metrics.  Feel free to modify the dashboard as needed:

<form>
  <label>User Metrics</label>
  <description>Displays Interesting Usage Metrics</description>
  <!-- Add time range picker -->
  <fieldset autoRun="True">
    <input type="time" searchWhenChanged="true">
      <default>
        <earliestTime>-24h@h</earliestTime>
        <latestTime>now</latestTime>
      </default>
    </input>
    <input type="text" token="wild">
      <label>Search</label>
      <default>*</default>
      <suffix/>
    </input>
  </fieldset>
  <row>
    <panel>
      <chart>
        <title>Current Active Users</title>
        <search>
          <query>| rest /services/authentication/httpauth-tokens splunk_server=* | where NOT userName="splunk-system-user" | stats dc(userName) AS "Total Users"</query>
          <earliest>$earliest$</earliest>
          <latest>$latest$</latest>
        </search>
        <option name="charting.axisLabelsX.majorLabelStyle.overflowMode">ellipsisNone</option>
        <option name="charting.axisLabelsX.majorLabelStyle.rotation">0</option>
        <option name="charting.axisTitleX.visibility">visible</option>
        <option name="charting.axisTitleY.visibility">visible</option>
        <option name="charting.axisTitleY2.visibility">visible</option>
        <option name="charting.axisX.scale">linear</option>
        <option name="charting.axisY.scale">linear</option>
        <option name="charting.axisY2.enabled">false</option>
        <option name="charting.axisY2.scale">inherit</option>
        <option name="charting.chart">fillerGauge</option>
        <option name="charting.chart.nullValueMode">gaps</option>
        <option name="charting.chart.sliceCollapsingThreshold">0.01</option>
        <option name="charting.chart.stackMode">default</option>
        <option name="charting.chart.style">shiny</option>
        <option name="charting.drilldown">all</option>
        <option name="charting.layout.splitSeries">0</option>
        <option name="charting.legend.labelStyle.overflowMode">ellipsisMiddle</option>
        <option name="charting.legend.placement">right</option>
      </chart>
    </panel>
    <panel>
      <table>
        <title>Current Logged in Users</title>
        <search>
          <query>| rest /services/authentication/httpauth-tokens splunk_server=* | where NOT userName ="splunk-system-user" | stats max(timeAccessed) AS "Latest Activity" by userName | rename userName AS "User" | sort -"Latest Activity"</query>
          <earliest>$earliest$</earliest>
          <latest>$latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="rowNumbers">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Total Searches</title>
        <search>
          <query>index=_audit user=* (action="search" AND info="granted") | where NOT user ="splunk-system-user" | stats count(action) AS Searches by user | sort - Searches</query>
          <earliest>$earliest$</earliest>
          <latest>$latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="rowNumbers">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <table>
        <title>Successful Logins</title>
        <search>
          <query>index=_audit user=* (action="login attempt" AND info="succeeded") | stats count(action) AS Logins by user | rename user AS User, Logins AS Successes | sort - Successes</query>
          <earliest>$earliest$</earliest>
          <latest>$latest$</latest>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="rowNumbers">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Failed Logins</title>
        <search>
          <query>index=_audit user=* (action="login attempt" AND info="failed") | stats count(action) AS Logins by user | rename user AS User, Logins AS Failures | sort - Failures</query>
          <earliest>0</earliest>
          <latest></latest>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="rowNumbers">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
    <panel>
      <table>
        <title>Aged Accounts (15 days or older)</title>
        <search>
          <query>index=_audit user=* (action="login attempt" AND info="succeeded") | dedup user | eval age_days=round((now()-_time)/(60*60*24)) | where age_days &gt;= 15 | eval time=strftime(_time, "%m/%d/%Y %H:%M:%S") | table user, time, age_days | sort -age_days</query>
          <earliest>-15d@d</earliest>
          <latest>now</latest>
        </search>
        <option name="wrap">true</option>
        <option name="rowNumbers">false</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="count">10</option>
      </table>
    </panel>
  </row>
</form>