This series on osquery will take us on a journey from stand-alone agents, to managing multiple agents with Kolide Fleet, and then finally onto more advanced integrations and analysis. We already covered the following topics:
Part I - Local Agent Interaction: http://securitysynapse.blogspot.com/2019/05/osquery-part-i-local-agent-interaction.html
Part II - Kolide Centralized Management: http://securitysynapse.blogspot.com/2019/05/osquery-part-ii-kolide-centralized.html
Part III - Queries and Packs: http://securitysynapse.blogspot.com/2019/05/osquery-part-iii-queries-and-packs.html
Part IV - Fleet Control Using fleetctl - http://securitysynapse.blogspot.com/2019/05/osquery-part-iv-fleet-control-using-fleetctl.html
Even though we now have a centralized management platform, reading query output in the Kolide Fleet UI does not scale to hundreds of thousands of hosts -- we need to integrate with a big data analytics platform so we can stack the results and perform statistical analysis on the data. In this article, we will examine integrating Kolide Fleet output with Splunk. As a bonus, we are releasing a Kolide Fleet App for Splunk -- free of charge on Splunkbase. The first version of the app can parse, normalize, and display the following information:
- Overview information
- Status Log
- osquery_info query
- programs query
- process_open_sockets query
- users query
Figure 1: Overview page
Figure 2: Status Log
Figure 3: osquery_info page
Expected Kolide Packs and Queries
The first version of the Kolide Fleet App for Splunk needs the pack names, query names, and output to conform to what is shown in the fleetctl get commands below. For this reason, we are sharing our exported packs and queries here. Remember that in Part IV of this series, we covered how to import the queries and packs using fleetctl.
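As a quick refresher, the import looks something like this, assuming the exported specs were saved as queries.yaml and packs.yaml (adjust the file names to match your export):
fleetctl apply -f queries.yaml
fleetctl apply -f packs.yaml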
[+] applied 4 queries
[+] applied 4 packs
Pack and Query details
fleetctl get p
+---------------------------+----------+-------------------------------+
| NAME | PLATFORM | DESCRIPTION |
+---------------------------+----------+-------------------------------+
| users pack | | Query all users |
+---------------------------+----------+-------------------------------+
| osquery_info pack | | Query the version of osquery |
+---------------------------+----------+-------------------------------+
| process_open_sockets pack | | Pack for process_open_sockets |
+---------------------------+----------+-------------------------------+
| programs pack | | pack for programs |
+---------------------------+----------+-------------------------------+
fleetctl get q
+----------------------------+------------------------------+--------------------------------+
| NAME | DESCRIPTION | QUERY |
+----------------------------+------------------------------+--------------------------------+
| users query | Query all users | SELECT * FROM users |
+----------------------------+------------------------------+--------------------------------+
| osquery_info query | Query the version of osquery | SELECT * FROM osquery_info |
+----------------------------+------------------------------+--------------------------------+
| process_open_sockets query | Query process_open_sockets | SELECT DISTINCT proc.name, |
| | | proc.path, proc.cmdline, |
| | | pos.pid, pos.protocol, |
| | | pos.local_address, |
| | | pos.local_port, |
| | | pos.remote_address, |
| | | pos.remote_port FROM |
| | | process_open_sockets AS pos |
| | | JOIN processes AS proc ON |
| | | pos.pid = proc.pid; |
+----------------------------+------------------------------+--------------------------------+
| programs query | query for programs | SELECT * FROM programs |
+----------------------------+------------------------------+--------------------------------+
Kolide Output
Once the packs and queries above are imported with fleetctl apply, applied to targets, and scheduled to run, we need to gather the output and send it to Splunk. You might remember that in Part III of this series, we added a statement to our fleet.yaml configuration file to send the results and status output to the following paths with log rotation:
filesystem:
  status_log_file: /data/osquery/status.log
  result_log_file: /data/osquery/results.log
  enable_log_rotation: true
This sets us up perfectly to use a Splunk forwarder to send the data to Splunk. If you have not already done so, download and install the Splunk Universal Forwarder here:
https://www.splunk.com/en_us/download/universal-forwarder.html
Once installed, configure the forwarder to send data to your indexers.
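This is typically done with an outputs.conf on the forwarder (or via a deployment server). A minimal sketch, assuming a single indexer listening on the default receiving port of 9997 and a placeholder hostname you would replace with your own:
[tcpout]
defaultGroup = primary_indexers
[tcpout:primary_indexers]
# splunk-indexer.example.com is a placeholder -- use your indexer's address
server = splunk-indexer.example.com:9997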
Install the Splunk App and Create the Index
In order to prepare for the data's arrival, we now install the Splunk app and create an index for the osquery data:
1) Install the Kolide Fleet App for Splunk: https://splunkbase.splunk.com/app/4518/
2) Create an index called osquery
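The index can be created in the UI (Settings > Indexes > New Index) or from the CLI on the indexer. A minimal sketch, assuming a default install path:
/opt/splunk/bin/splunk add index osquery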
If you already have osquery data going to Splunk under a different index and sourcetype, all is not lost. You can modify the eventtypes.conf file to account for it.
(Optional) Modify eventtypes.conf as Needed
If you already had Kolide set up and sending data to Splunk under a different index and sourcetype name, that's not a problem. As long as the data is being parsed correctly, we can modify eventtypes.conf within Splunk so that all the dashboards still function with your index and sourcetype names. Modify index=osquery to match your index, and modify sourcetype=osquery:results and sourcetype=osquery:status to match your sourcetypes.
cat eventtypes.conf
[osquery_index]
search = index=osquery
[osquery_status]
search = eventtype=osquery_index sourcetype=osquery:status
[osquery_results]
search = eventtype=osquery_index sourcetype=osquery:results
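For example, if your existing data already lands in an index named endpoint with sourcetypes kolide:results and kolide:status (hypothetical names used for illustration), the modified file would look like this:
# hypothetical example: data already stored in index=endpoint with kolide:* sourcetypes
[osquery_index]
search = index=endpoint
[osquery_status]
search = eventtype=osquery_index sourcetype=kolide:status
[osquery_results]
search = eventtype=osquery_index sourcetype=kolide:results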
(Optional) Modify props.conf as Needed
Currently, the only two stanzas used in props.conf are osquery:results and osquery:status, shown below. Feel free to change the stanza names to match your sourcetypes if needed. Only minimal parsing is performed:
cat props.conf
## Results log
[osquery:results]
TRUNCATE = 50000
KV_MODE = json
SHOULD_LINEMERGE = 0
category = osquery
pulldown_type = 1
MAX_TIMESTAMP_LOOKAHEAD = 10
TIME_FORMAT = %s
TIME_PREFIX = unixTime\"\:
EVAL-vendor_product = "osquery"
FIELDALIAS-user = decorations.username as user
FIELDALIAS-username = username as user
FIELDALIAS-dest = decorations.hostname as host
## Status log
[osquery:status]
KV_MODE = json
SHOULD_LINEMERGE = 0
category = osquery
pulldown_type = 1
MAX_TIMESTAMP_LOOKAHEAD = 10
TIME_FORMAT = %s
TIME_PREFIX = unixTime\"\:
EVAL-vendor_product = "osquery"
FIELDALIAS-user = decorations.username as user
FIELDALIAS-dest = host as dest
Send logs to Splunk via Splunk forwarder (inputs.conf)
Once our app is installed on the search head, the Splunk forwarder is installed on the Kolide host, and Kolide is writing the status and results logs to disk, we need to let the forwarder know where to gather the logs. For this, we use the following inputs.conf file:
[monitor:///data/osquery/results.log]
index = osquery
sourcetype = osquery:results
disabled = 0
[monitor:///data/osquery/status.log]
index = osquery
sourcetype = osquery:status
disabled = 0
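After deploying inputs.conf, restart the forwarder so it picks up the new monitor stanzas. Assuming the default install path on Linux:
/opt/splunkforwarder/bin/splunk restart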
If all went as planned, you should see data populating in the Splunk app. :-)
Conclusion
This article covered how to import the required queries to populate the current version of the Splunk app. It then explained where the Kolide Fleet logs appear on disk and how to forward those logs to Splunk. We covered installing the newly created Kolide Fleet App for Splunk and optionally configuring eventtypes.conf and/or props.conf for any deviation from the expected index or sourcetype. At the end of this effort, you should have data flowing from Kolide Fleet to Splunk properly ingested, parsed, and displayed. For any questions, please post in the comments section below. Otherwise, stay tuned for additional integration efforts!
Props to the osquery TA for getting us started.
Bonus for the curious reader -- Splunk Magic
Normally, JSON is not the prettiest of data to table in Splunk. However, we discovered a series of tricks that make panel and dashboard development scale a little easier. Our searches in many cases end up looking something like this:
eventtype=osquery_results name="pack/network_connection_listening/Windows_Process_Listening_Port" | dedup host, _time | spath output=data path=snapshot{} | mvexpand data | rename data as _raw | extract pairdelim="," kvdelim=":" | eval pname=mvindex(name,1) | table _time, host, pname, path, protocol, address, port
That is a lot to digest all at once, so let's break it down:
- Find the data we want: eventtype=osquery_results name="pack/network_connection_listening/Windows_Process_Listening_Port"
- Remove duplicate results for each host and timestamp: dedup host, _time
- Extract the snapshot{} JSON array into a multivalue field named data, one element per value: spath output=data path=snapshot{}
- Expand the multivalue data field into one event per value: mvexpand data
- Rename data as _raw since extract only works on _raw: rename data as _raw
- Extract key/value pair (regardless of key names): extract pairdelim="," kvdelim=":"
- Avoid conflict for "name" variable: eval pname=mvindex(name,1)
- Table remaining values extracted: table <extracted field names>
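For readability, here is the same search again with each command on its own line (the search bar treats the line breaks as whitespace, so it can be pasted as-is):
eventtype=osquery_results name="pack/network_connection_listening/Windows_Process_Listening_Port"
| dedup host, _time
| spath output=data path=snapshot{}
| mvexpand data
| rename data as _raw
| extract pairdelim="," kvdelim=":"
| eval pname=mvindex(name,1)
| table _time, host, pname, path, protocol, address, port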
Figure 4: Pure joy of JSON data in Splunk
Figure 5: Data is ready to table after our SPL trickery