Monday, September 28, 2020

Fun with Microsoft Power BI - Part I - Intro

 By Tony Lee

If you have read some of our other articles, you can probably tell by now that we enjoy making data actionable. Honestly, it doesn't matter what type of data or even where the data ends up. As long as we can make informed decisions using the data -- we love it. Following that theme, we are going to make BlackBerry (formerly known as Cylance) Protect Threat Data Report (TDR) CSVs actionable using Power BI and Power BI Desktop. Sure, we could have used Excel and some charts here and there, but Power BI is a better fit for creating reusable, decision-maker-ready reports. You can use any data source to follow along in this series, but our example BlackBerry Protect report is shown below. We will happily share the Power BI file at the end of the series so you can load and analyze your own data -- stay tuned for that!

Figure 1:  Our Power BI report using BlackBerry Protect TDR data

In this first article, we will cover:

  • Getting Started
  • Data Ingest
  • Adjusting Fields
  • Visualizations
  • Saving Your Work

Getting Started

There are many options for using Microsoft's Power BI, each with different costs and features.  As a high-level overview:

  • Power BI Desktop - Free thick client which can be used to ingest data and design reports
  • Power BI Pro - $9.99 per user per month (included in the E5 license)
  • Power BI Premium - $4,995 per month - enterprise-scale BI and big data analytics
  • Power BI Mobile - iOS, Android, HoloLens, PC, Mobile Device, Hub apps
  • Power BI Embedded - Analytics and visualizations tailored for embedded applications
  • Power BI Report Server - On-premises reporting solution, included in premium and can provide hybrid on-prem and cloud capabilities

Source:  https://powerbi.microsoft.com/en-us/pricing/

For our learning purposes we used Power BI Desktop to develop our report and the Power BI service (https://powerbi.microsoft.com) to display it (full screen) in our private workspace. We used a free cloud account and did not upgrade to Pro.

Note:  You cannot use a personal account to sign into Power BI.  You must use a work or school account.  Chances are you have one of these accounts already, and it includes some level of (even free) access to Power BI.

Figure 2:  Power BI ecosystem - Source:  https://docs.microsoft.com/en-us/learn/modules/get-started-with-power-bi/1-introduction


Data Ingest

Now that you have downloaded Power BI Desktop, the next step is to ingest data. As mentioned at the start of the article, we are using BlackBerry Protect TDR data, which is downloaded from the BlackBerry/Cylance portal in CSV format. Once the data set is downloaded, open Power BI Desktop, click Home > Get Data > Text/CSV, and navigate to the file.

Figure 3:  Many options for loading data

Power BI Desktop did a great job parsing the data into columns with the appropriate headers. It even tries to detect the type of each column, such as string vs. number vs. date.

Figure 4:  Parsing of fields
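For reference, it can help to peek at the raw CSV before Power BI parses it. A quick shell preview is sketched below; the file name and column headers here are hypothetical (borrowed from fields that appear elsewhere in our data), so expect your actual TDR export to differ:

head -3 threat_data_report.csv

# Hypothetical output -- your actual TDR columns will differ:
# DeviceName,EventType,EventName,FilePath,Score
# DESKTOPTESTLOGGER2,Threat,Blocked,C:\Windows\system32\foo.ps1,-0.85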

Adjusting Fields

You should now see the fields on the right-hand side of the canvas. Note that there may be some instances in which Power BI takes the liberty of summarizing your data -- sometimes this is helpful and sometimes it does not make sense. This is understandable since it still takes a human to determine the context around various data fields. A good example: Power BI Desktop tried to sum the BlackBerry/Cylance Protect scores, which is of no real value to analysts.  "A" for effort though, and at least it is correctable by clicking the parsed field on the right > Column tools > Summarization > Don't summarize.

Figure 5:  Adjusting the parsed fields

Don't worry about trying to find all of the misinterpreted data up front. You will discover some of these as we start creating visualizations in our report.

Note:  Power BI prefers columnar data, so spreadsheets that are appealing to the human eye are not always interpreted correctly by Power BI. That level of transforming and manipulating will be left to another article.

Visualizations

What may be most impressive about Power BI is the number of visualizations available by default. These include (but are not limited to):

  • Area charts
  • Bar and column charts
  • Cards (numeric value)
  • Combo charts
  • Doughnut charts
  • Funnel charts
  • Gauge charts
  • Key influencers chart
  • KPIs
  • Line charts
  • Maps (ArcGIS, filled choropleth, and shape)
  • Pie charts
  • Ribbon charts
  • Treemaps

Figure 6:  Visualization options


And the list goes on...  

To start with a simple visualization, let's create a card with the total number of events (in this case it correlates to the number of rows we have... each row in our data always contains a DeviceName). Begin by clicking anywhere in the canvas and then click the card icon under Visualizations. Drag the field you want to count to the Fields box under Visualizations (in our case it was DeviceName). Then click the down arrow on the field and select Count. There are lots of formatting options available by clicking the paint roller icon under Visualizations. We encourage you to explore those settings to achieve your desired look.

Figure 7:  Created our first visualization - a card that contains the count of events

Saving Your Work

Now that you have ingested data, parsed it, and created your first visualization in your report, it is time to save it. Click File > Save As > name the file. Notice that the file extension is .pbix. Feel free to close Power BI Desktop and re-open your file; notice that the data is still there. This indicates that the data is self-contained within the .pbix file -- keep this in mind when sharing your .pbix files with others.


Conclusion

In this article, we showed the different options for downloading and using Power BI. Specifically, we downloaded Power BI Desktop and ingested BlackBerry (Cylance) Protect Threat Data Report data, which is in CSV format. Power BI Desktop parsed the data, and we showed one potential alteration to the way the data was interpreted. Lastly, we rounded out the article by showing how to create your first visualization and save your report locally as a .pbix file. Our follow-on articles will cover more advanced visualizations, relationships, filters, using reports, and uploading reports to the Power BI Service (online). Thanks for reading; we hope you enjoyed this introduction to Microsoft's Power BI. Please feel free to leave feedback and your favorite Power BI features in the comments section below.

Tuesday, September 1, 2020

Testing Logstash Data Ingest

 By Tony Lee

When setting up an Elasticsearch, Logstash, and Kibana (ELK) stack, you may discover the need to test your Logstash filters. There are probably many ways to accomplish this task, but we will cover some techniques and potential pitfalls in this article.

Note: This article can be used to simulate any syslog replay for manual data ingest testing, but our example specifically covers Logstash as the receiver and a fabricated, but realistic, event.

Real-time Visibility

The first thing that will aid us in this ingest testing is real-time feedback. This could come from at least two potential places:

1)  "Real-time" discovery search within the Kibana UI

This is accomplished by using the Discover page, setting a search term you know to be in your test message, and choosing a short refresh rate, as shown in the screenshot below.

Figure 1:  Kibana "Real-time" search using refresh interval


2) Monitoring the logstash log file for warnings or errors

For this, we will use the tail command with the follow option (-f) to watch the log in real time.  The location of your Logstash log may differ, so adjust the path as necessary.

tail -f /var/log/logstash/logstash-plain.log

We are looking for clues that may have prevented proper ingest such as:

[2020-08-31T20:25:21,997][WARN ][logstash.filters.mutate][main][b14e-snip-a422] Exception caught while applying mutate filter {:exception=>"Could not set field 'name' on object 'compute-this' to value 'see-this'. This is probably due to trying to set a field like [foo][bar] = someValue when [foo] is not either a map or a string"}
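If the log is chatty, you can narrow the live view to just warnings and errors by filtering the tail output (a simple sketch; adjust the path and patterns as needed):

tail -f /var/log/logstash/logstash-plain.log | grep -iE "warn|error"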


Encrypted or Unencrypted Listener

Once we have real-time visibility in place, we need to determine whether the listening port expects encrypted data, since this dictates how we replay traffic to it. There are a few ways to determine this:

1) Check the logstash filter config file

The example below shows an encrypted port.  We can see that this is the case because the SSL information is defined and ssl_enable is set to true.

input {
  tcp {
    port => 6514
    ssl_enable => true
    ssl_cert => "/etc/logstash/logstash.crt"
    ssl_key => "/etc/logstash/logstash.key"
    ssl_verify => false
  }
}
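For comparison, an unencrypted listener simply omits the SSL settings. A minimal sketch (using port 6515, which we rely on as an unencrypted input later in this article) might look like:

input {
  tcp {
    port => 6515
  }
}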


2) Check the logstash logs for SSL errors

If you have an encrypted listener and you send data using an unencrypted transport method (such as telnet), you will see SSL errors such as the following:

[2020-09-01T13:46:20,758][ERROR][logstash.inputs.tcp][main][359d9-snip-c5038d] Error in Netty pipeline: io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER


3) Use a tool (such as openssl) to verify the SSL connection

Below is an example of checking the SSL connection using openssl, but other tools can be used.

openssl s_client -connect <logstash_host>:6514

If the listener is expecting encrypted data, you will see details such as the certificate subject, issuer, cipher suites, and more.

--snip--

subject=C = AU, ST = Some-State, O = Internet Widgits Pty Ltd

issuer=C = AU, ST = Some-State, O = Internet Widgits Pty Ltd

--snip--

SSL handshake has read 1392 bytes and written 437 bytes

Verification error: self signed certificate

--snip--
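If you just want the certificate details without an interactive session, you can pipe the connection through openssl x509 (a quick sketch, using the same host and port assumptions as above):

openssl s_client -connect <logstash_host>:6514 </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates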


Methods of Replay

Now that we have real-time visibility and know whether the listener expects encrypted data, we can look at different techniques for replaying the traffic. We will start with unencrypted methods first because we can later tunnel the unencrypted data to an encrypted listener. We will also examine replaying an entire packet (including the header) vs. replaying just the data and having the header added.

1) Unencrypted replay of an exact packet (specified time included; no header added)

If you have example logs that already include the expected date/time format, you can replay the exact message using netcat/ncat.  Keep in mind that you are most likely sending a static time, so you will need to set your Kibana time range appropriately.

First, place the contents of your event in a test file. We created a file called testevent.txt with the following contents (notice the included date and time):

<46>1 2020-08-31T02:08:08.988000Z sysloghost ExtraField - - [Location] Event Type: OurTest, Event Name: Blocked, Device Name: DESKTOPTESTLOGGER2, File Path: C:\Windows\system32\foo.ps1,  Interpreter: Powershell, Interpreter Version: 10.0.18362.1 (WinBuild.160101.0800), User Name: SYSTEM, Device Id: 4eaf3350a984
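One way to create that file (so your shell does not mangle the backslashes or angle brackets) is with a quoted heredoc:

cat > testevent.txt <<'EOF'
<46>1 2020-08-31T02:08:08.988000Z sysloghost ExtraField - - [Location] Event Type: OurTest, Event Name: Blocked, Device Name: DESKTOPTESTLOGGER2, File Path: C:\Windows\system32\foo.ps1,  Interpreter: Powershell, Interpreter Version: 10.0.18362.1 (WinBuild.160101.0800), User Name: SYSTEM, Device Id: 4eaf3350a984
EOF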


Second, use netcat or ncat to send the data to your listening port. The example shown below is sending to an unencrypted listener (be sure to replace logstash_host with the IP or hostname of your logstash server):

ncat <logstash_host> 6514 < testevent.txt

Then just monitor the logstash log and real-time search in Kibana to see the event and/or potential errors.

2) Encrypted replay of an exact packet (specified time included; no header added)

For this we will use the same testevent.txt file from above and nearly the same command, but we will add --ssl to force ncat to perform the encrypted handshake.

ncat --ssl <logstash_host> 6514 < testevent.txt


3) Unencrypted replay of exact packet contents with an updated time

If you have packet contents but want the header updated with the current time, you might be able to use the logger command on Linux.  The trick here is getting logger to reproduce your expected header. Use the following steps to attempt this:

Understand the logger options:

logger --help


Read in our test event and output to stderr for troubleshooting:

logger -s -f testevent.txt


Use logger options to alter the header (in our case, it was --rfc5424=notq) to match what we need, and then create a new file with only the content and no header (e.g., testevent_noheader.txt).
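If you would rather not create that file by hand, a one-liner like the following works for our sample event (it assumes the header ends at the "[Location] " token, so adjust the pattern to match your own header):

sed 's/^.*\[Location\] //' testevent.txt > testevent_noheader.txt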

Figure 2:  Reproducing the event with an updated header

Send the event to the unencrypted listener and check for it in Kibana and Logstash logs:

logger --rfc5424=notq -s -f testevent_noheader.txt --server <logstash_host> --tcp --port 6515


4) Encrypted replay of exact packet contents with an updated time

Unfortunately, our version of logger does not have an option to enable encryption. So, if you were able to get logger to reproduce the header + content in the step above but need to send it to an encrypted listener, you can once again use ncat to assist. The following command creates an unencrypted listener on your local host on port 6515 -- anything written to that local port will then be sent on in an encrypted state to port 6514.

Figure 3:  ncat listener to send data onto Logstash using SSL


Step 1) Create the listening wrapper:

ncat -l 6515 -c "ncat --ssl <logstash_host> 6514"


Step 2)  Send the packet to the wrapper using logger:

logger --rfc5424=notq -s -f testevent_noheader.txt --server localhost --tcp --port 6515
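With the wrapper from Step 1 still listening, you can go one step further and script your replays. The minimal sketch below swaps the device name in our sample event for a random one and sends ten such events through the local wrapper (it assumes the testevent.txt file from method 1):

for i in $(seq 1 10); do
  # Randomize the device name, then send the event through the local SSL wrapper
  sed "s/DESKTOPTESTLOGGER2/DESKTOP$RANDOM/" testevent.txt | ncat localhost 6515
done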


Conclusion

We are just scratching the surface of ways to test data ingest components such as Logstash. For instance, this could be expanded to include scripting with variables that are replaced with random data to generate more robust values, as hinted at in the small loop above. Taking that further will be an exercise left to the user (or maybe a future article). We do hope this article proved useful and would love to know what you use for testing data ingest. Feel free to leave comments in the section below.  Thanks for reading.