Thursday, October 15, 2020

VirusTotal Lookups in Microsoft Power BI - Processing JSON

 By Tony Lee

So we are on this Power BI binge... and figured a VirusTotal lookup report and dashboard would be useful as a stand alone tool as well as a drill through fed by other reports. Presented below are the results of our efforts, steps we took, and limitations we encountered (as newbz). In typical form, we released the work for free so everyone can use, contribute, and modify as they see fit.  Enjoy!:

https://github.com/TonyLeeVT/VirusTotal-PowerBI-Lookup/

Figure 1:  Resulting VT Lookup report in Power BI

Features

Let's start with what this report provides -- from top left to bottom right:

  • Slicer (filter) to filter on engines that hit or missed the sample
  • Permalink to the results
  • Gauge to indicate number of hits vs engines
  • Verbose message from VT
  • What Resource we actually queried
  • MD5, SHA1, and SHA256 hash values
  • Table of engines, conviction results, malware classification, engine update, and engine version
  • Treemap of conviction results - useful for determining the highest malware family consensus

Process

At a high-level, here is what we did to create the dashboard:

  1. Created a parameter that holds the Resource we want to look up
  2. Used "Get Data" to query the VT API to pull in the JSON response
  3. Used Power Query Editor to manipulate and extract the nested JSON as needed
  4. Built a cool dashboard in Power BI Desktop
Since there are some intricate details in processing JSON data, we figured we would break down the steps in a bit more detail.

Creating the Parameter


We created two parameters to prompt users for data.  The first is the VT API key and the second is the hash they want to query.  Parameters are created in Power Query Editor by clicking on the down arrow for Manager Parameters > New Parameter

Name the field, Add a description, select Text for the Type, and enter a hash you know VT has for the Current Value.  In our example, we used the hash:  0a52fdc2219346d25295c8d6838122bcdd7ddf119386cd1030458b035cd97999


Figure 2:  Creating and managing the parameter that stores the query value

Note:  We also created a parameter for the API key so the user is prompted for the key when opening the Power BI template for the first time. When this is saved as a .pbix, the parameter value is also saved making it easy to use by the team.

Creating the VirusTotal Web API call for Get Data

To create the Web API call perform the following:
  • Get Data > Web
  • Advanced
  • URL parts:
    • https://www.virustotal.com/vtapi/v2/file/report?
    • apikey=
    • Select Parameter from the dropdown and use API_Key
      • Note:  The free VT API key can be used
    • &resource=
    • Select Parameter from dropdown and use "Resource"

Figure 3:  Creating API call to VirusTotal to get the JSON data

Note:  The data returned will be JSON.  Now we need to make it columnar.


Use Power Query Editor to Manipulate the JSON

When the JSON formatted data is returned from the VirusTotal API query, follow these steps to make it columnar:

Convert > Into Table

Figure 4:  Convert JSON to a table


Transform > Transpose > Use First Row as Headers

Figure 5:  Transposed the table and then used the first row as the header



Take note!  The field called scans has a value of: "Record".  When clicking on this object, it expands below revealing the results of each scan vendor. Notice that those values are also "Record" which means we have two nested JSON objects, one inside the other.  We will need to make both columnar data to use it in Power BI.

Figure 6:  The scans field is a JSON object that contains more JSON objects

Expand the scans column

Here is a trick that we learned from a Guy in a Cube video -- thanks for the videos fellas! There is a little double arrow box in the far right portion of the header for scans. If you click this, it will allow you to expand (convert) the JSON to columnar data once again.  Be sure to uncheck the box at the bottom to avoid prepending a prefix that we would strip away later.

Figure 7:  First expansion of JSON object called "scans"

Highlight the antivirus vendor columns > Unpivot Columns

Figure 8:  Unpivoting the antivirus vendor columns to make each entry its own row


Rename Attribute field to Engine

Expand the Value field that contains more nested JSON

Figure 9:  Renaming column header and expanding second JSON object


Rename new columns to remove "Value." in front of detected, version, result, and update.

Building the dashboard

The dashboard consists of a filter, some tables, a gauge, and a treemap. Please see our other Power BI articles on building visualizations. The toughest portion of this task was taking JSON input and making it columnar so it can be processed by Power BI. As a bonus, we are also templating this project and making it available to everyone.

Template

As we just mentioned, we saved this work as a template and will distribute it for free. When opening the template, simply insert your VT API Key (which can be the free API key) and the hash that you want to query (called a resource).

Figure 10:  Templated VirusTotal lookup report in Power BI


Limitations

Currently we believe we have the following limitations, but these may be easy to overcome depending on how many ideas and solutions we can pull from our readers:
  1. Unable to modify hash parameter from report / dashboard to refresh data feed with new data natively from Power BI
  2. Drill through lookup using hash from previous report may be limited due to parameter update issue
  3. Potential for AV column rename to fail due to hard coded AV vendor names in power query
  4. Error handling could be improved for instances where VirusTotal does not have the data
    • "response_code": 0 vs. "response_code": 1

Conclusion

We hope you enjoyed this article and have some ideas for us to correct some of the limitations. Feel free to contribute ideas and help us make this report better by using the comments feature below or submitting pull requests via Github. Or just give us a shout out if you are using this solution.  Enjoy!

Thursday, October 8, 2020

Fun with Microsoft Power BI - Part IV - Improving Workflow

  By Tony Lee

Welcome to part IV in this series of going from zero to hero using Power BI to ingest, process, and make amazing reports.  If you have read some of our other articles you can probably tell by now that we enjoy making data actionable. Honestly, it doesn't matter what type of data or even where the data ends up. As long as we can make informed decisions using the data -- we love it. Following in this theme we are going to make BlackBerry (formerly known as Cylance) Protect Threat Data Report (TDR) CSVs actionable using Power BI and Power BI Desktop. You can use any data source to follow along in this series, but our example BlackBerry Protect report is shown below which we are happily sharing the Power BI template (.pbit) at the github link below so you can load and analyze your own data!

https://github.com/TonyLeeVT/BlackBerryProtectTDR-PowerBIViewer


Figure 1:  Our Policy Explorer Power BI report using BlackBerry Protect TDR data

In the first article, we covered:

  • Getting Started
  • Data Ingest
  • Adjusting Fields
  • Visualizations
  • Saving Your Work
In the second article, we covered:
  • Tabs
  • More Visualizations
    • Text box
    • Slicer
    • Table
    • Pie Charts
    • Treemap
  • Using Reports and Dashboards
  • Uploading Reports to Power BI Service (Online)
In the third article, we covered:
  • Question and Answer (Q&A) Feature
  • Power BI for Mobile
  • Changing the Data Source
  • Scheduling Data Refresh
In this article, we will cover:
  • Matrix and Decomposition Visualizations
  • Inserting Images with Actions
  • Tooltips
  • Parameters
  • Power BI Templates

Matrix and Decomposition Visualizations

We just wanted to give a shout out to two very cool visualizations that help with analyst workflow. The first being the matrix visualization which allows you to hierarchically dig into data by expanding rows or columns.  To create the Matrix, we used Policy Name, Section, and Key as the Row value + Last Value as the Values. 

Figure 2:  Interactive matrix of policy configurations


The second visualization is Decomposition which allows you to visually analyze large amounts of data by viewing and summing common values.  To create the Decomposition, we use the distinct count of Policy Name for the Analyze field and Explaining by the Section, Key, and Value.

Figure 3:  Interactive decomposition visualization of policy configurations

Both of these visualizations ended up being perfect for our Policy Explorer Report shown as the first image in this article. Note: This test environment has more policies than what is typically found in large and complex environments, but it further proves that this solution scales.

Inserting Images with Actions

This feature may seem like a no-brainer, but it can really spice up a report. You can add images/logos and assign actions to them by going to Insert > Image > selecting the image. With the image selected, enable action and select type. In the example shown below, we chose a type of Web URL and added a link back to the BlackBerry website.

Figure 4:  Inserting the BlackBerry logo and creating a hyperlink back to the product page


Tooltips

The best way to describe a tooltip is the ability to add additional context to data via a simple action of hovering over an item in a report. In our screenshot below, we used an example of getting host information context by hovering over an event. This prevents us from having to dig into a completely separate table of devices data to discover information about the host in that event. The information displayed in the popup is quite powerful since it includes:  Operating system, Agent version, MAC address, Policy name, IP address, and more.

To enable this, we performed the following high-level steps:
  • Created a new page via the tabs at the bottom
  • Hid that page from view
  • Changed the page size and type to Tooltip
  • Created the multi-row card with our host data
  • Set the Tooltip field to be Serial Number (which is common to Devices and Events)

Figure 5: Powerful tooltip created by using data from related data feed

Parameters

In the process of getting ready to convert our work to a Power BI template to share with the community, we converted our BlackBerry Protect Threat Data Report data sources to use a parameter for the TDR token. This allows the user to change a single field (ex: TDR Token) and pull data from a different tenant. As a bonus, when a user opens our Power BI template, Power BI prompts the user to enter their TDR token to pull down data from the BlackBerry TDR API and populate the reports. Then when the report is scheduled for updates (as covered in part III of our series), all of the required information is already included to access the data and update the reports.

Figure 6:  Token parameter used to make these data feeds easily configurable for each customer's token

Power BI Templates

This is the topic most have probably been waiting for because it means that we are going to share the BlackBerry Protect report we have been building to explore Power BI.

Creating a template (.pbit) is super easy in Power BI Desktop.  Simply perform the following:

File > Export > Power BI template

Now, when you download our Power BI Template, you simply enter the TDR token from your BlackBerry protect console (Settings > Application > Threat Data Report) and it will start downloading the information.

Figure 7:  Entering the TDR parameter after opening the Power BI Template

Conclusion

This brings us to the end of our series on Fun with Microsoft Power BI (using our BlackBerry Protect example). We hope you enjoyed the topic and learned something new. Please feel free to leave any pro-tips or questions in the comment section below. As a bonus, if you are a BlackBerry Protect customer, please feel free to download our template, enter your TDR token and check out the visualizations. Let us know what you think--thanks for reading!


Monday, October 5, 2020

Fun with Microsoft Power BI - Part III - Advanced Ingest and Viewing

 By Tony Lee

Welcome to part III in this series of going from zero to hero using Power BI to ingest, process, and make amazing reports. If you have read some of our other articles you can probably tell by now that we enjoy making data actionable. Honestly, it doesn't matter what type of data or even where the data ends up. As long as we can make informed decisions using the data -- we love it. Following in this theme we are going to make BlackBerry (formerly known as Cylance) Protect Threat Data Report (TDR) CSVs actionable using Power BI and Power BI Desktop. You can use any data source to follow along in this series, but our example BlackBerry Protect report is shown below which we will happily share the Power BI file at the end of the series for you to load and analyze your own data, so stay tuned for that!

Figure 1:  Our Power BI report using BlackBerry Protect TDR data

In the first article, we covered:

  • Getting Started
  • Data Ingest
  • Adjusting Fields
  • Visualizations
  • Saving Your Work
In the second article, we covered:
  • Tabs
  • More Visualizations
    • Text box
    • Slicer
    • Table
    • Pie Charts
    • Treemap
  • Using Reports and Dashboards
  • Uploading Reports to Power BI Service (Online)
In this article, we will cover:
  • Question and Answer (Q&A) Feature
  • Power BI for Mobile
  • Changing the Data Source
  • Scheduling Data Refresh

Question and Answer (Q&A) Feature

Ever wish you could use natural language to ask questions and get answers about your data? In Power BI, you can do just that and even save the resulting visualizations into your report (if you have the necessary permissions). It may take a bit of trial and error, but the results can be both interesting and useful. In Power BI Desktop, this is activated by going to Insert menu > Q&A. In the Power BI Service online interface, click on the report you want and then the "Ask a question" button.

Figure 2: Question and Answer feature of Power BI

Power BI is nice enough to suggest a few questions to ask, however we will list some of our questions below as well.

Example searches include:
  • most recent threats data report
  • count of threats data reports by create time
  • show classification by sha 256, file name as matrix
Figure 3:  Example Q&A



Power BI for Mobile

One of the best features of Power BI is the platform flexibility. Once you build a report and have it available in your workspace, you can view it on nearly any mobile device: Apple, Android, and Microsoft.


We have an Android phone that we used to view the report and try out the filters. The app is available in the play store: https://play.google.com/store/apps/details?id=com.microsoft.powerbim&hl=en_US

 
Figure 4:  Screenshot of our report displayed on a Galaxy Note 8

Changing the Data Source

In our first article, we wanted to keep things simple by uploading a static CSV file. This is great for building the initial report, but it becomes cumbersome to update the data. Fortunately for us, this CSV data came from a simple to use web API which Power BI can easily handle. In fact, Power BI supports a ton of data ingest options such as files, databases, cloud services (over 60 of them!), web APIs, and more. Now let's change our data source from static CSV file to web API, but first let's check the timestamp on the imported file. This can be achieved by hovering over the ThreatsDataReport field.

Figure 5:  Determining staleness and freshness of data


There is an option to refresh the data, but because the original data source is a static CSV, we would need to manually download it again. To change the data source, we need to Edit the query.

Figure 6:  Edit the query to change the source from static file to web query

This opens Power Query Editor. Next we click, Data source settings > Change Source... > Advanced > Then we use the file path parts to first enter the URL and then enter the token on the second line. This file will open the exact same as the static file.

Figure 7:  Changing the data source to using web API

Now that the source is changed to use the web API, we can refresh the data with a click of the button in Power BI using "Refresh data". After checking the timestamp on the new data, the update was successful!

Figure 8:  Refreshing data via the web API


Scheduling Data Refresh

Taking reporting one step beyond a manual refresh, we can also schedule a refresh of the data. In Power BI Service, expand the workspaces to see the Datasets. Click on the three dots next to the Web API data feed and select "Schedule refresh".

Figure 9:  Scheduling a refresh

Under the Schedule refresh menu, turn it on, enter the frequency and even the time to refresh. Another nice touch in this feature is email notification of failure.

Figure 10:  Settings for scheduling a daily data refresh


Conclusion

In this article, we looked at some interesting and more advanced features in Power BI to include the natural language Q&A feature, viewing reports on the Power BI mobile app, converting our static CSV file to update via web API, and finally scheduling the web API dataset to automatically update every day. Power BI is full of features and hopefully you learned something new today. Please feel free to leave feedback and your favorite Power BI features in the comments section below.


Thursday, October 1, 2020

Fun with Microsoft Power BI - Part II - Reporting

By Tony Lee

Welcome to part II in this series of going from zero to hero using Power BI to ingest, process, and make amazing reports.  If you have read some of our other articles you can probably tell by now that we enjoy making data actionable. Honestly, it doesn't matter what type of data or even where the data ends up. As long as we can make informed decisions using the data -- we love it. Following in this theme we are going to make BlackBerry (formerly known as Cylance) Protect Threat Data Report (TDR) CSVs actionable using Power BI and Power BI Desktop. You can use any data source to follow along in this series, but our example BlackBerry Protect report is shown below which we will happily share the Power BI file at the end of the series for you to load and analyze your own data, so stay tuned for that!

Figure 1:  Our Power BI report using BlackBerry Protect TDR data

In the first article, we covered:

  • Getting Started
  • Data Ingest
  • Adjusting Fields
  • Visualizations
  • Saving Your Work
In this article, we will cover:
  • Tabs
  • More Visualizations
    • Text box
    • Slicer
    • Table
    • Pie Charts
    • Treemap
  • Using Reports and Dashboards
  • Uploading Reports to Power BI Service (Online)

Tabs

Just a quick note for folks that are familiar with excel tabs. In Power BI, they work much the same. Since the canvas size is somewhat fixed in size, tabs might come in handy for building multiple reports and overcoming canvas limitations.

Figure 2: Shows boundaries and tabs


More Visualizations

In the first article, we showed you how to add an event count, which essentially counted the number of rows in our BlackBerry Protect Threat Data Report. We started off with an easy visualization and will progress to more difficult ones.

Text box

We are fans of being able to quickly glance at a dashboard and know what it is...  thus a simple text box at the top is a great way to accomplish this task.  Insert > Text Box.  Then get fancy fancy with the background color and font. We will get even fancier in a later article using images with hyperlinks, but let's crawl, walk, then fly.


Slicer

We are also fans of being able to interact with the data by carving and drilling down into interesting events. For this, we use a slicer (Power BI term), also known as a filter to the rest of the world. ;-) In our BlackBerry Protect example, we are filtering on multiple fields using a checkbox in most cases. To do this, simply click in the canvas, click on the slicer icon, drag the field you want to filter on, and bob's your uncle. 

Figure 3:  Check box filters (aka slicers)

You may have also noticed that we have a time range selector in our report. This is also a slicer, but we dragged a time field into the field box and Power BI is smart enough to convert it from a check box to a time range selector. For our example, we used the "Create Time" field for the malware. Note that there are additional options that specify how the time range selector behaves. Those are configurable by using the down arrow. Our preference for this scenario is "Relative Date" which allows you to specify a relative time range such as:  Last, Next, or This: X days, weeks, months, and years.

Figure 4:  Configuring how the time range selector works


Table

Tables are fairly straight forward to use as well. To use, simply click on the canvas > click the table icon > Drag the fields you want in the table.  The down arrow allows you to correct the order of appearance if necessary by moving the field up and down the list or you can drag the box up or down in the list.


Doughnut and Pie Charts

We saved doughnut and pie charts to later in this article because it is not as straight forward to create it depending on your data set. We used a doughnut chart in our example for "File Status" and "Detected By". The problem with these fields is that the data source does not provide the statistical breakdown (percentage per file status or what detected the malware). That must be calculated and Power BI may or may not do that automagically. For this, it helps to know a little bit of DAX to create measures. Per Microsoft:

"Data Analysis Expressions (DAX) is a library of functions and operators that can be combined to build formulas and expressions in Power BI, Analysis Services, and Power Pivot in Excel data models."

"Measures are dynamic calculation formulas where the results change depending on context. Measures are used in reporting that support combining and filtering model data by using multiple attributes such as a Power BI report or Excel PivotTable or PivotChart. Measures are created by using the DAX formula bar in the model designer."


These formulas are heavily dependent upon functions and there are a lot of them. A good reference is found here: https://docs.microsoft.com/en-us/dax/dax-function-reference

As an example... say you had a column called "Sales Amount" and you wanted to sum that and store it in a measure called "Total Sales".  The DAX for that would look like the following:

Total Sales = SUM([Sales Amount])

Now that we have a bit more understanding, we will create the "File Status" doughnut chart. To do this, we went to the Modeling menu > New measure > and enter:

File Status Stats = DIVIDE(count('ThreatsDataReport'[File Status]),CALCULATE(count('ThreatsDataReport'[File Status]),ALLSELECTED(('ThreatsDataReport'))))

Note:
  • ThreatsDataReport is our table, the entries in the brackets [] are the columns
  • ALLSELECTED is so our slicers (filters) still work when selected
Once we click the checkmark, we should have a new field called "File Status Stats".  We will click the canvas, click the pie or doughnut chart icon, and drag the "File Status Stats" field to the Values box in visualization and we will drag "File Status" to the Legend box.

Figure 5:  Creating a measure and a doughnut chart in Power BI

Doughnut and Pie Charts - Cont.

Note: After some trial and error, we discovered a potentially easier way to create doughnuts and pie charts showing top values. Select either pie or doughnut chart and drag the field into the Legend box and drag the same field into the Values box and make sure Count is selected for Values. This creates the same pie and doughnut chart as our DAX fanciness above.

Pro-tip:  You can now change this visualization to a table to show an easy to read statistical breakdown (Top or Rare).

Figure 6:  Create a pie or doughnut chart while avoiding DAX

Treemap

Fortunately, creating a treemap is similar to creating a doughnut or pie chart. Once you have the DAX (or logic) created, it is all downhill from there. In our example, the malware classification data made an awesome treemap visualization. For this treemap, you can either drag the same field twice (with Value as a Count) as we showed in the example above, or you can use DAX for the calculation.

Our DAX to calculate the classification stats measure:

Classification Stats = DIVIDE(count('ThreatsDataReport'[Classification]),CALCULATE(count('ThreatsDataReport'[Classification]),ALLSELECTED(('ThreatsDataReport'))))

To create the treemap, click the canvas, then the treemap icon, drag the "Classification" field to the Group box, and the "Classification Stats" field to the Values box.

Figure 7:  Creating a treemap using the malware classifications

Note: Most visualizations apply a filter when clicked. Not all of them are very intuitive in how to clear the filter. For the treemap, either click the box you already clicked or click the top left box to clear the filter.

Using Reports and Dashboards

Now that you have a report built, you can use it as-is in Power BI Desktop. The slicers function to carve up the data as needed, however there is no presentation view or mode. Thus your presentation to others will not look as polished as it could. To get around this problem, we can take the .pbix file and upload it (or publish it) to a personal workspace in Power BI Service where you can use it as individual reports or pin visualizations from multiple reports to make a dashboard.

Uploading Reports to Power BI Service (Online)

Upfront warning:  Be careful publishing the work since the .pbix contains your data set as well and could potentially leak information.

Keep in mind that there are multiple methods to load a local Power BI Desktop file into the online service. You can use the Publish button in Power BI Desktop or you can log in to your Power BI account:  https://powerbi.microsoft.com/en-us/landing/signin/
Click "My Workspace" > "Get Data" > "Local File" > navigate to your .pbix file

Figure 8:  Uploaded .pbix to Power BI Service workspace


This should now be in your personal workspace and you can view the report via your web browser. To enter a full screen presentation mode, click the double arrow in the top right corner of the window.

Figure 9:  Entering full screen presentation mode


Conclusion

In part II of our series we covered how to use tabs and many more visualizations--including some that are not so straight forward. We only scratched the surface of DAX since that could be a blog series unto itself. But we also discovered the presentation limitation in Power BI Desktop and how to over come that presentation by using the Power BI Service. We hope that you are now able to take data, ingest it, create a report, and present that report in a professional setting. Our future articles will focus on improving the usefulness and functionality of carving and displaying data using Power BI. Please feel free to leave feedback and your favorite Power BI features in the comments section below. Stay tuned for more great Power BI info!