Page tree
Skip to end of metadata
Go to start of metadata

Welcome to the tutorial on how to combine Power BI Desktop with Health Discovery to turn text mining results into meaningful dashboards.


  1. The tutorial assumes that you already have Health Discovery and Power BI Desktop installed.
  2. Furthermore - in this tutorial, we will connect Power BI Desktop to the Search API of Health Discovery. For that, you will need a project in Health Discovery in which you have set up a search. We will work with a project 'i2b2' in which a search is set up with data from the I2B2 corpus.
  3. To use the Health Discovery Search API, you need a valid API token. If you do not have one, contact your Health Discovery Administrator.

Connect to data

With Power BI Desktop installed, you're ready to connect to Health Discovery. To see the many types of data sources available, select Get Data > More in the Power BI Desktop Home tab, and in the Get Data window, scroll through the list of All data sources. In this tutorial, we connect to the Search API of Health Discovery by using the Web data source.

In the From Web dialog box, switch to Advanced and paste the address of your Search API (in this tutorial,*%3A*&rows=2147483647) into the URL field. Under HTTP request header parameters, enter the parameter "api-token" and the API token you want to use. Finally, select OK

The query q=*:* search for all documents in the search index. The parameter rows=2147483647 (=the highest possible integer number) makes sure that all documents of the index are returned.

If the search index is big, you could transferring a couple of gigabytes in that query. 

Now the Power Query Editor should open:

Click on the "Record" link to the right of payload and repeat the same for solrResponse and response. The following table should appear.

Click on List to the right of docs to obtain a list of Records.

Shape Data

Now that you're connected to a data source, you can shape the data to meet your needs. Shaping doesn't affect the original data source, only this particular view of the data.

As a first step, click on the To Table button in the upper left.

Then, click on the little Expand icon in Column1:

You can select as many columns as you want. In this tutorial, we only select the columns document_name and preferredTerm_ICD10CM_2019 to select the document name and the diagnoses found in this document:

The table should expand and now look like this:

By double-clicking on the column names, you can rename them into Document Name and Diagnoses. Then click on the little Expand icon in Column Diagnoses followed by clicking on Expand to New Rows:

This makes the diagnoses readable and creates one line per diagnosis (1), i.e. multiple lines per document (2):

Now let's do the same thing for drugs: First, rename the Query (on the left side) to Diagnoses, then Duplicate this query and rename the duplicated query to Drugs.

On the duplicated query Drugs, remove the bottom three Applied Steps that you see on the right side of the page (from bottom to top):

Expand Column1 and select document_name and preferredTerm_RxNorm_20200809:

Then, rename the two columns to Document Name and Drugs, and expand column Drugs by clicking on the little Expand icon next to Drugs and select Expand New Rows. The result looks like this:

For now, we finish Shape Data activities and click on Close & Apply in the upper left:

Model Data

Now the Power Query Editor should close and the Power BI Desktop window should be visible again. Here, click on the Model icon on the left.

Via drag and drop you can connect the Document Name of Diagnoses with the Document Name of Drugs. The following popup window should appear. Close this window with Ok.

Build reports

By clicking on the Report icon on the left, switch to the Report view where you can build visualizations and reports. The Report view has six main areas:

  1. The ribbon at the top, which displays common tasks associated with reports and visualizations.
  2. The canvas area in the middle, where visualizations are created and arranged.
  3. The pages tab area at the bottom, which lets you select or add report pages.
  4. The Filters pane, where you can filter data visualizations.
  5. The Visualizations pane, where you can add, change, or customize visualizations, and apply drill-through.
  6. The Fields pane, which shows the available fields in your queries. You can drag these fields onto the canvas, the Filters pane, or the Visualizations pane to create or modify visualizations.

You can expand and collapse the Filters, Visualizations, and Fields panes by selecting the arrows at the tops of the panes. Collapsing the panes provides more space on the canvas to build cool visualizations.

To create a simple visualization, just select any field in the fields list, or drag the field from the Fields list onto the canvas. Let's drag the Diagnoses field from Diagnoses table and the Drugs field from the Drugs table onto the canvas, and see what happens.

Now, select the Drug table in the Canvas and convert it to a Clustered Column Chart under Visualizations (1). Drag the Drugs field in the Drugs table into the Values field in the Visualizations pane (2). Enlarge the Drug chart in the Canvas and position it to the lower right corner (3).

Repeat these 3 steps for Diagnoses. If the diagnoses are sorted alphabetically (like in the screenshot below) and you want to sort them by frequency, use the menu at the top right of the visualization and click Sort By, Count of Diagnoses, and Sort descending.

We now want to calculate the average number of diagnoses per document. For that, we click on New measure in the ribbon at the top.

We then enter the following formula into the formula editor (1) and drag the measure from the Fields pane into the Canvas (2).

Avg Number of Diagnoes per Doc = COUNT(Diagnoses[Diagnoses])/DISTINCTCOUNT(Diagnoses[Document Name])

Measures in Power BI are used for defining powerful calculations. This is done using DAX Language (Data Analysis Expression). DAX syntax goes beyond the scope of this tutorial. For more information , see the full Power BI documentation

Change the size and position of this measure in the Canvas (1) and set the visualization to a Card.

Finally, add a logo by clicking Insert (1) and Image (2) and position the logo in the upper right of the Canvas (3).

As we linked the field document name of both tables Drugs and Diagnoses with each other in a previous step, selecting diagnosis values in the left chart (Multiselect is possible by pressing SHIFT + Mouse clicks) (4) also drills down the drug chart on the right (5).

Congrats! You're now ready to create powerful charts using Health Discovery and Power BI Desktop.

  • No labels