Show Text Attributes in Tooltips

PowerBI allows users to add measures to tooltips. Many users want to add text columns as well. Today if you drag a text column to the tooltips, a count measure is created which is not very useful. Before PowerBI enhances the tooltip feature by allowing users to drag text columns to tooltips to show the text contents, you can make it happen today by creating DAX measures. Let’s say you want to add ‘Table'[TextColumn] to tooltips. Create a measure as below.

[TextColumn Tooltip] = TOPN(1, VALUES(‘Table'[TextColumn]), [TextColumn], ASC)

Now you can drag the newly created measure to tooltips. I have created a sample model with such a tooltip based on AdventureWorksDW database. As you can see in the picture below, when users hover over an individual product, the tooltips include the [Description] column for that product.

productdescriptiontooltip

 

Show Text Attributes in Tooltips

Connect any number of tables together via a common column

One of the key powers of Power BI is the ability to bring data from different sources into a single model and then join them together by creating relationships among tables to perform analysis and create reports across all the data in the unified model. But a lot of people are confused when they run into one of the errors below while trying to create a relationship between two tables inside Power BI Desktop:

  • We cannot create a relationship between ‘Table A'[Column X] and ‘Table B'[Column Y]. This could be because there is missing intermediate data to connect the two columns.
  • You can’t create a relationship between these two columns because one of the columns must have unique values.
  • You can’t create a relationship between these two columns because the ‘Column X’ column in the ‘Table A’ table contains null values.

Those who are familiar with relational database theories can appreciate the sound principles underlying those error messages. But many users of Power BI may be at a loss of what to do next. While Power BI Desktop can make creating relationships easier in the future by helping users resolve those errors automatically, there are things you can do today to connect the two tables together. The same methodology can be generalized to connect any number of tables together through a common column.

Assume a user has imported five different tables containing various information and transactions about products, all of which identify each product by a [Product] column, but the user is unable to create relationships among the tables as he or she is blocked by the errors listed above. The user wants to connect all five tables together via the [Product] column as illustrated by the conceptual schema in Figure 1.

 

Figure1
Figure 1.

To do so we need to use DAX to create a calculated table in the middle so that the five original tables can each create a relationship between itself and the calculated table. I have created a mock model in this pbix file to demonstrate the technique.

Create a calculated table by clicking the New Table button as shown in Figure 2.

ClickNewTable
Figure 2.

The DAX expression is

All Products =
FILTER(
DISTINCT(
UNION(
DISTINCT(Table1[Product]),
DISTINCT(Table2[Product]),
DISTINCT(Table3[Product]),
DISTINCT(Table4[Product]),
DISTINCT(Table5[Product])
)
),
NOT(ISBLANK([Product]))
)

This DAX formula takes the unique values from the [Product] column of each of the five tables, unions them together, removes duplicate values, and finally removes the BLANK value if there is one. You can use the same formula in any model by changing the number of tables and the names of tables and columns. Now you can create five relationships between each of the five original tables and the newly created calculated table. Figure 3 shows the final result in the Relationships view. You can see that every relationship allows filters to flow both ways as indicated by  bidi.

FiveTablesConnected
Figure 3.

Now that all fields in the model are connected, you can freely add any column from any table to your report, slice and dice to your heart’s content, things will just work as if by magic. Moreover, the calculated table in the middle will automatically recalculate each time you refresh data in the original tables.

 

Connect any number of tables together via a common column

Simple Filter in DAX Measures

Today’s post is about different ways to set a simple filter in DAX. Although there are plenty of writings covering these patterns, I still see DAX authors get confused on a regular basis. In this post, I am going to clarify the differences among the patterns. The examples are based on a pbix file created from the Adventure Works database that you can download from here. It helps if you have learned about filter context before but you should still follow the general idea outlined here without those knowledge.

Setting a basic filter is one of the first things people do when writing a DAX measure. Let’s say you want to write a measure to calculate the total sales on red products. The two common DAX patterns I saw people use are:

Pattern 1. Calculate(Sum([Sales Amount]), ‘Product'[Color] = “Red”)

Pattern 2. Calculate(Sum([Sales Amount]), Filter(‘Product’, [Color] = “Red”))

Both approaches work as expected in simple models and reports, but from time to time people run into troubles on such simple DAX measures and have difficulty figuring out the underlying reason.

Although Patterns 1 and 2 look very similar, they actually have different semantics in DAX. The differences become obvious when you add the [Color] column and both measures to a flat table chart as shown in Figure 1. While Pattern 1 returns sales of red products in all rows, Pattern 2 returns blank values in all rows except for the red color row and the grand total row.SimpleFilter_Figure1

Under the hood, Pattern 1 is simply a syntax shortcut for the following DAX expression:

Pattern 1.1. Calculate(Sum([Sales Amount]), Filter(All(‘Product'[Color]), [Color] = “Red”))

The key difference is the implied All function in Pattern 1. Because of the All function, Pattern 1 does nothing more than replacing any prior filter on the [Color] column by the new filter [Color] = “Red”, on the other hand, the lack of All function in Pattern 2 means the ‘Product’ table is first filtered by any existing filters from the filter context and then the remaining rows are further filtered by the predicate [Color] = “Red”. This explains why you see those blank values in Figure 1: Since the ‘Product’ table is first filtered down to the rows matching the color of the current row in the flat table and then further filtered by the predicate [Color] = “Red”, the net result is empty if a row has a color other than red; the grand total row has no filter coming from the flat table therefore only the predicate takes effect.

In addition to the unwanted results in Figure 1, Pattern 2 also suffers from an unintended performance hit even when it delivers correct results. For example, if a report has a page filter on ‘Date'[Calendar Year] = 2008, the ‘Product’ table will be filtered down to all products sold in 2008 before the predicate [Color] = “Red” is applied, that means DAX engine needs to scan the ‘Internet Sales’ fact table just to calculate the Filter function itself. Few people would want to scan a fact table only to set a filter on a dimension table.

The All function allows a column name or a table name as its argument. So the following pattern also achieves the same effect as Pattern 1 does.

Pattern 3. Calculate(Sum([Sales Amount]), Filter(All(‘Product’), [Color] = “Red”))

The difference between Pattern 3 and Pattern 1 is that Pattern 3 replaces filters on all columns from the ‘Product’ table in the filter context but Pattern 1 only replaces prior filter on the [Color] column. As a result, if you have several columns from the ‘Product’ table in the report, all of them will be removed from the filter context, the only remaining filter on the ‘Product’ table is [Color] = “Red”, as shown in Figure 2. As you can see in the picture, the values of measure [m – Sum Filter Pattern 3] is not affected by the filter [Class] = “L” from the flat table.

SimpleFilter_Figure2

Today we have discussed the differences among three ways of setting a simple filter in DAX. Although all three patterns may work in most cases and any one of them may be the right solution for a particular report, if you are not sure which one to use, I would start with Pattern 1, or its equivalent Pattern 1.1, due to its minimal side effect.

 

Simple Filter in DAX Measures

Handle Complex Relationships inside Power BI Desktop

Introduction

Modern self-service BI products like Power Pivot in Excel and Power BI Desktop are often used to mash up data from different sources. Once multiple tables are imported into the model, users create relationships among them so that they can easily slice and dice data using any column from related tables. For those users of Power BI Desktop who only create relationships in the Manage Relationships dialog box, Figure 1, or connect columns in the Diagram View, Figure 2, they often run into constraints which block them from creating the relationships they need. For example, users cannot create a relationship by joining more than one column from each table. In this blog post, I am going to show you a way to work around such constraints inside Power BI Desktop.

Figure 1.
Figure 1.
Figure 2.
Figure 2.

In Power BI Desktop, users can only create a relationship between a pair of tables by joining two columns, one from each table. Furthermore, at least one of the columns cannot contain any duplicate values or blank values; otherwise, you get errors seen in Figure 3 or Figure 4.

Figure 3.
Figure 3.
Figure 4.
Figure 4.

Users who are familiar with relational database technologies can recognize that relationships are modeled after foreign key to primary key constraints. In relational databases a primary key column uniquely identifies each record in the table, must contain unique values, and cannot contain NULL values.

If you have already imported tables into the model but cannot create relationships due to the above requirements, you are typically told to work around the constraints by

  • creating a calculated column to combine multiple join columns into a single one so you can create relationships between two calculated columns;
  • introducing a third table that contains unique and non-blank values of the join columns and then creating two relationships between each of the original tables and the third table so that the original tables are related indirectly.

The introduction of bi-directional cross-filtering and calculated tables in Power BI Desktop has made both techniques easier to implement than in Power Pivot. Today I want to show you another way of working around the constraints: sometimes you can join two tables inside Query Editor and then bring a single table with all the columns you need into the model.

The test data for the two examples I am going to show can be downloaded from here. You can also download the final pbix file after we have completed the two examples from here.

First Example

In our first example, we want to create a relationship between the two [ProjectID] columns as shown in Figure 5. [ProjectID] is unique in Table1 except sometimes it is blank when a program has been created without any project yet.

Figure 5.
Figure 5.

Assume we want to perform some cost analysis on each program. Although we cannot create a relationship between Table1 and Table2 due to the blank value, it’s sufficient if we can bring [ProgramID] into Table2. Here is how to achieve this in Power BI Desktop.

Once you have selected Table1 and Table2 in the Navigator, click the Edit button, as shown in Figure 6.

Figure 6.
Figure 6.

Select Table1 and then click the Properties button in the ribbon as shown in Figure 7.

Figure 7.
Figure 7.

Uncheck “Enable load to report” and then click OK, see Figure 8. We bring in Table1 just so we can merge its [ProgramID] column into Table2, afterwards, it has served its purpose therefore no longer needed when we build reports.

UncheckEnableLoadToReportTable1
Figure 8.

Select Table2 in Query Editor and then click the Merge Queries in the ribbon, see Figure 9.

MergeQueries
Figure 9.

In the Merge dialog box, click [ProjectID] under Table2, select Table1 in the dropdown box, click [ProjectID] under Table1, select “Full Outer (all rows from both)” as Join Kind, and then click OK. We have chosen “Full Outer” just in case we want to build reports including programs without any projects yet.

Figure 10.
Figure 10.

A new column has been added to Table2 where each row shows Table. The previous step only makes columns of Table1 available in Table2, but we still have to actually pick the individual columns we care about. Click the button at the upper right corner as shown in Figure 11.

Figure 11.
Figure 11.

Since we just want to bring [ProgramID] into Table2 with the same name, keep only ProgramID checked, uncheck the other two boxes and then click OK, see Figure 12.

Figure 12.
Figure 12.

Table2 now has the [ProgramID] column, we can close Query Editor, shown in Figure 13, and start building reports.

Figure 13.
Figure 13.

Second Example

As a second example, we examine Table3 and Table4 as shown in Figure 14. We want do some analysis on Table3 but we also need the [Principal Amount] column from Table4.

Figure 14.
Figure 14.

We cannot create a relationship directly since we must join by both the [Entity Code] column and the [Fiscal Month] column. Although we can create a calculated column on each table to combine [Entity Code] and [Fiscal Month], it’s an artificial and strange thing to do. Instead we will leverage Query Editor’s flexible ability to merge two tables. The steps are identical to those in the first example except that we pick two columns from each table in the Merge dialog box. You can select multiple columns from a single table by holding the Ctrl button and then clicking the columns you want. Note that the order of clicking is important as it determines which pairs of columns will be matched during the merge. The UI conveniently labels the columns with numbers in the order you click, as shown in Figure 15.

Figure 15.
Figure 15.

Conclusion

Today I described a way to work around limitations of relationships by merging tables in the Query Editor before importing them into a model. Strictly speaking, this technique is not the same as creating relationships but, if used correctly, often gives us what we need. Users who are familiar with Power Query Formulas know that they can join tables in more flexible and powerful ways than allowed by the UI. By packaging two powerful engines, one from Power Query and one from Power Pivot, into a single product, Power BI Desktop gives users maximum versatility to shape their data any way they want.

Handle Complex Relationships inside Power BI Desktop

Use Calculated Table to Figure out Monthly Subscriber Churn

Power BI Desktop 2015 SU09 introduced a really cool new feature called calculated tables. Those of you who are already familiar with calculated columns will share my excitement since we now have a flexible, intuitive, and visual way of solving common business problems which were very hard, if not impossible, to tackle previously. Today I am going to show you how to leverage calculated tables to display monthly total number of subscribers starting from a table listing each user account’s start and end dates. The sample data is taken from Kathi Kellenberger’s blog post T-SQL Window Functions Speed Phreakery. If you are a SQL junkie, you should read Kathi’s blog to learn a few clever ways of writing ever faster T-SQL queries to calculate monthly total subscriber numbers from the original data. A sample of the source data is shown in Figure 1.

Registration_ID FirstName LastName DateJoined DateLeft
15 Norma Lynch 1/4/2004 8/20/2008
19 Bobbie Brennan 1/5/2004 9/20/2006
22 Terra Cabrera 1/5/2004 4/14/2007
25 Alicia Navarro 1/6/2004 12/3/2007
55 Lauren Everett 1/10/2004 12/5/2005

Figure 1.

Our goal is to find out the total number of subscribers at the end of each month taking into account the total number of subscribers joined and left from the beginning of time to that month. Figure 2 shows the desired result in the first ten months of 2004.

Month PeopleJoined PeopleLeft Subscribers
1/31/2004 167 0 167
2/29/2004 133 0 300
3/31/2004 144 0 444
4/30/2004 155 0 599
5/31/2004 156 0 755
6/30/2004 136 1 890
7/31/2004 138 0 1028
8/31/2004 132 0 1160
9/30/2004 163 0 1323
10/31/2004 135 0 1458

Figure 2.

Power BI Desktop is a very versatile tool, there is more than one way of transforming the original table into the result we want. I could perform all the calculations and transformations in the Query Editor window using operations enabled by UI plus a little M programming language. But today I am going to show you how to do this in the Data Tools windows using DAX programming language. DAX runs on top of the blazingly fast column-based, in-memory Vertipaq engine, you are likely going to get the best possible performance playing with your data this way. Besides, in the current release of Power BI Desktop, DAX formula bar comes with powerful IntelliSense which makes authoring a DAX expression so much easier. For full disclosure, I have been a key contributor to DAX development from the beginning therefore I have the ulterior motive of promoting the language I helped created. Unlike Kathi in her blog, I am not trying to develop the fastest possible solution through clever language trickeries, instead, I am going to show you how to arrive at the final solution in a step-by-step fashion so you can see the intermediate results after each step.

If you want to try out the steps on your own from the very beginning and have a SQL Server readily available, please download the script SubscriptionList.sql from Kathi’s blog page to create sample data table in your SQL Server and then import the data into Power BI Desktop, otherwise, I have created a PBIX file, MonthlySubscriberCount1.pbix, that has already imported the data into a table ‘Registrations’ for your convenience. While I provided in this blog all DAX expressions for you to copy and paste, I encourage you to type them yourselves to learn how easy it is to enter DAX expressions with the help of IntelliSense.

First, let’s see the original data in the Data Tools window by clicking the Data button, see Figure 3.

ClickDataView

Figure 3.

In the table ‘Registrations’, we see a list of individual accounts and the dates they joined and left the subscription in the two columns [DateJoined] and [DateLeft]. The data in the table should be similar to Figure 1. While each account must have a real date for the [DateJoined] column, its [DateLeft] column can be empty if the account is still active. Since we want to summarize data at month level, let’s add two calculated columns to the table to show the months in which each subscriber joined and left. We will use DAX EOMONTH function to calculate the last day of the month corresponding to the [DateJoined] or [DateLeft] columns.

ClickNewColumn

Figure 4.

Click the New Column button as shown in Figure 4, and then enter the following DAX formula to add a calculated column called [MonthJoined].

MonthJoined = EOMONTH(Registrations[DateJoined], 0)

Now repeat the steps above to add another calculated column [MonthLeft].

MonthLeft = EOMONTH(Registrations[DateLeft], 0)

After we added the two columns, we have in the table both the date and the month (last day of the month to be precise) for which a subscriber joined or left, see Figure 5. We can now find out the total number of subscribers joined or left in each month.

AddedMonthJoinedLeft

Figure 5.

Click the New Table button as shown in Figure 6, and then enter the DAX formula shown below to add a calculated table called ‘MonthlyJoined’.

ClickNewTable

Figure 6.

MonthlyJoined =
SUMMARIZECOLUMNS(
	Registrations[MonthJoined],
	"CountJoined", COUNTROWS(Registrations)
)

Now repeat the steps above to add another calculated table ‘MonthlyLeft’.

MonthlyLeft =
SUMMARIZECOLUMNS(
	Registrations[MonthLeft],
	"CountLeft", COUNTROWS(Registrations)
)

We get two new tables in the model as shown in Figures 7 and 8 respectively. The two tables are also listed under Fields on the right side of the window, see Figure 9.

MonthlyJoined

Figure 7.

MonthlyLeft

Figure 8.

TwoNewTablesInFields

Figure 9.

The formulas of the two calculated tables we just created used two DAX functions, SUMMARIZECOLUMNS and COUNTROWS. SUMMARIZECOLUMNS is an extremely versatile and powerful function. It will take a few blog posts just to explore that function alone. But for today’s purpose, all you need to know is that if you simply supply the columns and calculations you care about as arguments to the function, it will return a new table with those columns and calculated values in the output. While the calculations can be arbitrarily complex DAX expressions, for today’s blog post we only need basic aggregations such as summing the values of a column or counting the number of rows in a table. Note that the columns listed as arguments in SUMMARIZECOLUMNS can be any column from any table in the current model, DAX engine will magically stitch them together by following the relationships defined between tables. If you look at the DAX expression for the ‘MonthlyJoined’ table, we supplied as input arguments one column ‘Registrations’[MonthJoined] and one calculation COUNTROWS(‘Registrations’). SUMMARIZECOLUMNS returns a table of columns. If an argument is a column, it automatically keeps the column name in its output, therefore, we only need to supply new column names for output columns based on calculations. In this case, we picked the name “CountJoined” for the calculation.

warningThe new column names for calculations are enclosed in double quotes, not square brackets.

COUNTROWS function simply returns the number of rows in its table argument. The magic in our example is that COUNTROWS doesn’t simply return a static count of all rows in the ‘Registrations’ table but rather a dynamic count of rows corresponding to the current value of the [MonthJoined] column or the [MonthLeft] column. If you want to dig deeper into how this magic happens, it will take some really technical blogs to explain so it’s for another time. But for most users, just list the columns and aggregations you want to see in the output, thing will work out just right.

Now if you switch to the ‘MonthlyLeft’ table, you should see a row with empty value for the [MonthLeft] column as shown in Figure 10, that’s because many subscribers never left so their [MonthLeft] value is blank.

BlankMonthLeft
Figure 10.
Let’s remove that row by modifying the definition of the table as shown below with the changes highlighted. The changes should be self-evident and expose one power of DAX – that DAX functions are composable. We filter the output table of SUMMARIZECOLUMNS row by row and remove the row with a blank value for the [MonthLeft] column.

MonthlyLeft = 
FILTER(
	SUMMARIZECOLUMNS(
		Registrations[MonthLeft],
		"CountLeft", COUNTROWS(Registrations)
	),
	NOT(ISBLANK(Registrations[MonthLeft]))
)

Although it’s great to see count of subscribes joined or left per month, the information is spread in two separate tables, we need to somehow combine the two tables into a single one so we can see the two counts side by side for a given month. For those of you with SQL background, you are likely thinking about searching for a DAX join function by now. While it is possible to write a DAX expression to join the two tables together, let me show you a different way that is faster and arguably more intuitive. We are going to leverage a fundamental feature of Power BI Desktop: creating relationships between tables.

Click Manage Relationships as shown in Figure 11, then click the New button as shown in Figure 12. In the Create Relationship dialog box, select table ‘MonthlyJoin’ and column [MonthJoined] as the first end of the relationship, select table ‘MonthlyLeft’ and column [MonthLeft] as the second end of the relationship. Click Advanced options to check whether Cross filter direction is set to Both by default and if not change it to Both as shown in Figure 13. Click the OK button to accept the new relationship and then click the Close button to close the Manage Relationships dialog box.

ClickManageRelationships

Figure 11.

ClickNewButton

Figure 12.

CreateRelationship

Figure 13.

Now we are ready to create yet another calculated table to bring the columns [CountJoined] and [CountLeft] together. Click New Table, shown in Figure 6, and enter the following DAX expression to create a new calculated table called ‘MonthlyJoinedAndLeft’. The newly created calculated table is shown in Figure 14.

MonthlyJoinedAndLeft =
SUMMARIZECOLUMNS(
	MonthlyJoined[MonthJoined],
	"CountJoined", SUM(MonthlyJoined[CountJoined]),
	"CountLeft", SUM(MonthlyLeft[CountLeft])
)

MonthlyJoinedAndLeft

Figure 14.

Since the new table only has a single month column, it no longer makes sense to call it [MonthJoined], so let’s just rename it to [Month]. We do that by expanding the ‘MonthlyJoinedAndLeft’ table on the right and then right click the [MonthJoined] column, select Rename from the dropdown menu as shown in Figure 15.

RenameColumn

Figure 15.

We are almost there. Let’s add a column to the newly created table to show the difference between the number of subscribers joined and the number that left. The DAX formula is:

CountNetJoined =
MonthlyJoinedAndLeft[CountJoined] - MonthlyJoinedAndLeft[CountLeft]

The table now looks like Figure 16.

AddedCountNetJoined

Figure 16.

We are ready to count the number of active subscribers by the end of each month. The math is to add up all net joined subscribers up to the current month. The DAX formula for the calculated column is

Subscribers =
SUMX(
	FILTER(
		MonthlyJoinedAndLeft,
		[Month] <= EARLIER([Month])
	),
	[CountNetJoined]
)

This formula is almost self-explanatory except for that mysterious EARLIER function. Without going into the gory details, it’s enough to say that for each row in the table the formula starts a new scan of the ‘MonthlyJoinedAndLeft’ table, keeps only those rows whose value of the [Month] column is on or before the value of the same column of the current row, and then sums up all the values of the [CountNetJoined] column. The final result is shown in Figure 17 and if you have tried out the T-SQL queries in Kathi’s blog you can verify that our results match the T-SQL ones.

AddedSubscribers

Figure 17.

Last but not least, we can’t call it Power BI without a nice chart to show off our calculations. Since the source data is generated, if we plot total subscribers per month we will see a boring linear growth of subscribers over time. So I created an Area Chart showing monthly count of subscribers joined and left respectively. The chart is shown in Figure 18, and since Power BI is very easy to use, I’ll leave it to you to figure out how to create the chart on your own. You can download the final MonthlySubscriberCount2.pbix file to examine all the calculated tables and their DAX formulas.

MonthlyChart

Figure 18.

Use Calculated Table to Figure out Monthly Subscriber Churn