This is the second article in a series on the introduction to the EvaluateAndLog function. As I explained last time, EvaluateAndLog can be used to print to the debug output the intermediate result of any DAX expression. In this article, we will learn how to interpret the output of the function. We first study the output of the function when the first argument is a scalar expression, which returns a single value, be it a number, a string, a datetime, etc. Next week we will cover the output when the first argument is a table expression.
All DAX expressions used here are based on a simple 3-table model shown in Figure 1. The tables and data are imported from one of Microsoft’s AdventureWorksDW sample databases. A link to the pbix file used for the examples is provided at the end of the article.
Let’s start with a simple measure, [Measure1], and add it to a table visual, Figure 2.
[Measure1] = EvaluateAndLog(Count(‘Product'[ProductKey]), “Measure1”)
In DAX Debug Output, there are two DAXEvaluationLog events, as shown in Figures 3 and 4. It means that the expression is calculated twice, once, Figure 3, for each color of the product, and the other, Figure 4, for the value of the total row. In case you wonder whether the two calculations are independent of each other, the answer is yes. Even though DAX Engine can derive the total value by simply adding all the per-color values, that’s not the correct way of calculating total values in the general case, e.g. when the aggregation function is a non-additive one such as DistinctCount. That being said, DAX Engine does have the ability to eliminate the separate calculation of the total value when the aggregation function is additive, e.g. Sum. But in the case of [Measure1], the presence of the EvaluteAndLog function caused DAX Engine to choose a less optimal but more general execution plan. This is the key reason that Microsoft enables the function only in Desktop for debugging purposes but not in production environments and why we advice users to remove obsolete instances of EvaluateAndLog in their DAX expressions whenever possible. When Microsoft is ready to enable the function in Power BI services, the product team will give users full control over the behavior of the function in different environments. By the way, whether or not the EvaluateAndLog function disables a particular optimization in DAX Engine is subject to changes at the discretion of the product team. We will discuss more about the interactions between the EvaluateAndLog function and various DAX optimizations in a future article.
If you compare Figures 2 and 3 carefully, you can see that the numbers in both figures match but the order of product colors is different. That’s because DAX Engine calculates the values of [Measure1] first and then sort the rows in a separate step afterwards. The EvaluateAndLog function records the results when the values are calculated but before the output rows are sorted. Luckily, you can click any header in the data grid of DAX Debug Output to sort the rows by that header.
If you inspect the output of EvaluateAndLog in JSON format, you can see a property named “inputs”, the value of which is a list of column names. There is also a “data” property whose value is a list of “input” and “output” pairs. The value of the “input” property is a list of values corresponding to the columns in the “inputs” property. The value of the “output” property is the calculated value of the DAX expression for the given “input” values. DAX Debug Output uses different header colors to help users tell apart input and output columns in the data grid. But what are those input columns? Where do they come from? The answer is that they are the set of columns in the evaluation context that affect the results of the calculation and that the columns generate multiple combinations of unique set of values, each of which would cause the scalar expression to calculate a new value in the intermediate result.
To better understand input columns, let’s change the visual to a matrix and then add the [Education] column from the ‘Customer’ table to Columns, Figure 5, If you examine the DAXEvaluationLog events from the matrix query, you will see that they stay the same as in Figures 3 and 4. That’s because [Measure1] counts the [ProductKey] column in the ‘Product’ table therefore is not affected by columns from the ‘Customer’ table. As a result, the ‘Customer'[Education] column is not counted as an input column to the Count(‘Product'[ProductKey]) expression although it is a part of the matrix query. What if we add a filter ‘Product'[Size] = “L” to the visual? There is no change to the number of rows in the output of the DAXEvaluationLog events even though the filter on the ‘Product'[Size] column does change the values of the calculation. As a rule of thumb, input columns most likely come from columns added to visual axes or columns from table parameters of DAX iteration functions.
Let’s use a more complex example to see how we can leverage the EvaluateAndLog function to solve one of the puzzles often encountered by students of DAX. We define a measure, named [Measure2], to calculate the average number of unique products purchased on each day.
[Measure2] = AverageX(Values(‘Sales'[Order Date]), EvaluateAndLog(DistinctCount(‘Sales'[ProductKey]), “daily unique products”))
When added to a card visual, Figure 6, the value of the calculation doesn’t look right. Even without examining the data in the ‘Sales’ table, it’s highly suspicious for an average calculation to return a perfect whole number.
Before we inspect the intermediate result labeled “daily unique products”, let’s first predict what we should see. For the AverageX function to calculate an average value, it needs a list of numbers, one per day because the first argument is a single-column table of distinct order dates, therefore we should see a list of dates and corresponding numbers of counts. But the actual output, Figure 7, is surprisingly a single value 158. It tells us that the expression, DistinctCount(‘Sales'[ProductKey]), doesn’t depend on any input column, in particular not the ‘Sales'[OrderDate] column, therefore it returns the same value for all the rows of the table expression Values(‘Sales'[OrderDate]). That explains why the output value is a whole number as the average of the same number, regardless the number of dates, is the number itself.
For those who already understand the difference between row context and filter context, they know this is where the famous Calculate function is needed to turn row context into filter context. We fix the bug in the DAX expression by wrapping the Calculate function around the DistinctCount function. Afterwards, DAX Debug Output displays the expected list of values per day, Figure 8, and the card visual, Figure 9, shows a more reasonable average value of 21.17.
[Measure2] = AverageX(Values(‘Sales'[Order Date]), EvaluateAndLog(Calculate(DistinctCount(‘Sales'[ProductKey])), “daily unique products”))
This quick debugging exercise shows us that you can use the EvaluateAndLog function to see the mysterious evaluation context in action. Input columns are an important, and often difficult to understand, part of the evaluation context. Now we have a way to visualize them.
DAX Debug Output colors the headers of the input columns orange and the header of the output column maroon. When there are multiple input columns, DAX Debug Output alternates dark and light orange colors on their headers, Figure 10.
Today, we learned how to interpret the output of the EvaluateAndLog function when its first argument is a scalar expression. Key takeaways from today’s lesson are:
- A scalar expression is calculated against the combinations of values of a set of input columns.
- An input column is a column in the evaluation context that produces a set of distinct values, each of which causes the scalar expression to calculate a new value.
- An input column is typically one from the visual axes or is added by a DAX iteration function.
- A pure filter column in the evaluation context is not considered an input column even if it changes the result of the calculation because its distinct values don’t cause the scalar expression to be evaluated multiple times.
- For a given scalar expression, when there is no input column, you will see a single value in the debug output; when there are input columns, you will see multiple values, one per unique combinations of the input column values.
- Intermediate results are not sorted. You can use DAX Debug Output’s sorting feature to help see the results.
- For a given query, the same EvaluateAndLog expression can generate multiple DAXEvaluationLog events, one per unique combinations of input columns as specified in the query.
- The pbix file used in today’s examples can be downloaded from here.
- The current version of DAX Debug Output is here. You must have admin rights on the machine and agree to install the software from an unknown publisher. You download both DAXDebugOutputSetup.msi and setup.exe to your machine and run setup.