EVALUATEANDLOG AND DAX ENGINE OPTIMIZATIONS

This is the fourth and the last article on the EvaluateAndLog function. While DAX developers were excited about the prospect of printing the values of a portion of a DAX expression, many of the early adopters of the function were puzzled by a mysterious message they ran into from time to time: “EvaluateAndLog function is not executed due to optimization.” Another question on people’s mind is “While it’s great to be able to see the values of a sub-expression used in my calculation, is there a way for me to see the values of other expressions which are not needed to calculate the final result?” I will provide answers to both questions in this post.

For today’s DAX examples, I will be building on top of the three-table (‘Sales’, ‘Customer’, ‘Product’) model I used last time and will provide a link to the final pbix file at the bottom.

Let’s start with a simple DAX measure

Measure6 = SumX(‘Sales’, EvaluateAndLog(‘Sales'[Sales Amount]))

If you add [Measure6] to a card visual and expect to see the list of values from the [Sales Amount] column that contribute to the sum, you will see instead what seems like an error message, Figure 1.

Figure 1.

The message tells users that instead of first calculating the ‘Sales'[Sales Amount] sub-expression and then calculating the sum, the DAX Engine was able to send the entire SumX as a single calculation to the storage engine, either the Vertipaq Engine or the DirectQuery Engine. In other words, instead of reading all the rows from the ‘Sales’ table into the DAX Engine, and then adding the values of the [Sales Amount] column one row at a time, the DAX Engine simply asks the storage engine to return the sum value directly. This implies that the EvaluateAndLog function is confined within the DAX Engine and never pushed down to the storage engine. For those who are confused by the separation of the DAX Engine and the Vertipaq Engine because in their mind the two work together as the Power BI Engine, it would be easier for them to imagine a DirectQuery model where all three tables are hosted in a relational database. There is obviously no effective way for the DAX Engine to push EvaluateAndLog into a single SQL query to retrieve a set of intermediate results in addition to the requested aggregation. If the DAX Engine is able to push a bigger expression down to the storage engine, any EvaluateAndLog embedded within will not be able to serialize intermediate values while inside the storage engine therefore is optimized away.

The DAX Engine has implemented a large number of optimizations. Pushing as many operations as possible down to the storage engine is one of the most important ones. Here is another example:

Measure7 = Calculate(
    Sum(‘Sales'[Sales Amount]),
    EvaluateAndLog(‘Product'[Color] = “Red”)
)

If we add [Measure7] to a card visual, we will receive the same notice as in the case of [Measure6]. That’s because the DAX Engine is able to fold the entire Calculate expression into a single storage engine request that joins the ‘Sales’ table with the ‘Product’ table, applies the [Color] = “Red” filter, aggregates the values of the [Sales Amount] column, and finally returns a filtered sum value. Since the DAX Engine is more likely to optimize commonly written expressions, users will run into the special notice more often on frequently used expression patterns.

An expression alone does not determine whether or not an optimization happens, the evaluation context also influences the decision. Let’s use an example adapted from a very good question asked by David Bianconi on LinkedIn by defining a measure as

Measure8 = Calculate(
    Sum(‘Sales'[Sales Amount]),
    EvaluateAndLog(
        CalculateTable(‘Customer’, ‘Customer'[Education] In {“College”, Partial College”})
    )
)

Add the ‘Customer'[Education] column and [Measure8] to a table visual and we will get Figure 2. DAX Debug Output receives two DAXEvaluationLog events because [Measure8] is calculated twice, once in the context of the rows of the distinct values of the ‘Customer'[Education] column and the other in the total row. EvaluateAndLog serialized the table value in one event, Figure 3, but was optimized away in the other, Figure 4. Clearly the DAX Engine has decided that it’s able to optimize in one case but not the other because the contexts are different.

Figure 2.
Figure 3.
Figure 4.

Since optimizations are essential for reasonable query performance in many basic scenarios, the product team had chosen to retain some optimizations despite the presence of EvaluateAndLog. We may turn on even more optimizations in future releases to prevent EvaluateAndLog from causing disastrously expensive calculations. But we will reconsider the decision if we hear strong feedbacks from the community that users would rather see the intermediate result despite the potential risk.

One of the obvious optimizations in the DAX Engine is that the expression of an unused variable is not evaluated. But the product team has decided to make an exception for the EvaluateAndLog function. If the entire expression of an unused variable is wrapped in EvaluateAndLog, the expression will be evaluated! This feature enables DAX developers to print the values of arbitrary DAX expressions even though they don’t contribute to the final result. One can easily imagine possible applications of this feature. We can now print any contextual information in the middle of a DAX expression.

We’ll end this article with an example that takes advantage of the unused variable trick. If we define [Measure9] as

Measure9 = AverageX(
    EvaluateAndLog(Values(‘Sales'[Order Date])),
    Calculate(DistinctCount(‘Sales'[ProductKey]))
)

and add the above measure to a card visual we will see that EvaluateAndLog is optimized away. This one maybe hard to understand because the DAX Engine wasn’t able to push the entire average of distinct counts down to the Vertipaq Engine. On the other hand, the DAX Engine clearly needed to first calculate the distinct counts by different values of the [Order Date] column before it could calculate the average. As it turns out, the AverageX function is so smart that instead of getting the list of dates and the list of distinct counts in two separate steps, it’s able to combine them into a single step hence optimizing away a dedicated step just to retrieve the dates. That’s clever and all but it doesn’t help us see the list of dates if we really want to. To get what we want we can introduce an unused variable. If we rewrite [Measure9] as

Measure9 =
Var _unused = EvaluateAndLog(Values(‘Sales'[Order Date]), “_unused”, 1000)
Return
AverageX(Values(‘Sales'[Order Date]), Calculate(DistinctCount(‘Sales'[ProductKey])))

We will see all the dates that had product sales, Figure 5.

Figure 5.

Today we examined the interactions between the EvaluateAndLog function and optimizations inside the DAX Engine. Key takeaways from today’s lessons are:

  • The DAX Engine performs many optimizations by reducing a multi-step calculation into a single-step one.
  • The optimizations in today’s article are strictly from the perspective of the DAX Engine. A single-step calculation in the DAX Engine may take many steps to compute in storage engines.
  • The EvaluateAndLog function is confined to the DAX Engine and maybe optimized away, in which case a special notice is included in the JSON output of the DAXEvaluationLog event.
  • The DAX Engine will calculate the expression of an unused variable if its outmost function is EvaluateAndLog.

External Links

Download the latest version of Power BI Desktop if you want to try out today’s examples.

The pbix file used in today’s examples can be downloaded from here.

The current version of DAX Debug Output is here. You must have admin rights on the machine and agree to install the software from an unknown publisher. You download both DAXDebugOutputSetup.msi and setup.exe to your machine and run setup.

EVALUATEANDLOG AND DAX ENGINE OPTIMIZATIONS

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s