Introducing DAX Window Functions (Part 1)

The December 2022 release of Power BI Desktop includes three new DAX functions: OFFSET, INDEX, and WINDOW. They are collectively called window functions because they are closely related to SQL window functions, a powerful feature of the SQL language that allows users to perform calculations on a set of rows that are related to the current row. Because these functions are often used for data analysis, they are sometimes called analytical functions. In contrast, DAX, a language invented specifically for data analysis, had been missing similar functionalities. As a result, users found it hard to write cross-row calculations, such as calculating the difference of the values of a column between two rows or the moving average of the values of a column over a set of rows. Oftentimes even if there is a way to perform such calculations, the resulting expressions are convoluted and cause DAX Engine to consume too much time and memory, therefore, don’t scale to larger number of data points. For these reasons, the DAX product team is super-excited to present the first batch of window functions as an early Christmas gift to the DAX community. Like their SQL counterparts, the DAX window functions are powerful yet more complex than most other DAX functions therefore require more effort to learn. In this blogpost, I’ll describe the syntax and semantics of these functions with a couple of concrete examples so that you can have the correct mental model when you work with these functions. In my next blogpost, I’ll dive deeper under the cover to expose some of the inner workings of the these functions to help you design your own solutions with good performance.

All examples will be based on the Adventure Works DW 2020 DAX sample model.

A taste of DAX window functions

Let’s jump right in and create the first report using the OFFSET function.

  • First add columns ‘Customer'[Customer], ‘Date'[Date], and measure [Total Quantity] to a simple table report.
  • Next apply a filter to limit the rows to customers with multiple sales dates.
  • Now come the interesting part: define a measure [Curr – Prev] that, for each customer and sales date, calculates the delta of [Total Quantity] between the current sales date and the previous sales date.
  • Add [Curr – Prev] to the table to see the result in Figure 1.
  • It can be easily seen that the delta values are all correct. I have also verified that the total value, 100, is the sum of all the delta values.
Curr - Prev =
    [Total Quantity] -
    CALCULATE(
        [Total Quantity],
        OFFSET(
            -1,
            SUMMARIZE(ALLSELECTED('Sales'), Customer[Customer], 'Date'[Date]),
            ORDERBY('Date'[Date]),
            KEEP,
            PARTITIONBY(Customer[Customer])
        )
    )
Figure 1.

Advantages of the window functions

When the OFFSET function was first leaked a couple of months ago, some users questioned its usefulness. They argued that they could achieve the same results using existing DAX functions such as time intelligence functions or setting appropriate filters in the CALCULATE function in the following fashion:

CALCULATE(
    ...,
    VAR Curr = VALUES([OrderByColumn])
    RETURN
    FILTER(ALL([OrderByColumn]), [OrderByColumn]=Curr-1)
)

But window functions are more generic and powerful in that

  • The order-by values don’t need to be continuous.
  • Users can order by multiple columns, e.g. first by [Year] and then by [MonthNumberOfYear].
  • Users can divide rows of a table into partitions and then sort the rows in each partition independently. You can see from our first example that the [Date] values are different in the two partitions in Figure 1.
  • Window functions offer simpler, consistent syntax and better performance than previous solutions using the FILTER function. For those who have studied computer science, the difference in time and space complexity is O(N*Log(N)) or even O(N) for window functions vs O(N2) for hand-crafted FILTER expressions.

Syntax of window functions

Although each window function has its own unique signature, they all follow the same pattern and share common parameters as shown in Figure 2. <row-selection> is the only required parameter(s) that defines the unique characteristics of each function. See Figure 3 for the different ways to select rows from a partition.

Figure 2.
Figure 3.

Any function that takes more than a couple of input parameters appear daunting to the users. For that reason, window functions may look too complex to beginners. But the good news is that most of the parameters are optional. Moreover, the DAX product team has made all optional parameters skippable even if they are not the last parameter. As long as you don’t mind some red-squiggly in the Desktop formula bar, the following DAX expressions are all accepted by the DAX Engine.

Some of the shortcut syntaxes are not yet accepted in the formula bar, therefore you will see the red-squiggly underneath, but they are valid according to DAX Engine. More work is planned for them to be accepted by the formula bar in a future release.

The KEEP keyword in between <order-by> and <partition-by> can be omitted.
e.g. OFFSET(-1, ALLSELECTED(‘Date’), ORDERBY([Date]), PARTITIONBY([Fiscal Year]))
<table> can be omitted if <order-by> is present.
In this case all columns in <order-by> and <partition-by> must belong to the same table.
e.g. INDEX(1, ORDERBY(‘Date'[Date]), PARTITIONBY(‘Date'[Fiscal Year]))
<order-by> can be omitted.
In this case DAX Engine will automatically inject order-by columns.
e.g. WINDOW(2, ABS, -2, ABS, ALL(‘Date’), PARTITIONBY([Fiscal Year]))
<from-type> and <to-type> can be omitted in the WINDOW function.
In this case the type is defaulted to REL.
The formula bar already supports skipping these parameters completely.
e.g. WINDOW(-3, -1, ALL(‘Date’))
Shortcut syntaxes

How do window functions work

The list below describes the logical steps performed by each window function:

  1. Take all rows of the table as specified by the <table> parameter.
  2. Divide the rows into separate partitions by the unique values of the partition-by columns.
  3. Sort the rows within each partition according to the order-by columns and sorting directions.
  4. Determine the current partition and, if necessary, the current row within the partition.
  5. Return zero, one or more rows within the current partition.
    • OFFSET returns 0 or 1 row at a certain distance from the current row.
    • INDEX returns 0 or 1 row at a fixed position in the current partition.
    • WINDOW returns all the rows in between a lower bound and an upper bound. Either bound is a row at a certain distance from the current row or at a fixed position in the current partition.

In addition to the general rules listed above, there are some special, yet common use cases, which are worth calling out:

  • When the <table> parameter is omitted, DAX Engine derives the table expression from the order-by and partition-by columns as ALLSELECTED(<order-by columns>, <partition-by columns>). In this case all columns must be from the same table.
  • When the <partition-by> parameter is omitted, the entire table is treated as a single partition.
  • When the <order-by> parameter is omitted, DAX Engine will order by all the columns in the table. This is convenient when there is only a single column in the table, but not recommended when there are more than one column, in which case it’s a good practice to explicitly specify the order-by columns so the sort order is fully controlled by the user.
  • When the user-specified order-by columns are insufficient to determine the order of all the rows, i.e. there can be ties among some rows, DAX Engine will automatically append additional order-by columns from the table until total order is achieved. If this is not possible because the table has no key columns therefore there maybe duplicate rows, DAX Engine will return an error. Users should always provide enough order-by columns to achieve total order if they want to have full control.

How is the current partition or the current row determined

Since all window functions depend on the current partition and/or the current row to work properly, how do they know what the current partition or the current row is? In comparison, a SQL query always scans a set of rows in the FROM clause so the current row is natural for SQL window functions. On the DAX side, iteration functions such as SUMX, FILTER, SELECTCOLUMNS, GENERATE, etc. also work one row at a time, a window function could leverage that to determine the current partition and current row. For example, when I was writing this blog, someone asked how to filter a given table of sales over dates to keep only those rows with consecutive sales above a threshold. This is a very typical business problem for window functions to solve. If we extract one partition, corresponding to ‘Customer'[Customer] = “Antonio Bennett”, from the table in Figure 1, we get the table in Figure 4. If we want to find out consecutive rows where [Quantity] >= 2, i.e. the highlighted rows, we could use the following DAX query to achieve the result:

DEFINE
VAR _Table = 
    SUMMARIZECOLUMNS(
        'Date'[Date], 
        TREATAS({"Antonio Bennett"}, 'Customer'[Customer]), 
        "Quantity", [Total Quantity]
    )
EVALUATE
    FILTER(
        _Table,
        [Quantity] >= 2 &&
        (
            SELECTCOLUMNS(OFFSET(-1, _Table, ORDERBY([Date])), [Quantity]) >= 2 ||
            SELECTCOLUMNS(OFFSET(1, _Table, ORDERBY([Date])), [Quantity]) >= 2)
        )
ORDER BY [Date]
Figure 4.

In the DAX query above, both the FILTER function and the two OFFSET functions scan the rows from the same table variable, _Table, so it’s pretty easy to see that the two OFFSET functions would use the current row added by FILTER to the row context. In general, there is an evaluation context when a window function is calculated, so the window function will derive the current partition and the current row from its evaluation context.

To determine the current partition, a window function would look for the partition-by columns from row context or from the grouping columns in the filter context. If a match is found, it will use the values of those columns in the context as the key for the partition.

Similarly to determining the current row, a window function would use the same strategy but this time checking for all columns from the <table> parameter. To be precise, only model columns from <table> are considered because only model columns can be added to the filter context.

Figure 5 illustrates how an OFFSET function fetches column values from filter context or row context to determine the current partition and the current row within the partition. Please note that all columns from the <table> parameter are used to identify the current row regardless which columns are needed for order-by. In this example, the [IsHoliday] column isn’t a part of order-by or partition-by but is still needed to identify a row. Of course DAX Engine may optimize [IsHoliday] away from the key columns needed to identify a current row because there is a functional dependency between [IsHoliday] and [Date] but that’s a performance improvement that won’t alter the result of the logical model.

Is it possible for a current row from the context to not have a match to a row from the <table>? Absolutely, the rows from the outer context and the rows from the <table> parameter of a window function are independent of each other in the general case. There can be more rows in the context than rows from the inner table, or vice versa. This is very different from SQL window functions which are tightly coupled with the main query.

Figure 5.

What if there is no current partition or current row

In the previous example, we are lucky in that the evaluation context happens to have all columns necessary to identify the current row. But what if some columns are missing, or, worse yet, all columns are absent? DAX expressions must work in all contexts, that’s the fundamental reason that measures can be reused in any report. The DAX engine team has implemented a concept, called apply semantics, to window functions so that they not only don’t fail when there isn’t enough information in the evaluation context to identify a unique partition or row but they even return meaningful results in common scenarios. The name, apply semantics, was inspired by the CROSS APPLY operator of T-SQL.

Below is the logic of the apply semantics assuming a window function requires the current row:

  1. Divide all columns from <table> into those that can bind to the evaluation context and those that cannot.
  2. Build an iterator to return rows from the outer context corresponding to the bound columns in #1.
  3. For each row R1 from #2, build an iterator to return all possible rows corresponding to the unbound columns in #1 that exists with the R1. For example, if [Column] is the only unbound column, iterate over all rows returned by VALUES([Column]).
  4. For each row R2 from #3, combine R1 and R2 into R. R has values for all columns in #1, therefore is a valid current row.
  5. Use R to locate the current partition and then the current row within the partition.
  6. Calculate the result row(s) as defined by the semantics of the window function.
  7. Output the row(s) as long as it has not been output already.

As you can see, apply semantics effectively enumerates all valid current partitions/rows in a given context, calculates the regular output of a window function for each partition/row, and then returns the union of the output without duplicates. As a result, OFFSET and INDEX may return more than one row, and WINDOW may return more rows than the size of the window. Going back to Figure 1, in order to calculate the value of [Prev – Curr] in the grand total row, DAX Engine iterates over all valid combinations of [Customer] and [Date], shifts to the previous [Date] value within the given [Customer] partition, and in the end outputs all rows in the table except for the last rows in each partition and then use that as a filter to calculate the difference between the values of measure [Total Quantity] with and without the filter.

The apply semantics poses a potential performance risk when the unbound columns come from different dimension tables therefore may produce a big cross-join in contexts without sufficient filters. DAX authors should pay close attention if they use window functions in this fashion.

Summary

The advent of window functions in DAX opens a floodgate of opportunities for Power BI users to solve complex, even previously intractable, problems in an efficient, uniform, and elegant way. They can be used to perform a wide range of calculations on sets of data, e.g.

  • Compute running totals and running averages
  • Find best and worst performers
  • Investigate trends across time
  • Calculate contributions to the whole, such as commission percentages

Moreover, this is just the initial release with limitations and known issues. The DAX product team is actively working on additional improvements and new features to enrich this category of functions so users can expect more exciting capabilities to arrive in the near future.

Advertisement
Introducing DAX Window Functions (Part 1)

12 thoughts on “Introducing DAX Window Functions (Part 1)

  1. Nick says:

    Do you have any performance tips for using a slicer to control the delta in OFFSET or the from/to in WINDOW? If I place something like SELECTEDVALUE( RangeSelect[Value] ) within OFFSET or WINDOW, it works. But if I use it across multiple measures that are summing different columns in a single fact table, a separate Storage Engine query is generated for every measure. When I remove SELECTEDVALUE and replace it with an integer, all of the measures are calculated from a single SE query.

    Like

    1. What you described is missing fusion optimization when the row selection parameters are defined by expressions instead of constants. I’ll discuss with the engineers on adding fusion support in this case.

      Like

  2. Hi Jeffrey, very very nice article!
    One question: does this portion of the article “Similarly to determine the current row, a window function would use the same strategy but this time checking for all columns from the parameter. To be precise, only model columns from are considered because only model columns can be added to the filter context.” imply that it will never be possible, through window functions, to use the value of a measure to determine the sorting? I tried the following syntax for a measure to put in a matrix with Product[Color] in rows and where [Sales] is also measure:
    Sales OFFSET -1 Color =
    IF (
    HASONEVALUE(‘Product'[Color]),
    CALCULATE(
    [Sales],
    OFFSET(
    -1,
    ADDCOLUMNS(
    ALLSELECTED(‘Product'[Color]),
    “@ColorSales”,[Sales]
    ),
    ORDERBY([@ColorSales],DESC)
    )
    )
    )
    but got an error and had to use the old good but rather complex approach like
    Sales OFFSET -1 Color =
    VAR RankSalesCurrentColor =
    RANKX ( ALLSELECTED ( ‘Product'[Color] ), [Sales] )
    VAR TableRankSalesColor =
    ADDCOLUMNS (
    ALLSELECTED ( ‘Product'[Color] ),
    “@RankSalesColor”, RANKX ( ALLSELECTED ( ‘Product'[Color] ), [Sales] )
    )
    RETURN
    IF (
    HASONEVALUE ( ‘Product'[Color] ),
    CALCULATE (
    [Sales],
    FILTER ( TableRankSalesColor, [@RankSalesColor] = RankSalesCurrentColor – 1 )
    )
    )

    Thanks
    Francesco

    Like

    1. Currently, window functions only allow sorting by columns existing in the model, you try to sort by a column created in the middle of the DAX expression. We are working on lifting this constraint.

      Like

  3. Bob Slattery says:

    I have not had a use case for INDEX but, WINDOW works very smoothly. I used it for managing Sales Order Flow, where the status of the order lines change as they move through the Order to Dispatch/Invoice process. Each change in status is logged in the fact table. It was very easy to index the Order Line flow events using RANKX and WINDOW. The semantics are very natural.

    OFFSET on the other hand, is very confusing. I finally figured out (or at least I think I did) that the preservation of the data lineage of the output row is the issue. The measure running in the OFFSET filter context, will reference the row output from OFFSET, not the one input. This means that, if Dimensions are used for PARTITIONBY and ORDERBY (which the bloody-well should be IMO), only measures, or columns in the Fact where the value for the input and output are the same, will return sensible values in the visual.

    If for example, I include physical columns from the dimensions along with physical columns from the fact, like the date of an order line status event for example, I get lines in the visual where the dimension key does not match the fact FK! This is enormously confusing.

    Like

  4. Ted Murphy says:

    Problem with OFFSET:

    Window and Index functions work perfectly on a data set that has similar structure as AdventureWorks … Sales Table linked to Customer and Dates.

    However I get the following error message when attempting to use Offset …

    Mdxscript(Model) (369,25) Calculation error in measure ‘Calcs'[Prev Qty]: OFFSET’s columns that do not have any matching row context or filter context column must all be from the same table. The columns ‘Dates'[Datekey] and ‘Customers'[Customer] do not satisfy these requirements.

    Prev Qty = CALCULATE(
    [Qty],
    OFFSET(
    -1,
    SUMMARIZE(ALLSELECTED(Sales), Customers[Customer],Dates[Datekey]),
    ORDERBY(Dates[Datekey]),
    KEEP,
    PARTITIONBY(Customers[Customer])
    )
    )

    This error occurred when adding Prev Qty to a visual that had Window Qty and Index Qty successfully displayed.

    Like

    1. Ann says:

      Hi Jeffrey,
      Thank you for the articles, they’ve been very-very helpful.

      I’m also having the issue with OFFSET that Ted Murphy described (..OFFSET’s columns that do not have any matching row context or filter context column must all be from the same table..)

      Any chance you know in which release the error was fixed? I’m having Version: 2.112.1161.0 64-bit, Released: January 10, 2023 which seems the latest..

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s