Running Total Function in GoodData Platform (Sum vs Runsum)

  • 1 September 2021
  • 6 replies
  • 333 views

Userlevel 1

Running Total Functions

Running Total Functions allow you to keep track of a metric from the beginning of a period until now or calculate the statistics in-betweens rows, such as total revenue year-to-date or sum of revenue between the last 2 periods and now. In the GoodData Platform, you may use this function to calculate sum, average, minimum, maximum, variance, and standard deviation with the following syntax:

 

Running Total Function

Syntax

Sum

runsum()

Average

runavg()

Minimum

runmin()

Maximum

runmax()

Variance

runvar()

Standard Deviation

runstdev()

Variance (Uncorrected)

runvarp()

Standard Deviation (Uncorrected)

runstdevp()

Table 1: List of Running Total Functions available in GoodData Platform

 

The Running Total Functions are a great tool to calculate moving averages or accumulated summation. In this article, we will explore how to use the running total function on defining metrics in the GoodData Platform with the logical data model (LDM) is below:

Diagram 1: LDM of Sales

 

Sum vs.Runsum

What makes runsum different from sum is that runsum accumulates the summation made in the previous period or dimension. To demonstrate the difference, let's define 2 metrics using both syntaxes to calculate the revenue:

 

Select SUM(Price*Quantity) where Order Status = Delivered - Syntax 1

Select RUNSUM(Price*Quantity) where Order Status = Delivered - Syntax 2

 


Diagram 2: Revenue (Syntax 1) in Metric Editor

 

Diagram 3: Total Revenue since First Month (Syntax 2) in Metric Editor
 

I have named the metrics with Revenue and Total Revenue since First Month for Syntax 1 and Syntax 2, respectively. Let’s put these metrics in a table in Analytical Designer segmented by Date and observe the difference between Syntax 1 and Syntax 2:

Diagram 3: Sum vs. Runsum Sliced by Year

Once the data is segmented, we can see the Total Revenue since First Month accumulates the previous month's revenue. In contrast, Revenue only displays the revenue of the respective month/year. We did not specify the timeframe for Syntax 2 to accumulate the revenue. Therefore, Total Revenue since First Month accumulates the revenue from the first available period, Oct 2017. 

And, of course, it works the same with other statistical metrics. If you are interested in minimum or maximum revenue, replace runsum with runmin or runmax.

 

Running Total Within a Specific Timeframe

A limitation of Syntax 2 is that the metric includes all historical data; it may be more useful to calculate the given metric within a specific timeframe for reporting purposes. For example, I think the user may be more interested in daily revenue and Year-to-date accumulated daily revenue (Summation of revenue from the beginning of the year to the current date); it means we have to modify Syntax 2 only to sum the daily revenue within each year. To do this, we need to specify the timeframe using the keyword within() the syntax. The keyword only accepts date attributes. Syntax 3 below is the modified syntax for year-to-date revenue:

Select RUNSUM(Price*Quantity) within(Year(Date)) where Order Status = Delivered - Syntax 3

 

Diagram 4: Total Revenue YTD (Syntax 3) in Metric Editor
 

Diagram 5: Revenue and Total Revenue YTD

 

As we expected, the metrics only accumulate the revenue within each year, resetting the calculation in January 2018. Total Revenue YTD can be segmented in other date granularity lower than Year, like Quarter: 

 

Diagram 6: Revenue and Total Revenue YTD sliced by Quarter

However, this functionality does not work when segmented by the same date granularity (like year in our example) or any non-chronicle date granularity (Like month of year or day of month).

 

Rolling Window

GoodData Cloud offers the ability to calculate statistics within a window frame, such as moving averages. The format of the rolling window is Select runsum(<attribute/fact/metric>) Rows between <Specific row(s)> and <Specific row(s)>. You may place N preceding, N following, current row to represent N previous rows, N following rows, and current row, respectively. For example, if I would like to have 5 days moving average of revenue (Take an average of the last 4 days and today), I may have the following MAQL:

 

Select RUNAVG(Revenue) Rows Between 4 Preceding and Current Row - Syntax 4

 

Diagram 7: Revenue 5-Days Moving Average (Syntax 4) in Metric Editor
 

This MAQL query takes the average of revenue defined in Syntax 1 for the last 5 days (Including today). Here is what a line chart will look like if we limit the data to only this year:

Diagram 8: Revenue 5 Days Moving Average

 

The blue line represents the actual daily revenue, while the orange line represents the 5 Days moving average revenue. As we can see in Diagram 4, the orange line is smoothened and displaying the 5 days trend of revenue. You may also combine the within keyword to limit the time frame within a specific date granularity. If we are interested in obtaining the 5 days moving average using only the same year data, we may modify Syntax 4 to the following:

Select RUNAVG(Revenue) within(Year(Date)) Rows Between 4 Preceding and Current Row - Syntax 5

 

Conclusion

Running Total Function is a great way to calculate a metric across a specific time frame. We have explored the difference between sum and runsum and how flexible it is to calculate your metrics with different time frames. 

 

Syntax format:

Select RUN___(<metric>) WITHIN (<Date Granularity>}) Rows Between _____ and ____ where <attribute/metric/fact> = <some condition>


 

(Tip) Note on Calculating Moving Average

Going back to our example on calculating Revenue 5 days moving average, you may realize the flaw of the MAQL query on Syntax 5 is that the entries of the first 4 days do not fully take 5 days of average (ie, the moving average for the entry on Jan 1, 2020 does is the same as the revenue), like below:

Diagram 9: The flaw of Syntax 5

 

It happens because the entries needed are filtered out. In order to fix, you need to include “WITH PF Except Date” in Syntax 5, so that the metric is able to include the entries from the previous year in the calculation, with the following Syntax:

Select RUNAVG(Revenue) Rows Between 4 Preceding and Current Row with PF except Date(Date)- Syntax 6
 

Diagram 10: Syntax 6 in Metric Editor

 

Once you have utilized the new metrics, it will look like this:

Diagram 11: The 5-days Moving Average with the Correct Calculation

 

Once you have created this insight, be sure to test whether this is the desired metric you would like to display on a dashboard and adjust the filtering dependence between this insight and the dashboard.

 

Reference
Documentation of Running Total Function:
https://help.gooddata.com/doc/enterprise/en/dashboards-and-insights/maql-analytical-query-language/maql-expression-reference/granularity-keywords/running-total-functions/running-totals-for-core-functions

https://help.gooddata.com/doc/enterprise/en/dashboards-and-insights/maql-analytical-query-language/maql-expression-reference/granularity-keywords/running-total-functions


 


This topic has been closed for comments

6 replies

Hey Jacques and thank you for the article!

I have a challenge to calculate a TTM Revenue (trailing twelve months). I created a metric with the following syntax:
SELECT RUNSUM (select Inv. Amount (Converted) Where Month/Year (Worked Date) BETWEEN THIS - 11 AND THIS)

But when I'm adding a report into dashboard, GD cuts the data and displays only month (depends on the Data filter)

Basically, if I take your Diagram 5: Revenue and Total Revenue YTD, as an example, I need the sum Revenue for the period (11-2017 - 10-2018). And by using a Data filter (Month/Year (Worked Date)) get different values for the last 12 months (incl. selected month).

Please advise. Thanks, 
/Yurii B

Userlevel 1

Hello Yurii,

Thank you for reaching to me and I am happy to help!

When using runsum, we should not use where clause to limit the timeframe. If you want to calculate TTM Revenue in my example, the easiest way to do is declare the following metric and put it in an insight and sliced by Month/Year:

select runsum(revenue) rows between 11 preceding and current row with pf except Date(Order Date)

This syntax allows you to add 12 revenue entries (Or 12 months) when you slice by Month/Year. But you must have the insight sliced by Month/Year in order to work. The “pf except Date” allows the metric to work even if you have data filter in the insight.

Please let me know if it works on your side.

Jacques

Thanks for a quick reply but it doesn't work on my end. And I can't use the pf except option, as I need to roll data using this attribute on board to get different TTM figures 🙂 The row data coming from the stores on daily basis. In this case, I assume the Row in this syntax will mean each Day, and not Month as I need. 

I did a very simple formula in the past, which just add up twelve month in row. But is there a smart way of doing the same? :thinking:

SELECT ( (SELECT REVENUE) + (SELECT REVENUE FOR Previous (Month (Worked Date))) + (SELECT REVENUE FOR Previous (Month (Worked Date), 2)) + (SELECT REVENUE FOR Previous (Month (Worked Date), 3)) + …

/Yurii B

Userlevel 1

Hello Yurii,

I have sent you a private message to follow up your question but I have not received your reply yet. Sorry for the late follow up. I don’t think MAQL allows you to sum/average with a rolling time frame, but I think my suggested MAQL query should work unless you do not wish to segment by month/year. I believe there is misunderstanding from my side, I strongly recommend you to join our community slack channel and there are specialists timely replied any questions about GoodData products related general and technical questions with the following link: gooddataconnect.slack.com

Jacques

Hello is there a way to just see the last line (green highlight) of data rather than all of the historical data?
 

 

Userlevel 2

Hi @John , form your screenshot I assume in the second column you have plain SUM of revenue (revenue being in your data or somehow calculated as a simple metric) and the third column should be YTD of this revenue, correct?

 

Assuming you do not have any data for the future, this metric for the YTD should work:

 

SELECT
SUM(Revenue) BY Year (Date)
WHERE Year (Date) = THIS
WITH PF EXCEPT Date (Date)

 

This metric should give you YTD for current year and it will also work if you use a date filter to just filer out the current month. Here is how it works:

  • the SUM(Revenue) BY Year makes sure it sums the data for the whole year even if you split the result by month (technically it assigns the SUM value for whole year to each month of the year)
  • the WHERE Year = THIS filters out any data from different years
  • and the WITH PF EXCEPT Date (meaning “with parent filter except date”) disables the effect of any external date filter to this particular metric (just this metric, other metrics in the same insight like the standard SUM(Revenue) will still get filtered. So a potential insight-level or dashboard-level filter for “current month” will not be applied to this metric and that will ensure it will be calculated based on all the data (from the whole current year) no matter what date filter is used.

If you put this metric to an insight with plain SUM(Revenue), view it by month and filter the insight to current month, it should give you exactly what you have in your last row.

 

I hope this helps.