Solved

RUNSUM doesn't seem to like intermittent data

  • 19 January 2023
  • 5 replies
  • 104 views

GoodData Enterprise;

Metrics in play include:

# Incidents_Recordable: SELECT IFNULL(COUNT(incidentID),0) WHERE isRecordable=1 AND Date (Event) BETWEEN _Timeline_Start AND _Timeline_End

YTD Recordables: SELECT IFNULL(RUNSUM(# Incidents_Recordable), 0) BY Month (Event) WHERE Date (Event) BETWEEN _Timeline_Start AND _Timeline_End

 

Here is the data table where the issue manifests itself:

The highligted elements in the data table show the YTD count dropping to zero, rather than preserving the YTD from the previous period. Cleveland, for example, should remain a 2 through March and April, as the YTD count was 2 at the end fo February. I’ll note that if Cleveland had an incident in May (outside the range of the image), the YTD count would (and does) correctly reflect the value of 3. Analysis shows that these “gaps” occur whenever a month has no incidents (and there is at least 1 incident in a predecessor month). 

It is clear that the RUNSUM function works correctly when there is an essentially continuous time series (e.g, see YTD hours), but seems to break down if the data stream is iintermittent.

Any thoughts and suggestions would be greatly appreciated.

icon

Best answer by Joseph Heun 19 January 2023, 19:57

View original

5 replies

Is there a more official way around this? 

I’m running into the same problem where I’m trying to calculate a runsum on expenses across multiple projects and plot this by month and week in two different graphs.  Not every project has consistent expenses being logged and while I tried to create a metric that creates a zero value it is not working correctly. Basically there are gaps wherever there are no expenses logged and it is unreasonable to attempt to manually enter $0 expenses across a large number of projects and time. 

This seems like a major flaw in the runsum function. 

Issue solved! I found a way to do this without passing a zero value. It’s a three-step process (some of which could probably be combined):

  1. Use SELECT to aggregate records (assume my metric is called RecordCount)
  2. Create an IF-THEN-ELSE metric (SELECT IF RecordCount > 0 THEN RecordCount ELSE 0 END; assume my metric is called IfThen)
  3. Create a RUNSUM metric (SELECT RUNSUM(IfThen))

In the following chart, notice how, in the first row, the hours increased between Feb and March but YTD Rec Count did not. Previously, because there were no incidents in March, I would get a dash and the YTD TRIR would not calculate. Now I get values!

 

Userlevel 3

Yes that is correct. You will need to pass a zero value in the source data if it has not been created.

Hi, Joseph,

 

We are counting injuries to employees - naturally we prefer not to have any. We normally pass data surrounding actual incidents. It sounds like you are saying that if we don’t have any incidents in a given month, we need to pass a zero value for that month. Am I understanding your statement correctly?

Userlevel 3

Hi Jay, 

This looks to be related to the missing source data. If there is no record of the occurrence in the source data, IFNULL doesn’t create the missing records. The function itself inputs a value but not if the date doesn’t exist. I believe you will need to create the date and null values within the source data in order to achieve your desired results with the RUNSUM function. 

 

If this isn’t the case, then the next thing would be to check all inner metrics used and make sure they are also using the IFNULL function. 

Reply