Another question Given a sparse fact dataset some values of GoodData #gooddata-platform

Another question: Given a sparse fact dataset (so...

Philippe Hébert

02/02/2022, 8:58 PM

Another question: Given a sparse fact dataset (some values of dimensions are missing) and a complete dimension dataset for said sparse dimension on the fact dataset, is there a way I can influence the SQL join order for the generation of an insight such that all values of the dimension are shown? Example: • Revenues per day over a month (show all days of the month) • Revenues per employee (show all employees)

Jakub Sterba

02/03/2022, 12:23 AM

The row with with a month or employee will show in case you have null in the fact, which is being aggregated (value of some other facts may exist, but NULL was loaded into this particular fact).

Jakub Sterba

02/03/2022, 12:33 AM

You can use following trick if record with null does not exist at all in a dataset. Revenue (all data)

Copy code

SELECT inull(Revenue,0)+0

Constant 0 (

select 0

) can be displayed for all values of single attribute (either Employee or Date, but not both in the same insight). The function `ifnull(Revenue,0)`returns either Revenue or 0 if data is missing.

Philippe Hébert

02/04/2022, 1:27 PM

So if I understand properly, you are proposing two options: 1. Append a set of NULL records that span all of the dimensions 2. Use the coalescing formula provided above to create a metric, which will result in the equivalent behaviour as if we had dimension LEFT JOIN fact Am I correct in my understanding?

Jakub Sterba

02/04/2022, 2:27 PM

yes, either load NULL values into fact datasets or use the proposed formula to force the query engine to perform the left join

✅ 1

Philippe Hébert

02/04/2022, 2:30 PM

Great thanks!

Philippe Hébert

02/04/2022, 2:33 PM

I guess the second solution is better than the first in that the default behaviour will be not to add dimension rows devoid of facts, and I can use a different metric if we want to include all dimensions.

Philippe Hébert

02/04/2022, 2:38 PM

Less chances of errors that way

Jakub Sterba

02/04/2022, 2:41 PM

I suggest to try rather the formula because filling the NULLs in sparse fact dataset may probably lead to data explosion if there are many dimensions.

👍 1

Philippe Hébert

02/04/2022, 2:52 PM

Agreed, that's also an important factor!

Open in Slack

Previous Next