How-To: Display Null or Missing Values

  • 3 July 2020
  • 1 reply
  • 922 views

Userlevel 1

How the Platform Works

The GoodData platform is designed to perform inner joins between datasets for metric/measure calculation. Here’s an example of what this looks like:

 

Say we have the following datasets and fields in the LDM:

Z8DH7K1yAIs94_VFwpBu47m_TAgiO9KiB-5g5ndXMrqOJIJ7x0jjl0K4oi0uZQKkM4Be9b3hPIeBXwtsPZVBRfUMAQj6TpFOsm8lQnoqvdam3xTAKW9BAT75pZ7MMzLn3ijd0CQWs9ReZQ5h2y7gxA

Test Dimension Data Values:

Test Score Data Values:

If we wanted to view all SUM(Score) by Test Type, only Test Types with scores will appear:

If you wanted to view all Test Types, regardless if they have an associated record in the Test Score table, there are two ways to display these values: apply a metric definition change for a single dimension you want to join against or generate the missing rows in the ETL.

 

Option 1: Metric change for a single dimension

If there is a single dimension you would like to compare the measure against (i.e. Test Dimension), you could modify the measure to simulate a LEFT OUTER JOIN from that dimension to the fact. 

 

Previous Measure:

SELECT SUM(Score)

 

Modified Measure:

SELECT IFNULL(SUM(Score),0)  + 0

 

q3Rkf_EwthLDaj7UlBaf1I9_AXAblKqy0nwCkhL5qXeYaJSAlZo583VLiAMgXTo6llGvMP-Cu6lbC3N23XY24JUXLWG9UmxKo3up3TAn3elt3DO79p9R063bKVImNGPkMlAk0n4ZcBPI4fHUaBa_-g

The missing records will be populated with a non-null measure value. 

 

Option 2: Generating missing rows in ETL

 

If there are multiple dimensions you would like to compare the measure against in the same report/insight, you have to populate the missing records in the ETL. The cartesian product of all of the dimensions you would like to slice and dice by will need to be loaded into the fact table. New data volume in the fact table with this missing row generation will increase based on the number of attributes you would like to slice and dice by, the cardinality of each of these attributes, and the original data volume of the fact table. 

 

The updated Test Score fact table in our example will be:

Using the original measure of SUM(Score), the report of this measure by Test Type would be the following:

 

Test Type

SUM(Score)

System

180

Regression

-

 

As another example, let’s add another dimension to the LDM:

QqAj34gUt8r1NiagoGxjfVwUU_j91Znx_sJJAo_7S1-NRp1pGKfXrhU11ndnIM-Wu_Ml6fMY2Aihj7U7E0TMXR8NaHnbXN9kkXCA7PfigqErx7FEq4hTdvQ6fliEJd18uRVPEaEff0YOzNBep3wfBQ

 

Say the Test Score data values are as follows:

You would like to report on every combination of Test Type and Test Taken Month/Year Date for Q1 of 2019 (Jan, Feb, Mar 2019). Because there are multiple dimensions you would like to compare the scores against (Test Type and Test Taken Date), you will need to populate the missing data in the backend. The updated Test Score data values will now be:

Now, the report of SUM(Score) by Test Type, Test Taken Date where Test Taken Date = Q1 2019 would be the following:

Test Type

Test Taken Date

SUM(Score)

System

1/1/2019

80

System

2/1/2019

100

System

3/1/2019

-

Regression

1/1/2019

-

Regression

2/1/2019

-

Regression

3/1/2019

-


1 reply

Can you reload the broken image links?

Reply