Hey channel I am trying to figure out is there a simpler way GoodData #gd-beginners

Hey channel, I am trying to figure out, is there ...

Alex Sher

10/12/2022, 6:34 PM

Hey channel, I am trying to figure out, is there a simpler way to achieve my desired result: • I have several products (2-3) • I have a table with

views

(one record per view, like

view_id

event_date

product

) • I have a table with

conversions

(one record per conversion action, like

conversion_id

event_date

product

) • I have a logical model in GD

event_date

views

conversions

• I want to build a chart / table with following result:

event_date

product

conversion_ratio (count of conversions for current date and product divide by count of views for current date and product

If I will use SQL query, it will look like this (with a bit of simplifications):

Copy code

with daily_views as (
  select event_date, product, count(1) as views_count
  from views
  group by 1, 2
),
daily_conversions as (
  select event_date, product, count(1) as conv_count 
  from conversions
  group by 1, 2
),
joined as (
  select *
  from daily_views v
  left join daily_conversions c on c.event_date = v.event_date AND c.product = v.product
)
select
  event_date,
  product,
  conv_count / views_count
from joined

The crucial part here is the final join on 2 column key (date + product). As far as I understand, GD could not do such joins:

• Tip 4: If you need to join/reference two datasets using a multi-column key (which is not possible in GoodData), generate a separate single attribute which will contain concatenated values of all the columns used in the key. For example, if you plan to join by companyID + orderID, create an attribute which combines these two.

I could follow this tip and prepare my data in SQL in my ETL process, and ship to GD final result. But I am wondering, is there a way to achieve the same result in GD, using models over

views

and

clicks

not aggregated tables (or at least using pre-aggregated

daily_views/conversions

tables)?

Jakub Sterba

10/13/2022, 4:59 AM

Hi @Alex Sher, GoodData implements drill across feature so you do not need to care about complexity of joining data coming from different fact datasets if they use the same dimensions. For example full outer join is performed if you place metric counting number of views and metric showing number of conversions into the same table with dimensions date and product. For arithmetic operation such as calculation of conversion ratio (

select count(conversion_id)/count(view_id) )

will perform inner join between count results coming from conversions dataset and count results coming from views dataset. I think inner join is enough for your use case, but in case you want to include also 0 for products which had a views on given date but no conversions you can achieve right join by using ifnull function:

select ifnull(count(conversion_id),0) / count(view_id)

this will result in using right join. Use of ifnull for both operands would result in full outer join between the operands but that probably does not make sense due to division by zero in the metric. You do not need to do anything special to achieve this in the data model. Just use 1:N relations between product dataset and date dimension dataset to both tables (fact datasets shall be connected to the shared dimensions with arrow pointing towards fact datasets)..

Jakub Sterba

10/13/2022, 5:08 AM

looking again at your question you probably need to refactor your model a bit. You need separate dimension dataset

products

which will contain primary key “`product`” and you have to connect it to both fact datasets. Columns

product

in these fact datasets will be foreign keys. The query engine will then understand that

product

is a conformed dimension.

Jakub Sterba

10/13/2022, 5:58 AM

Model like this should work

Jakub Sterba

10/13/2022, 6:03 AM

But in following model

Product

Conversions

dataset is different attribute from

Product

Views

dataset. You will see two Product attributes in Analyze but you will be able to use always only one of them in insight.

Alex Sher

10/13/2022, 2:05 PM

Hey @Jakub Sterba, thanks a lot for such detailed explanation! I am a little shocked of detail level of help provided by you, this is a huge sale point for GD 🙂 After your explanation, I got that

You need separate dimension dataset
products
which will contain primary key “`product`” and you have to connect it to both fact datasets. Columns
product
in these fact datasets will be foreign keys. The query engine will then understand that
product
is a conformed dimension. (edited)

this is exactly missing piece in my model. Thanks again for pointing this out, I will try to build model this way and ask more questions, if I have some troubles with that approach. Looks pretty straightforward for me 🙂

Open in Slack

Previous Next