I have an Orders entity and a Order Line Items entity I am t GoodData #gooddata-platform

I have an Orders entity and a Order Line Items ent...

Willie Doyle

08/28/2023, 4:35 PM

I have an Orders entity and a Order Line Items entity. I am trying to make a Metric to get the "Earliest" price from the line items table for each Order Line Item SKU. Meaning, If the customer ordered the same SKU lots of times, I want to get the price they paid on the first order of that item. I am trying SELECT MIN(Price) BY Order Supplier, Order Line SKU where RANK(Date (Order Sent At)) ASC WITHIN( Order Supplier, Order Line SKU ) = 1 . Can you tell me what I am doing wrong?

👀 1

Ivana Gasparekova

08/28/2023, 4:44 PM

Hi Willie, Happy to help here. May I know what is wrong with your current metric, please? Incorrect results, report not computable,…

Ivana Gasparekova

08/28/2023, 4:48 PM

I would probably get rid of the RANK function here and rather work with MIN(Date (Order Sent At)), if possible.

Willie Doyle

08/28/2023, 4:54 PM

The KPI validates saves, but it doesn't return any data

Willie Doyle

08/28/2023, 4:57 PM

I don't think I can use MIN(Order Date) because the first order for each SKU can be different. How would I use MIN(Order Date) exactly?

Willie Doyle

08/28/2023, 5:00 PM

For context, I am trying to build a metric that gets the first price a customer paid for an item and the latest price a customer paid for an item so I can subtract the two and see how prices are drifting/changing for that item. Meaning, are they paying more now than they were initially or less now than they were initially.

Ivana Gasparekova

08/28/2023, 5:31 PM

Oh, I see. Sorry for my misunderstanding here. I thought that we are speaking about the earliest purchase(by date). Also, may I know what type of Insight are you trying to build here, please?

Willie Doyle

08/28/2023, 8:33 PM

What do you mean by insight? Like a KPI?

Willie Doyle

08/28/2023, 8:34 PM

I am wanting a KPI that shows average price drift ... then drill into a report that shows the top items (by SKU) that have had the largest price drifts

Ivana Gasparekova

08/29/2023, 2:27 PM

Insights are like “reports” and they are more complex than KPIs. And I am sorry to say so, but it is not possible to drill to the report/Insights from the KPI. Would it be possible to choose one of the available Insight Types, please? Also, I’d like to highlight here that metric building is heavily dependant on your LDM(Logical Data Model). If you are not comfortable with sharing your model publicly, we can move to DMs instead.

Michal Hauzírek

08/29/2023, 7:06 PM

Hi Willie, I think you were not that far with your original metric. The following approach should work for your case, I believe. I simplified a bit and did not include “supplier” in my metrics, but you can add it if needed. First/last price let’s first define the metrics for first and last price of each SKU: (the only difference between these two is the ASC/DESC in the RANK). `first_sku_price`:

Copy code

SELECT MIN(price) WHERE (
 SELECT RANK( 
   (SELECT Date (Order Date) BY Records of Order Line Item) 
 ) ASC WITHIN (Order Line SKU, ALL OTHER) ) = 1

`last_sku_price`:

Copy code

SELECT MIN(price) WHERE (
 SELECT RANK( 
   (SELECT Date (Order Date) BY Records of Order Line Item) 
 ) DESC WITHIN (Order Line SKU, ALL OTHER) ) = 1

Differences from your original approach is: •

ALL OTHER

in the WITHIN clause (to lock the rank calculation only by SKU and nothing else, no matter what dimensionality the KPI/insight or higher-level metric uses • sub-metric with

(SELECT date BY primary key)

instead of just plain

date

in the RANK function. This is due to the fact that date dimension in GoodData is special and treated as a separate dataset which can be referenced from multiple other datasets (those where the date values actually are). It is a little counter-intuitive, but this is the way how to say “Use the values of date from this dataset”. (and you need to do it even if there is currently only one dataset connected to it). Calculating the average price drift Then with these two metrics prepared, you can use them in another metric:

Copy code

SELECT AVG (
  ( SELECT last_sku_price - first_sku_price BY Order Line SKU, ALL OTHER )
)

Here we are calculating average of the difference of those two metrics. The expression

BY Order Line SKU, ALL OTHER

again forces the difference to be calculated on the SKU level (no matter what the report/kpi/metric context is). On top of that we are applying AVG so this metric will also work as a KPI, giving you the average price drift across all SKUs (unweighted). You can use this same metric both for your KPI (or headline insight to enable drill on it) and for your detailed report/insight viewed by SKU. Attached please see the results on some sample data. I hope this helps.

Willie Doyle

08/29/2023, 7:47 PM

Thank you! Quick question ... do I literally put the text "BY Records of Order Line Item" in the calculation itself? Or were you trying to tell me I need to replace that text with something?

Michal Hauzírek

08/29/2023, 7:54 PM

Oh, good question, sorry, I forgot to explain this one… you can not just type in this text, you type in just the

BY

and then you need to select an attribute (representing the dataset) from the list. It is an attribute which is in the data model a primary key of your

Order Line Item

dataset. That is either: • If you have single-attribute primary key (i.e.

Order Line Item ID

) you select that one. • Or if you do not have single-attribute primary key (either have compound primary key from multiple attributes/references or no primary key at all) GoodData automatically created a pseudo-attribute called

Records of (name of dataset)

for you. And you select that one (it will be in the list).

Willie Doyle

08/30/2023, 12:49 AM

I got it working! But I have an additional clarification. I have been using the BY keyword a lot in my calculations, but all of my other calculations don't use the ", ALL OTHER" add-on. I am trying to figure out if I should add it to my other similar calculations. To clarify, does the ", ALL OTHER" keyword just protect the calculation from other attributes of the same dimension (for instance, other attributes in the SKU table) from affecting the calculation? It doesn't change the calculation itself, and it doesn't have any affect on attributes from other dimensions being in the same insight. Its just to make sure the calculation doesn't break if other attributes of the same dimension are added to the insight. Did I get that right?

Willie Doyle

08/30/2023, 1:04 AM

And If I actually wanted the price drift to change/update based on other attributes of product that might included in the insight from the product dimension, I should remove the ", ALL OTHER" ? And also to clarify, since we didn't put the ", ALL OTHER" on the order date portion, I am guessing the picking the first and last price paid will adjust adjust based on the date filter applied to the report?

Willie Doyle

08/30/2023, 2:31 AM

One more question I just discovered. When I pull this metric into an analysis report, the date attribute disappears because it thinks it incompatible. Do you know why this is happening?

Michal Hauzírek

08/30/2023, 10:32 AM

Good to hear it worked for you 👍 Let me try to answer your questions: BY vs BY ALL OTHER Normally the metrics working with the context of the insight/report - that is what attributes are used to slice the data. If you add some attribute to the report, the system is trying to calculate the metrics also by it. So if you have for example

SELECT SUM(x) BY Customer

and use this in an insight with

Customer

and

Product

attributes in it, it automatically calculates it as if it was

BY Customer, Product

(only if this combination is compatible from the data model perspective). If you have

SELECT SUM(x) BY Customer, ALL OTHER

it will behave differently - if you view this metric by

Customer

and

Product

, it will only break it by Customer and it will repeat the same values (calculated by each

Customer

) for different values of

Product

for that particular customer. And it does this not just for LDM-compatible attributes, but for any attribute. So if the metric has BY ALL OTHER in it, it can be used also with unrelated attributes from other part of the model (and you get the total number repeated for all their values). So I would recommend only using the ALL OTHER when you want to exactly lock the dimension calculation of a metric and make sure it does not change based on which attributes the metric is used with. For sure not in every metric “just in case”. The same is also for the simple BY - use only when needed. removing ALL OTHER in this case to allow slicing by other attributes So answer to

And If I actually wanted the price drift to change/update based on other attributes of product that might included in the insight from the product dimension, I should remove the “, ALL OTHER” ?

is YES. Same as if you want to see it by date (i.e. difference of first and last SKU price within each month) - see the screenshot, here I removed the ALL OTHER and am now able to view it by month. How filters work with all this Yes, report and dashboard filters (including date filters) are still applied to the calculation (actually to all its parts). The

BY

and

BY ALL OTHER

have no impact to the filters actually (similarly how in SQL the WHERE section still applies to GROUP BY). If you want to affect what filters are or are not applied to a metric, you can use another concept for overriding parent filters: •

WITHOUT PARENT FILTER

(or

WITHOUT PF

for short) makes the metric ignore any filter •

WITH PF EXCEPT …

makes it ignore only the explicitly named ones •

WITHOUT PF EXCEPT …

ignores all filters but the explicitly named ones And you can combine this with

BY / BY ALL

So for example a metric

SELECT MAX(price) BY ALL OTHER WITHOUT PF

gives you a global maximal price (across all SKUs, all orders, all suppliers, all time - no matter what filters you use and with what attributes in the report). And it will give you the same number for any value of any attribute…

Willie Doyle

08/30/2023, 2:48 PM

Wow... what an excellent explanation! You should be writing how-to books! 🙂 I actually feel like I totally understand this now (even though its complicated to think through the implications for each individual metric I am building) Can you answer the question of why the Date disappears as an option to include on the Analysis when I pull the new price drift calculation on to the metrics?

Willie Doyle

08/30/2023, 8:03 PM

In addition to the question above, I keep getting these date warnings when I am building reports. I don't know what I can do differently to make them go away. I built a metric for Average Order Count By Location and Month. Every time I try to pull a date filter on the report that includes Order Sent Date it gives me this warning but it still lets me save the report and it looks like it is working as expected. Can you help?

Michal Hauzírek

08/30/2023, 9:09 PM

Thanks. I actually did write a few how-to articles 🙂 Great to hear you now feel you understand it. I know it can be challenging and complex, especially without seeing “inside”. What can help is to use the “Explain resource” to show you how a particular report/insight is calculated (and potentially also show the SQL behind it). The SQL there might be a bit difficult to read, given the table and column names are generated and the physical model does not exactly match the logical one, but it can help sometimes. I quite like the “optimized query tree” (another type of explain resource output) which displays the logic of the calculation as a tree - at the top are the metrics with dimensionality of the report (attributes in the report) and at the very bottom are raw facts and attributes. And in between, there are the operations/nodes. Each node also shows its dimensionality (in square brackets) on which the operation is applied. So you can for example tell that it first applies the calculation of price and its difference on SKU+month level and then applies AVG just on the month level (in case I use the drift metric in a report just with month and not SKU). The output is a bit technical and rough, the attributes and facts are represented by their IDs (and you need to point the mouse to it to see the human readable title). Also it can be a bit overwhelming for complex metrics, but it is IMO good for debugging. What can help to utilize it more is (if you are using Chrome browser) to use the GoodData Extension for Chrome. With that, you can easily get to the explain resource from the insight with a few clicks and apart from that, it has many other useful features for developers working with GoodData Platform. To be honest I am not sure why the date is disappearing. It would mean that the system does not consider this metric sliceable by the date. I don’t know how exactly your data model looks like and how the date is connected to the other parts. In my simplified example I have the date connected to Order (see the screenshot) and with the modified drift metric (without those ALL OTHER), it is letting me do the drift by date. Maybe your date is connected somehow differently? Maybe to the Order Line Items?

Michal Hauzírek

08/30/2023, 9:29 PM

For the filter warnings - I believe this is happening when a metric is not sliceable by that attribute/date (but still can be filterable by it). And as far as I know if that is really the case (usually with quite complex metrics), there is no way how to avoid them unfortunately. I have one idea for this particular case though. I am wondering if in your case it could be caused by the single parametric

COUNT (Order ID)

. I assume in your data model (similarly to mine), the

Order ID

is a primary key in

Orders

and foreign key in

Order Line Items

, right? Well for GoodData these are just two places where the attribute

Order ID

can be counted (in

Orders

and in

Order Line Items

) with potentially different results and different granularity of the result. If the function COUNT in GoodData is used with just one parameter, it tries to be “smart” and guess which of these you want to actually count based on the context. I am not sure if the filter warning works with this guessing and full context. Especially if you have the date connected to line items, it would mean that one of these counts would be filterable by the date and the other would not (from the GoodData model perspective). Personally I usually do not want my COUNT metric to switch where they are counted depending on how they are used, so I prefer the two-parametric COUNT (where the second parameter is primary key attribute which explicitly defines the dataset where to count). I am not sure if in your case this matters result-wise (i.e. if there are any orders without line items) but in some cases it can be significantly different (i.e. count of existing teams vs count of teams with assigned task). So I would recommend trying to switch to two-parametric count and see if the warning is still there.

Willie Doyle

08/31/2023, 5:07 PM

Thank you for the explanation and the troubleshooting tips! Through trial and error, I figured out that any time you put the ", ALL OTHER" in a calculation, it essentially prohibits you from adding filters and other dimensional items to insights. So the ", ALL OTHER" clause has some pretty big tradeoffs when trying to use it in reports (especially reports with dates which most of mine are).

Michal Hauzírek

08/31/2023, 7:55 PM

Hmm, that should not be happening with ALL OTHER as far as I know. Could you please share some example of a metric that does not work with some filter, including screenshot of the relevant part of the model?

3 Views

Open in Slack

Previous Next