I have an Orders entity and a Order Line Items ent...
# gooddata-platform
w
I have an Orders entity and a Order Line Items entity. I am trying to make a Metric to get the "Earliest" price from the line items table for each Order Line Item SKU. Meaning, If the customer ordered the same SKU lots of times, I want to get the price they paid on the first order of that item. I am trying SELECT MIN(Price) BY Order Supplier, Order Line SKU where RANK(Date (Order Sent At)) ASC WITHIN( Order Supplier, Order Line SKU ) = 1 . Can you tell me what I am doing wrong?
đź‘€ 1
i
Hi Willie, Happy to help here. May I know what is wrong with your current metric, please? Incorrect results, report not computable,…
I would probably get rid of the RANK function here and rather work with MIN(Date (Order Sent At)), if possible.
w
The KPI validates saves, but it doesn't return any data
I don't think I can use MIN(Order Date) because the first order for each SKU can be different. How would I use MIN(Order Date) exactly?
For context, I am trying to build a metric that gets the first price a customer paid for an item and the latest price a customer paid for an item so I can subtract the two and see how prices are drifting/changing for that item. Meaning, are they paying more now than they were initially or less now than they were initially.
i
Oh, I see. Sorry for my misunderstanding here. I thought that we are speaking about the earliest purchase(by date). Also, may I know what type of Insight are you trying to build here, please?
w
What do you mean by insight? Like a KPI?
I am wanting a KPI that shows average price drift ... then drill into a report that shows the top items (by SKU) that have had the largest price drifts
i
Insights are like “reports” and they are more complex than KPIs. And I am sorry to say so, but it is not possible to drill to the report/Insights from the KPI. Would it be possible to choose one of the available Insight Types, please? Also, I’d like to highlight here that metric building is heavily dependant on your LDM(Logical Data Model). If you are not comfortable with sharing your model publicly, we can move to DMs instead.
m
Hi Willie, I think you were not that far with your original metric. The following approach should work for your case, I believe. I simplified a bit and did not include “supplier” in my metrics, but you can add it if needed. First/last price let’s first define the metrics for first and last price of each SKU: (the only difference between these two is the ASC/DESC in the RANK). `first_sku_price`:
Copy code
SELECT MIN(price) WHERE (
 SELECT RANK( 
   (SELECT Date (Order Date) BY Records of Order Line Item) 
 ) ASC WITHIN (Order Line SKU, ALL OTHER) ) = 1
`last_sku_price`:
Copy code
SELECT MIN(price) WHERE (
 SELECT RANK( 
   (SELECT Date (Order Date) BY Records of Order Line Item) 
 ) DESC WITHIN (Order Line SKU, ALL OTHER) ) = 1
Differences from your original approach is: •
ALL OTHER
in the WITHIN clause (to lock the rank calculation only by SKU and nothing else, no matter what dimensionality the KPI/insight or higher-level metric uses • sub-metric with
(SELECT date BY primary key)
instead of just plain
date
in the RANK function. This is due to the fact that date dimension in GoodData is special and treated as a separate dataset which can be referenced from multiple other datasets (those where the date values actually are). It is a little counter-intuitive, but this is the way how to say “Use the values of date from this dataset”. (and you need to do it even if there is currently only one dataset connected to it). Calculating the average price drift Then with these two metrics prepared, you can use them in another metric:
Copy code
SELECT AVG (
  ( SELECT last_sku_price - first_sku_price BY Order Line SKU, ALL OTHER )
)
Here we are calculating average of the difference of those two metrics. The expression
BY Order Line SKU, ALL OTHER
again forces the difference to be calculated on the SKU level (no matter what the report/kpi/metric context is). On top of that we are applying AVG so this metric will also work as a KPI, giving you the average price drift across all SKUs (unweighted). You can use this same metric both for your KPI (or headline insight to enable drill on it) and for your detailed report/insight viewed by SKU. Attached please see the results on some sample data. I hope this helps.
w
Thank you! Quick question ... do I literally put the text "BY Records of Order Line Item" in the calculation itself? Or were you trying to tell me I need to replace that text with something?
m
Oh, good question, sorry, I forgot to explain this one… you can not just type in this text, you type in just the
BY
and then you need to select an attribute (representing the dataset) from the list. It is an attribute which is in the data model a primary key of your
Order Line Item
dataset. That is either: • If you have single-attribute primary key (i.e.
Order Line Item ID
) you select that one. • Or if you do not have single-attribute primary key (either have compound primary key from multiple attributes/references or no primary key at all) GoodData automatically created a pseudo-attribute called
Records of (name of dataset)
for you. And you select that one (it will be in the list).
w
I got it working! But I have an additional clarification. I have been using the BY keyword a lot in my calculations, but all of my other calculations don't use the ", ALL OTHER" add-on. I am trying to figure out if I should add it to my other similar calculations. To clarify, does the ", ALL OTHER" keyword just protect the calculation from other attributes of the same dimension (for instance, other attributes in the SKU table) from affecting the calculation? It doesn't change the calculation itself, and it doesn't have any affect on attributes from other dimensions being in the same insight. Its just to make sure the calculation doesn't break if other attributes of the same dimension are added to the insight. Did I get that right?
And If I actually wanted the price drift to change/update based on other attributes of product that might included in the insight from the product dimension, I should remove the ", ALL OTHER" ? And also to clarify, since we didn't put the ", ALL OTHER" on the order date portion, I am guessing the picking the first and last price paid will adjust adjust based on the date filter applied to the report?
One more question I just discovered. When I pull this metric into an analysis report, the date attribute disappears because it thinks it incompatible. Do you know why this is happening?
m
Good to hear it worked for you đź‘Ť Let me try to answer your questions: BY vs BY ALL OTHER Normally the metrics working with the context of the insight/report - that is what attributes are used to slice the data. If you add some attribute to the report, the system is trying to calculate the metrics also by it. So if you have for example
SELECT SUM(x) BY Customer
and use this in an insight with
Customer
and
Product
attributes in it, it automatically calculates it as if it was
BY Customer, Product
(only if this combination is compatible from the data model perspective). If you have
SELECT SUM(x) BY Customer, ALL OTHER
it will behave differently - if you view this metric by
Customer
and
Product
, it will only break it by Customer and it will repeat the same values (calculated by each
Customer
) for different values of
Product
for that particular customer. And it does this not just for LDM-compatible attributes, but for any attribute. So if the metric has BY ALL OTHER in it, it can be used also with unrelated attributes from other part of the model (and you get the total number repeated for all their values). So I would recommend only using the ALL OTHER when you want to exactly lock the dimension calculation of a metric and make sure it does not change based on which attributes the metric is used with. For sure not in every metric “just in case”. The same is also for the simple BY - use only when needed. removing ALL OTHER in this case to allow slicing by other attributes So answer to
And If I actually wanted the price drift to change/update based on other attributes of product that might included in the insight from the product dimension, I should remove the “, ALL OTHER” ?
is YES. Same as if you want to see it by date (i.e. difference of first and last SKU price within each month) - see the screenshot, here I removed the ALL OTHER and am now able to view it by month. How filters work with all this Yes, report and dashboard filters (including date filters) are still applied to the calculation (actually to all its parts). The
BY
and
BY ALL OTHER
have no impact to the filters actually (similarly how in SQL the WHERE section still applies to GROUP BY). If you want to affect what filters are or are not applied to a metric, you can use another concept for overriding parent filters: •
WITHOUT PARENT FILTER
(or
WITHOUT PF
for short) makes the metric ignore any filter •
WITH PF EXCEPT …
makes it ignore only the explicitly named ones •
WITHOUT PF EXCEPT …
ignores all filters but the explicitly named ones And you can combine this with
BY / BY ALL
So for example a metric
SELECT MAX(price) BY ALL OTHER WITHOUT PF
gives you a global maximal price (across all SKUs, all orders, all suppliers, all time - no matter what filters you use and with what attributes in the report). And it will give you the same number for any value of any attribute…
w
Wow... what an excellent explanation! You should be writing how-to books! 🙂 I actually feel like I totally understand this now (even though its complicated to think through the implications for each individual metric I am building) Can you answer the question of why the Date disappears as an option to include on the Analysis when I pull the new price drift calculation on to the metrics?
In addition to the question above, I keep getting these date warnings when I am building reports. I don't know what I can do differently to make them go away. I built a metric for Average Order Count By Location and Month. Every time I try to pull a date filter on the report that includes Order Sent Date it gives me this warning but it still lets me save the report and it looks like it is working as expected. Can you help?
m
Thanks. I actually did write a few how-to articles 🙂 Great to hear you now feel you understand it. I know it can be challenging and complex, especially without seeing “inside”. What can help is to use the “Explain resource” to show you how a particular report/insight is calculated (and potentially also show the SQL behind it). The SQL there might be a bit difficult to read, given the table and column names are generated and the physical model does not exactly match the logical one, but it can help sometimes. I quite like the “optimized query tree” (another type of explain resource output) which displays the logic of the calculation as a tree - at the top are the metrics with dimensionality of the report (attributes in the report) and at the very bottom are raw facts and attributes. And in between, there are the operations/nodes. Each node also shows its dimensionality (in square brackets) on which the operation is applied. So you can for example tell that it first applies the calculation of price and its difference on SKU+month level and then applies AVG just on the month level (in case I use the drift metric in a report just with month and not SKU). The output is a bit technical and rough, the attributes and facts are represented by their IDs (and you need to point the mouse to it to see the human readable title). Also it can be a bit overwhelming for complex metrics, but it is IMO good for debugging. What can help to utilize it more is (if you are using Chrome browser) to use the GoodData Extension for Chrome. With that, you can easily get to the explain resource from the insight with a few clicks and apart from that, it has many other useful features for developers working with GoodData Platform. To be honest I am not sure why the date is disappearing. It would mean that the system does not consider this metric sliceable by the date. I don’t know how exactly your data model looks like and how the date is connected to the other parts. In my simplified example I have the date connected to Order (see the screenshot) and with the modified drift metric (without those ALL OTHER), it is letting me do the drift by date. Maybe your date is connected somehow differently? Maybe to the Order Line Items?
For the filter warnings - I believe this is happening when a metric is not sliceable by that attribute/date (but still can be filterable by it). And as far as I know if that is really the case (usually with quite complex metrics), there is no way how to avoid them unfortunately. I have one idea for this particular case though. I am wondering if in your case it could be caused by the single parametric
COUNT (Order ID)
. I assume in your data model (similarly to mine), the
Order ID
is a primary key in
Orders
and foreign key in
Order Line Items
, right? Well for GoodData these are just two places where the attribute
Order ID
can be counted (in
Orders
and in
Order Line Items
) with potentially different results and different granularity of the result. If the function COUNT in GoodData is used with just one parameter, it tries to be “smart” and guess which of these you want to actually count based on the context. I am not sure if the filter warning works with this guessing and full context. Especially if you have the date connected to line items, it would mean that one of these counts would be filterable by the date and the other would not (from the GoodData model perspective). Personally I usually do not want my COUNT metric to switch where they are counted depending on how they are used, so I prefer the two-parametric COUNT (where the second parameter is primary key attribute which explicitly defines the dataset where to count). I am not sure if in your case this matters result-wise (i.e. if there are any orders without line items) but in some cases it can be significantly different (i.e. count of existing teams vs count of teams with assigned task). So I would recommend trying to switch to two-parametric count and see if the warning is still there.
w
Thank you for the explanation and the troubleshooting tips! Through trial and error, I figured out that any time you put the ", ALL OTHER" in a calculation, it essentially prohibits you from adding filters and other dimensional items to insights. So the ", ALL OTHER" clause has some pretty big tradeoffs when trying to use it in reports (especially reports with dates which most of mine are).
m
Hmm, that should not be happening with ALL OTHER as far as I know. Could you please share some example of a metric that does not work with some filter, including screenshot of the relevant part of the model?