I don't know how this is occurring, sometimes, thi...
# gooddata-cloud
d
I don't know how this is occurring, sometimes, this row does not give the right answer. The equation is simple Dismissed % = Total Dismissed / Total Shown Where Dismissed = each row where shown = 1 and accepted = 0 We also get the same error with Accepted = each row where accepted = 1 and shown = 1 For some reason the equation usually works but for some rows like the one shown you can see that shown =2, dismissed =1, accepted = 1 But both dismissed % and accepted % are both 100%. If I change the group by on this, it gives other % for each one, but dismissed % + accepted % > 100% The two images are the same data on the same date but adding additional dimensions. You'll notice when I add more dimensions shown goes from 9 to 11 - which explains the above than 100%. It seems its not adding all the showns, but not sure why
m
Hi Doug, in order to understand better could you please provide us the MAQL query to see how did you create the metric/column 'Dismissed' ?
d
I'll send you the link. I did it three ways. I event used the UI once and divded and got the same anwer
Dismissed 1st way SELECT COUNT(table) where accepted = 1 remindershown = 1 2nd way SELECT SUM(CASE WHEN accepted = 1 and remindershown = 1 THEN 1 ELSE 0 END) Shown 2 ways Sum shown (this is in UI) SELECT COUNT(table) where remindershown = 1 Dismissed % IN UI - using Division In mAQL SELECT Dismissed/remindershown (used many combos)
m
Hi Doug, I recreated the table and I tried with the UI in mAQL and I get the 50%, is that what do you need?
d
Mhh, it must not be completly recreated. I also have it in the UI and I'm getting 100%
m
Hi Doug, After an exhaustive revision and testing with help of colleagues, we nailed the issue. The problem is related with the IDs, the IDs defined as primary key has to be unique as recommend in here https://www.gooddata.com/docs/cloud/model-data/concepts/primary-key/ : 'Make sure that each value of a primary key is unique. If the data contains multiple rows with identical primary key values, you will experience wrong results in visualizations.' Thus, if the id is repeated N times in the model, the value eventually will be multiplied N times as your issue. The corresponding id appers twice in the model, meaning your 50% will be 100%. I would recommend ensure all yours primary keys are unique values in the whole model. I will provide more information internally. Cheers, Mauricio.
d
Reference tables do not need to be unique (that would not make sense for table that are 1:many)
m
Hi Doug, I believe the confusion here is that Attribute/Primary Key is not the same as attribute value. What my colleague pointed out previously, is that for the attribute "Case Interaction Id" there are some values in it that are showing multiple times, and they coincide with the rows that are showing the discrepancies (for example "9 to 11" in the deleted screenshot) this is not considered good practice because the attribute itself has been designated as a primary key (this is also mentioned in the documentation sent previously) After checking your data model, I can see the same attribute is marked as a Primary Key on multiple datasets, this shouldn't be an issue since from my checking I believe the ID's change based on the context of the dataset, but the uniqueness is not ensure on the values of all of them as discussed with my colleague privately. With this said, you are right when stating that reference tables do not need to be unique, but note that in 1:N relationships, one record in the “one” table can relate to multiple records in the “many” table, but each record in the “many” table should relate to only one record in the “one” table. The bottom line is that The primary key in the “1” table is unique, this primary key is referenced as a foreign key in the “N” table (like department_id in employees), but it does not need to be unique there. Many employees can share the same department_id, creating the one-to-many relationship.
d
@Moises Morales you are describing exactly our set up. There is multiple Topics per case. Each topic only points to 1 case. there are 0 duplicates in the cases table. So when you keep saying there are duplicates I have no idea what your talking about because the Topics table many topics point to the same case interaction (so yes in topics there will be duplicates of that primary key but by definition of 1:many thats by design). on the cases table as you mentioned at the bottom of the paragraph. There is no duplication in the cases table. I feel like I'm saying the same thing over and over again and not getting anywhere. I don't understand the issue from your explanation as you said its both makes sense and wrong and I don't know how to fix it as our set up between Topics(many):Cases(1) makes sense and I already confirmed in the cases table there are 0 duplications of case_interaction_id If possible can we jump on a 15 min call.
m
The issue is here for example with this value. It's repeating 6 times in the attribute case_interaction_id. Could you check the attribute in the dataset "call topic" in your database and look for the value in the screenshot? If it's repeating, you will need to remove the extra values. If this is the desired setup, then you will need to design a different attribute as the primary key since the values must be unique.
d
Yes Case Interaction is unique to cases. Not Topic. I already know that ( no tables or proof needed) the field Case_interaction_ID in topic can be there 6 time because there can be 6 topics in 1 case. Even if we did add a new TOPIC ID and set that as the primary key (which is what I think your saying). When you join in the LDM you can only join on the Primary Key. The only Key that both tables have in common is the case_interaction_id so I would need to set that up as the primary key anyways
The desired set up is to have a 1:many table join where and they add up correctly. Case Topic Cases case_intearction_id (foreign Key) case_inteaction_id (this i the primary key for the table) I want to join and things add up correctly
I could switch the arrow which I believe would allow me not to worry about setting the primary key for the topic table and get around this problem but then Id be slicing the filter the facts from call topic by the attributes in call cases which is not what i want.
So this simple join can not be done in your LDM the way our data is split. We have three tables in this example Cases All information about the case -> This table has attributes and facts about the case Call Cases Some cases involve calls and if there is a call there is data ->This table has attributes and facts about the case These join 1:1 (however if there is no call there is no data) Call Topic This is the different topics that were on the case ->This table and attributes and facts that are about the case Most of our attributes are on the cases table which is why we have the arrow the way we have it. As I realized that these connecting arrows do not work like JOINs in SQL They are much more limited. I think the real problem is not our case_interaction_id's have multiple rows in a many to one join. Is that most of our attributes are on the cases table so we need to arrow pointing the way we point but that also makes us set a primary key where we don't want to. I think by the restrictions of the platform and our model we have to do an overhaul of how our data comes in. Maybe have a cases_attribute table that Points to a Topic Fact table and a cases fact table and have the arrow pointing the other way? I'm very unsure of the right solution here. From this back and forth I think that our LDM needs to change but in a way that won't mess up our current dashboards too much.
To respond to your Remove repeating values. There can be Multiple topics per a case. I can't remove the repeating values as thats the MANY to 1.
m
Unfortunately I am not familiar with the business use case of your model, I am only able to point out where the issue is and provide suggestions how to fix it, but ultimately, your model redo will be completely up to you, although we can also offer the assistance of our Professional services for this (please let me know if you are interested). I understand there can be multiple topics per case, but the points still stands, since the values in your primary key are not unique, this is creating the discrepancies in your reporting. To illustrate this, please consider the following tables: department_id | department_name 1 HR 2 IT 3 Marketing "department_id" is the primary key and the values are unique, however, when referencing the values via the foreign key in the table below, these don't have to be unique in case of a 1:N relationship: employee_id | employee_name | department_id 22 Steven 1 23 Larry 1 24 Michael 1 25 Daniel 3
d
@Moises Morales This back and forth is very circular. I think we both understand that eachothers point. I'm no closer to the solution Your example is essentially the same thing as what I have. i just have the arrow the other way ( and am forced to select a primary key on topic even though I don't want to) as I need to aggregate on cases. This is the actual problem.