Analyzing "multi-select" values

Question

Hello,I think I need a guidance on the data modeling issue.In our app we have multi-select fields that we want to analyze in GoodData. From the user perspective it's possible to fill out some forms with fields of different types, e.g. text fields/numeric fields and select fields. The select fields are basically single select and multi select ones. All the fields are custom (means they are a separate entity in the database and can be created/deleted/modified, they are NOT a static attribute).From the database perspective we have a model called “Scoop” which can have many “Fields”, and those “Fields” can have multiple “Choices” if the field type is multi-select. And now we want to analyze such fields and their values. Example:Field: Panel TypesField Choices: Panel 1, Panel 2Report: Show % of scoops with Panel1 and Panel2 vs. those with just Panel1. (pie chart)Is that possible to achieve?If you need any more details just let me know.Thanks in advance!

Boris · Accepted Answer

Hi Niko,It should be possible to do this, although there is no single direct solution. I would come down to the format of how the data are exported from your application.Without further knowledge of your solution I can propose two ways of how to model this:1: Duplicated lines for each user for each selection:UserChoiceUser_1Panel 1User_1Panel 2User_2Panel 1User_3Panel 1User_3Panel 2Then you can solve your case with followingmetrics:First you count number of choices per user (named Choice by User):SELECT COUNT(Choice) BY UserThen you count number of users with only one choice and with both choices:SELECT COUNT(User) WHERE Choice by User=1 (named i.e 1Choice)SELECT COUNT(User) WHERE Choice by User=2 (named i.e. 2Choice)Then you can visualize this in pie chart by putting these two metrics.To have percentages you can work further with these numbers, i.e.:Percentage of 1 = 1Choice / (1Choice + 2Choice)Of course your use case might be more complicated and you might have more than two choices, but that can be handled by filters directly in the metric:I.e.:SELECT COUNT(Choice) BY UserWHERE ChoiceIN (Panel 1,Panel 2)Also, with this approach, you might need to generate some unique ID to identify each line (or use compound key User + Choice).2: Second way would be to have a separate attribute for every choice:UserPanel 1Panel 2User111User210User311And the metrics would be similar - you would be creating separate metrics to count the users that selected the values.Third approach would be to have all the selected choices in one string, i.e. “Panel 1,Panel 2” and use LIKE clause to extract the individual values, but I don’t think this is optimal, it would definitely require the most work at the end.If would say the first option is more flexible - you don’t have to change the model if you would add a new field, but the metrics would be more intuitive in the second option.But as mentioned, it would come down to how you are able to extract the data.I hope this helps.Best regards,Boris from GoodData Support team

niko · Answer

@Boristhanks a lot Boris for that explanation. I just wanna ask about option 2, because in my app I can have a lot of custom choices which I believe would not be suitable if I choose this option? Would that require creating separate dataset and putting there placeholders for that? At the moment it’s not a problem to make data transformation on my side, since I’m uploading custom CSV to work with GoodData.Thank you

Analyzing "multi-select" values

3 replies

Reply

User	Choice
User_1	Panel 1
User_1	Panel 2
User_2	Panel 1
User_3	Panel 1
User_3	Panel 2

User	Panel 1	Panel 2
User1	1	1
User2	1	0
User3	1	1

Reply

Sign up

Social Login

Login to the community

Social Login

Scanning file for viruses.

This file cannot be downloaded