Is there a way to create a histogram based upon a ...
# gd-beginners
d
Is there a way to create a histogram based upon a single column of data? Say I have a column with values possible
a, b, c, d, e
. Is there a way to create a histogram where the x axis are the values
a, b, c, d, e
and the y axis are how many times each value occurred in the table?
Also, how do you handle the situation where the values are numbers?
m
Yes, that should be possible and quite simple. You will just need to write a MAQL metric, something like
SELECT COUNT(ID)
where
ID
is a primary key in your table/dataset. (In case of GD Cloud or GD CN m, you can also use
SELECT COUNT(dataset)
). This metric will give you a number of rows. And then use this metric in your chart together with the column. Note taht the column needs to be mapped to attribute in GoodData (not a fact). And you will get for each valie of the column number of rows where it appeared.
d
Oh okay. Well there's another wrinkle, what if I do a
SELECT COUNT (col1, col2)
, and then want to do a histogram of the calculated counts?
m
The two-parametric count in GoodData is a bit special. It does not give you count of the combination, but the first parameter is what you are counting (same as in single parametric COUNT) and the second parameter is a primary key of the dataset where you want to count it. For example COUNT(customer,customerID) would be number of customers in your list of customers. While COUNT(customer,orderID) would be number of customers in your orders tables (=those customers who have some order). This is assuming that you have dataset orders referencing the dataset customers in your logival data model.
If you still want to do count of rows and see it by a combination of multiple attributes (i.e. col1 and col2) you can use the very same metric as in my first reply and just add those two attributes col1 and col2 to your insight to break your metric by them. It will do it by their combination like in a pivot table. Depending on the visualization type you choose, you can have them both on the same axis.
d
Okay, yeah. So maybe I wasn't being as clear about what I needed as necessary. So, what I would like to do is group by a certain column (say
col1
) and then count the number of rows in those groups.
At which point, I would like to create a histogram where the x axis is the number of rows in the group, and the y axis is the number of groups that have that number of rows.
Is this possible to do purely in GoodData?
j
In GD Cloud you can calculate number of rows in dataset per col1 using
Copy code
select count({dataset/dataset_id}) by col1
if you want to count number of col1 which have count in given range you can try following:
Copy code
select count(col1) where (select count({dataset/dataset_id}) by col1) between ... and ...
… are the borders of intervals for the item in histogram. Then add multiple metrics with different ranges into your insight to create histogram.
👍 1
p
🎉 New note created.
d
That worked! Thanks!