Hi I hope it s ok to ask a MAQL question here I am still lea GoodData #gd-beginners

Hi! I hope it’s ok to ask a MAQL question here. I ...

Allan Talver

09/24/2021, 2:21 PM

Hi! I hope it’s ok to ask a MAQL question here. I am still learning to write the queries and I’m stuck with something. I am attaching a snippet of my LDM covering the relevant part for what I am working on. The JIRA dataset is collapsed because it has so many facts and attributes, which would add tons of irrelevant details. It is relevant to know that JIRA dataset holds information about employees: • Username - attribute which is an identifier for each employee • Company they work for (reference to Companies dataset) • Day of employment (reference to Worked Date) • Availability - A fact that the person was employed on a day (represented as value of 1) Working Days dataset holds a workday/holiday pattern for each company dictated by local employment regulations (1 if the day is workday, 0 if it’s not). What I am trying to do is to build a metric which eventually can be aggregated by company or even the whole group of companies, showing the number of full time employees (FTE) in a selected time period. My plan is to achieve that by finding: 1. Number of workdays for each employee in a chosen time period (SUM of Workdays in JIRA dataset for that person in the selected time period) 2. Number of workdays in the company of that person in a chosen time period (SUM of Workdays in Working Days dataset for the company that is associated with the person in the JIRA dataset) And then dividing the first number by the second. For a person that was employed throughout the chosen time period, the result is 1 because the amount of workdays for the individual person matches the number of workdays for their company. However, if the person only worked part of the period, then their FTE number is less than 1. I am struggling with handling the calculation for persons who have joined in the middle of the period. Check the second screenshot, representing one person per line. 14 and 15 of August on the screenshot represent a weekend, correctly showing that it was not a working day for the employee, nor the company. If you look at the 5th line, then you start seeing the problem. The line shows a person whose employment started on 16 August and this is correctly shown by the Employee WD metric, but the problem is that Company WD is not even shown for rest of the chosen time period. Meaning that if I add up the numbers for the person over the time period, I still get Employee WD matching the Company WD and the resulting FTE result is 1. I am thinking that this behaviour happens because the connection point between the person and the company only exists for the days for which I have records in the JIRA dataset. Is there a way to build a metric which would find the number of Company WD for the full chosen time period, not only for the days the person worked? I guess one way to achieve that is to populate the JIRA dataset with 0 values for the days when the person wasn’t employed, but is there perhaps a more elegant solution?

Allan Talver

09/24/2021, 2:21 PM

I’ll post here screenshots of the metrics that I have build based on this LDM.

Allan Talver

09/24/2021, 2:24 PM

Employee WD (you can ignore the Unpaid Absence part, which is not really relevant for this discussion).

Allan Talver

09/24/2021, 2:24 PM

Company WD

Allan Talver

09/24/2021, 2:25 PM

FTE

Allan Talver

09/24/2021, 2:28 PM

By the way, this topic is essentially a follow-up to what was discussed here: https://community.gooddata.com/metrics-and-maql-57/struggling-with-maql-aggregation-and-filtering-224 After the Community post I eventually still decided going forward with importing the company workday/holiday patterns into GoodData. At first I tried to avoid doing that.

Allan Talver

09/24/2021, 2:28 PM

I would be very grateful for any help/ideas/insights. I have been stuck with this for some time now. 🙂 Thanks!

Jakub Sterba

09/24/2021, 4:43 PM

Hi @Allan Talver, Let me give you some hints which may help. There can be problem with “NULL arithmetics” in the expression:

Employee WD / Company WD

If there is no record for employee in Jira dataset the metric Employee for that date has probably NULL value. The result of division is: • 1/1=1 • 0/1=0 • null/1=null I suggest to wrap Employee WD by IFNULL function:

IFNULL(Employee WD,0) / Company WD

Query engine will interpret null value (missing data) as 0 and result of IFNULL(null,0)/1 should be 0. I suggest not to use Records of Jira in metric expression of Company WD. So that it returns value even for days for which none record exists. I suggest something like:

select sum(select ifnull(sum(Availability -Unpaid (Absence)/8),0)/sum(Workday) by Username, Date(Work Date))

Allan Talver

09/24/2021, 4:48 PM

Hi @Jakub Sterba, Thanks for the quick response. I will delve deeper into your suggestions in a moment, but I can comment on the “Records of Jira”. I used that as a link to match a Company for an employee (Username attribute in the Jira dataset). Basically I was trying to follow the logic explained in this article: https://help.gooddata.com/doc/enterprise/en/dashboards-and-insights/maql-analytical-query-language/maql-use-cases-and-tutorials/explicit-lifting

Allan Talver

09/24/2021, 4:51 PM

I believe that I can’t SUM workdays (in Working Days) dataset by Username (in Jira) dataset directly. Or can I?

Jakub Sterba

09/24/2021, 4:51 PM

I see, Username is just in Jira dataset. Let me think about it again 🙂

Jakub Sterba

09/24/2021, 5:15 PM

What if you do this?

select count(Username) where

(select

(select sum(Availability - Unpaid (Absence)/8 ))

(select max(select (select sum(Workday) by Company) by Records of Jira))

by Username)=1

Jakub Sterba

09/24/2021, 5:16 PM

The idea is to compute number of work days per company and repeat the result in each record of Jira dataset using the lifting.

Allan Talver

09/24/2021, 5:18 PM

Hmm, let me try that.

Jakub Sterba

09/24/2021, 5:18 PM

so in a context of a month, total number of workdays of a company can be 20 and this value will be used for each record in Jira dataset of all the company employees

Jakub Sterba

09/24/2021, 5:20 PM

employee days will be calculated as SUM and MAX function is used to pass the company days to Username level

Jakub Sterba

09/24/2021, 5:22 PM

I am not sure how complex is the use case and if you need to care of User being in relation with multiple companies. This would add additional complexity.

Allan Talver

09/24/2021, 5:24 PM

Yes, the User can exist in various companies over time (but not at the same time). Not sure if that will invalidate your approach? Unfortunately I still feel quite “green” with the MAQL things, so I have to first try it out to be able to evaluate what will happen. 😄

Allan Talver

09/24/2021, 5:25 PM

But I am currently creating that metric in GoodData, so I will test it anyway.

Jakub Sterba

09/24/2021, 5:59 PM

In that case it may be the way to count users by company and then sum it up

Copy code

select sum(select FTE_defined_above by Company)

User would be not counted as FTE if the user worked few days for one company and rest of the month for another even if he worked in all the days.

Allan Talver

09/24/2021, 7:45 PM

Thanks for your help with this @Jakub Sterba! I have to stop for today, but I definitely got some ideas that I can continue working with on Monday (if not sooner). But a general question, how can one build up their knowledge about more advanced uses of MAQL? I have gone through GoodData documentation as well as the University courses, but I still feel like something is missing and on my own I often get stuck, at some points more or less just trying to combine different statements together to see if anything will take me closer to what I need. I am sure that this is not the most efficient way to build up the experience. Is there another course or other resources that could help me in this? 🙂

Allan Talver

09/30/2021, 1:27 PM

Hi, @Jakub Sterba and others! I am still stuck with this one. Perhaps someone has some suggestions. When trying out the metric suggested in the thread, I ran into a problem. Namely, we have companies popping up regularly (often 3rd party contractors). When a new company with their first employees appears in the middle of a filtered period, the FTE calculation results in an incorrect number. The reason being that lifting with “Records of Jira” only takes into account the company workdays for which there was an employee in the company, not the full period. Say, for example, I select one month period for my Worked Date filter. That month has 20 working days according to the local regulations of the country where the company is located. Those 20 working days are reflected in the Working Days dataset. A new person joined in the middle of the month and only works for the last 10 days of the month. The result of FTE for that person still is 1 because they were available for 10 days (in Jira dataset) and it gets divided by the matching 10 days that are found from the Working Days dataset. I am guessing that the MAX function isn’t helping if the person happens to be the first one in that company. Is there a way how to have all the days covered by the filtered period taken into scope from the Working Days dataset, regardless if there was anyone working on some of these days or not?

Jakub Sterba

09/30/2021, 1:34 PM

Hi did you try this?

Copy code

select count(Username) where 
(select
  (select sum(Availability - Unpaid (Absence)/8 ))
  /
  (select max(select (select sum(Workday) by Company) by Records of Jira))
by Username)=1

Jakub Sterba

09/30/2021, 1:36 PM

the submetric:

Copy code

select (select sum(Workday) by Company) by Records of Jira))

shall lift number 20 from your example to all the records of the Jira dataset of the company regardless if a person worked 20 days or 10 days Metric

Copy code

select max(select (select sum(Workday) by Company) by Records of Jira))

should display value 20 for each Username from the company if you slice it by username

Jan Soubusta

10/01/2021, 6:57 AM

Just out of my curiosity - would not it be more efficient to extend the model by a dataset, which would contain number of working days (based on the regulations) per country? This dataset would be connected by country attribute with the rest of the model. You would not have to calculate number of working days from Jira issues dynamically. Btw. this dynamic calculation seems also to be a potential performance issue, imho.

Allan Talver

10/01/2021, 1:34 PM

Hi @Jakub Sterba I checked this once again and actually you were correct, it does seem to work. I had a problem with underlying data which led me to believe that the metric is showing an incorrect result for some employees. But actually the data was broken. However, I have two follow up problems/questions. Currently the metric looks like this:

Allan Talver

10/01/2021, 1:35 PM

I didn’t do the counting part because we can have fractional employees who should also be included in the aggregated number.

Allan Talver

10/01/2021, 1:46 PM

It appears to work like this. I have 3 persons selected in 1 one company and 1 in another (they have just been selected in the Username filter). Worked Date filter is set for This Year. The first employee started in mid august and 0.34 makes sense. 3rd one started in the beginning of March and 0.8 also makes sense. However, the 4th person worked in another Company until they moved to the new one (which is being show on the screenshot as 0.27). For some reason the previous employment for that person isn’t shown. In total, that person has worked for 5 years, including all of the ongoing year. And also, the person on the second line has worked for full year without interruptions. So I would expect to see 1.0, instead of 0.95. But I will have a look into the underlying data to see if I can make sense of this difference.

Allan Talver

10/01/2021, 1:54 PM

And the other question is about silicing. I would like to be able to slice that metric by smaller time periods, for instance quarters, months or even weeks. If I slice it, then I get the result shown in the screenshot. I would like the metric to result in 1.0 if a person has worked all days of the month. I am guessing that I would have to somehow use averages? Or is it because the divisor of my calculation is not sliced by month for some reason, and therefore the number of workdays in a month (for example 20) gets divided by number of workdays in the full year (for instance 250)?

Allan Talver

10/01/2021, 2:01 PM

@Jan Soubusta, actually the dataset “Working Days” is supposed to do almost that. In that dataset I have the number of working days for each Company. I am aware that for efficiency I could actually have another level in between and have that data per country (as we can have more than one company in a country). When I initially began working on that metric, I posted a Community post: https://community.gooddata.com/metrics-and-maql-57/struggling-with-maql-aggregation-and-filtering-224 Back then I was trying to do the calculation by calculating the number of Workdays per company by finding the max number of workdays that any single user had worked per company. But as you suggested, it’s not really the best way to do it dynamically and it also caused the metric to result in a distorted value if there were months where no one had worked all the days (for example a new company appearing in the middle of a selected time period).

Allan Talver

10/01/2021, 2:10 PM

Ok, out of the problems/questions that I posted above, I think I have now figured out two issues. The 0.95 showing for a person who worked for full year is due to an issue with underlying data. Not about the metric. And also, the issue with the person who switched company mid year is very likely due to problem with data as well (working days not imported for the company where the person worked before the switch). So it looks like the only thing missing at this point is how to make the metric sliceable by shorter time periods. 🙂

Jakub Sterba

10/02/2021, 1:09 AM

I think approach with lifting is not so flexible if you need to slice the metric by different date attributes, but I think I have found a better solution. I suggest to use following metrics: Sum of Availability:

Copy code

select sum(Availability)

Work days:

Copy code

select sum(Workday) by all Username

FTE:

Copy code

select Sum of Availability / Work days

The trick with “by all Username” will compute the sum(Workday) in context of insight (Company, Month/Year, Username) without the Username (Company,Month/Year) and result will be repeated for each value of Username. Without the “by all Username” the sum(Workday) cannot be broken down by Username. The value will be available in insight only for the months where exists relation between Username, Month, and Company. See example with metrics below. Company A has working days Mon-Fri and Company B Sun-Sat.

Jakub Sterba

10/02/2021, 1:10 AM

Screenshot 2021-10-02 at 02.56.27.png

Jakub Sterba

10/02/2021, 1:23 AM

You may need one more metric to find for example the highest FTE of user if you want to display the ratio without breakdown per company and some users work for multiple companies. FTE2

Copy code

select max(select FTE by Company)

Jakub Sterba

10/02/2021, 7:32 AM

Finally number of FTEs:

Copy code

select count(Username) where (select FTE2 by Username)=1

Jakub Sterba

10/02/2021, 7:33 AM

Screenshot 2021-10-02 at 09.33.02.png,Screenshot 2021-10-02 at 09.33.16.png

Jakub Sterba

10/02/2021, 7:41 AM

Or as one metric:

Copy code

select count(Username)
where (select max(select sum(Availability)/(select sum(Workday) by all Username) by Company) by Username)=1

Jakub Sterba

10/02/2021, 7:53 AM

You can slice the metric by any combination of attributes Company, Username, Year, Quarter/Year, Month/Year, Week/Year, Day of Week, or any other date attribute.

Jakub Sterba

10/02/2021, 8:01 AM

or even compare it to previous periods

Jakub Sterba

10/02/2021, 8:02 AM

Screenshot 2021-10-02 at 10.00.26.png,Screenshot 2021-10-02 at 10.01.21.png

Allan Talver

10/04/2021, 8:53 AM

Oh, wow! Thanks @Jakub Sterba! I’m recreating and testing it in our environment now. But I have one question about one pattern I see in your metrics. Could you explain why you decided to use the count function? Is the intention to only count the employees who have their FTE number equal to 1 (and ignore others, like part timers, etc.)? In my case I would have to include all part timers as well and the end result should essentially be a number of “full time equivalents”. For instance, a company has only two employees who only work on half of the days, giving 0.5 load for both of them, should be added, resulting in 1 FTE for that company. So I believe that in my case I should be using SUM, instead of COUNT.

Allan Talver

10/04/2021, 9:44 AM

In general, it looks good and slicing by company, username, date is ok. However, I have a follow-up issue here. Perhaps I should have mentioned beforehand that Jira dataset has some additional attributes that I would like to use for slicing or filtering (for instance employee status: “employed”, “external contractor”, etc., also the team membership of the employee on each day). At the moment, these attributes are not available if I use the new metric in Analytical Designer.

Allan Talver

10/04/2021, 9:46 AM

These attributes are there to describe the “Availability” fact. Working Days would still remain the same for the company.

Jakub Sterba

10/04/2021, 9:48 AM

try to list them in the metric definition:

Copy code

select sum(Workday) by all Username,Employee Status, ...

Allan Talver

10/04/2021, 9:54 AM

Hmm, then I get the “Sorry, we can’t display this insight” message

Jakub Sterba

10/04/2021, 9:56 AM

sorry, my bad

Jakub Sterba

10/04/2021, 9:56 AM

Copy code

select sum(Workday) by all Username,all Employee Status, ...

Jakub Sterba

10/04/2021, 9:56 AM

you have to repeat “all” 🙂

Allan Talver

10/04/2021, 9:56 AM

Oh, right. Let me try that 🙂

Allan Talver

10/04/2021, 9:58 AM

And there is no shorthand equivalent to something like “take them all”? 😄 BY ALL Records of Jira probably has some different meaning? Tried that, but then got weird results from the metric.

Jakub Sterba

10/04/2021, 9:59 AM

there is by all in all other dimensions

Jakub Sterba

10/04/2021, 9:59 AM

but you need to slice it by attributes like Company, Year, Month/Year …

Jakub Sterba

10/04/2021, 10:00 AM

in theory you can do by all in all other dimensions except Company, Year, Month/Year …

Jakub Sterba

10/04/2021, 10:00 AM

you may try by all Records of Jira

Jakub Sterba

10/04/2021, 10:00 AM

I am not sure about outcome

Allan Talver

10/04/2021, 10:00 AM

It does appear to work now when I listed the attributes one by one. I’ll do some checks to verify if the numbers make sense.

Jakub Sterba

10/04/2021, 10:02 AM

I think Records of Jira won’t work because it will ignore also slicing by Company and date attributes

Jakub Sterba

10/04/2021, 10:03 AM

It is good to hear it is working now

Jakub Sterba

10/04/2021, 10:36 AM

But I have one question about one pattern I see in your metrics. Could you explain why you decided to use the count function?

I probably misunderstood requirement in the initial question:

What I am trying to do is to build a metric which eventually can be aggregated by company or even the whole group of companies, showing the number of full time employees (FTE) in a selected time period

So I interpreted it that you need to count number of employees who work full time.

Allan Talver

10/04/2021, 10:37 AM

Aha, ok. Thanks! Probably I wasn’t clear enough in my explanation. The intention is to get a sum of the contribution of all employees (even if they are fractional)

Open in Slack

Previous Next