Michael DeWulf
11/09/2023, 10:49 PM
My dataset has records of the form { start: Date, end: Date, ... }. I'd like to be able to return records from this dataset where either start or end is in the selected interval of some Date dataset. Is this possible? If so, how?

Steve Fox
11/09/2023, 10:54 PM

Steve Fox
11/09/2023, 10:54 PM

Steve Fox
11/09/2023, 10:58 PM

Michael DeWulf
11/09/2023, 11:09 PM

Steve Fox
11/09/2023, 11:15 PM

Steve Fox
11/09/2023, 11:16 PM

Steve Fox
11/09/2023, 11:16 PM

Michael DeWulf
11/09/2023, 11:17 PM

Michael DeWulf
11/09/2023, 11:17 PM

Steve Fox
11/09/2023, 11:18 PM

Steve Fox
11/09/2023, 11:18 PM

Steve Fox
11/09/2023, 11:18 PM

Michael DeWulf
11/09/2023, 11:19 PM

Michael DeWulf
11/09/2023, 11:20 PM

Steve Fox
11/09/2023, 11:21 PM

Michael DeWulf
11/09/2023, 11:23 PM

Steve Fox
11/09/2023, 11:24 PM

Michael DeWulf
11/09/2023, 11:25 PM

Michael DeWulf
11/09/2023, 11:28 PM

Steve Fox
11/09/2023, 11:29 PM

Jakub Sterba
11/10/2023, 12:37 AM
select "id","start_date" as "date",'start' as role from interval_dataset
union all
select "id","end_date" as "date",'end' as role from interval_dataset
A filter on the "Date" dataset will then effectively be applied as an OR filter over the start and end dates.

Jakub Sterba
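The effect of that UNION ALL can be restated outside SQL. A minimal Python sketch (made-up records, not GoodData syntax): unpivoting the two date columns into a single date column means an ordinary filter on that one column matches a record if either of its dates falls in the selected range.

```python
from datetime import date

# One row per interval record, as in interval_dataset (illustrative data).
records = [
    {"id": 1, "start_date": date(2023, 1, 5), "end_date": date(2023, 2, 10)},
    {"id": 2, "start_date": date(2023, 3, 1), "end_date": date(2023, 3, 20)},
]

# The UNION ALL: each record contributes two rows, one per date role.
unpivoted = [
    {"id": r["id"], "date": r[col], "role": role}
    for r in records
    for col, role in (("start_date", "start"), ("end_date", "end"))
]

# A plain filter on the single "date" column now matches a record
# if EITHER its start or its end falls in the selected interval.
lo, hi = date(2023, 2, 1), date(2023, 3, 5)
matching_ids = {row["id"] for row in unpivoted if lo <= row["date"] <= hi}
print(sorted(matching_ids))  # [1, 2]: record 1 matches via end, record 2 via start
```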
11/10/2023, 1:14 AM
select "disconnected_date"::date from generate_series('2023-01-01'::date,'2023-12-31'::date,'1 day'::interval) as "disconnected_date"
This date dimension should have no connection to any dataset in the model; it is used only to determine the minimum and maximum dates matching the filter selection. The BY ALL OTHER keyword lets you use the result as a "constant" in expressions over disconnected datasets.
You can define a filter using this disconnected date dimension in the insight first, and then apply the following metric:
select count({dataset/interval_dataset})
where {label/start_date.day}
between
(select min({label/disconnected_date.day}, {dataset/materialized_date_dim}) by all other)
and
(select max({label/disconnected_date.day}, {dataset/materialized_date_dim}) by all other)
or {label/end_date.day}
between
(select min({label/disconnected_date.day}, {dataset/materialized_date_dim}) by all other)
and
(select max({label/disconnected_date.day}, {dataset/materialized_date_dim}) by all other)
This metric will return the count of rows that match the filter condition. It can be used in a breakdown of interval_dataset to filter only the matching records.

Productboard
11/10/2023, 1:17 AM

Jakub Sterba
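The same logic can be sketched in plain Python (made-up data and names; the min/max over the filtered disconnected dimension play the role of the BY ALL OTHER "constants"):

```python
from datetime import date, timedelta

# Disconnected date dimension: every day of 2023, connected to nothing.
start_of_year = date(2023, 1, 1)
disconnected = [start_of_year + timedelta(days=i) for i in range(365)]

# The insight's date filter selects a slice of the disconnected dimension.
selected = [d for d in disconnected if date(2023, 2, 1) <= d <= date(2023, 3, 5)]

# BY ALL OTHER in the metric makes these min/max behave as constants.
lo, hi = min(selected), max(selected)

intervals = [
    {"id": 1, "start_date": date(2023, 1, 5), "end_date": date(2023, 2, 10)},
    {"id": 2, "start_date": date(2023, 4, 1), "end_date": date(2023, 5, 1)},
]

# count(...) where start BETWEEN lo AND hi OR end BETWEEN lo AND hi
count = sum(
    1 for r in intervals
    if lo <= r["start_date"] <= hi or lo <= r["end_date"] <= hi
)
print(count)  # 1: only record 1 overlaps the selection (via its end date)
```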
11/10/2023, 1:45 AM

Michael DeWulf
11/10/2023, 2:05 AM

Michael DeWulf
11/10/2023, 2:11 AM

Michael DeWulf
11/10/2023, 2:12 AM

Michael DeWulf
11/10/2023, 2:12 AM

Jakub Sterba
11/10/2023, 7:47 AM

John Munson
11/13/2023, 10:54 PM

Productboard
11/14/2023, 8:26 AM

John Munson
11/14/2023, 4:25 PM

campaign_name | start | end  | ad_spend
new shoes     | 1/1   | 3/31 | $1,000
This doesn’t really work with how GoodData wants you to model the data. GoodData wants either discrete standalone data points, like the purchase data above, or continuous fluid lines that are comprised of many data points.
The campaign data model above is kind of like how you would model the purchase data. Each campaign is just a single row in a table, meaning each campaign is just a single data point. So even though you have a start and end date, you have to choose one or the other to represent the temporal space that the campaign occupies… not ideal.
So, what are the alternative options?
First, you could recognize that if you start zooming out far enough, lines start to look like individual points. That’s right, we’re talking about grain and granularity. Perhaps we know that our marketing campaigns are sometimes short, maybe just a few days long, but that they always fall within a business quarter. They don’t run longer than a business quarter and they don’t span multiple business quarters.
With this in mind, we know that the highest “resolution” that we can look at our campaign data and have it still appear to look like dots instead of lines, is “business quarter”…
campaign_name | business_quarter | ad_spend
new shoes     | Q1               | $1,000
But there’s also a second option. Instead of making our campaigns look like dots, we could make them look like continuous fluid lines. For each campaign, maybe you have the data collected at a weekly level. Or maybe you don’t, in which case you can simply “atomize” your data by dividing it smoothly. The key with this approach, though, is to understand that the concept of grain/granularity is still important. Since we’ve modeled the data at a weekly level, that is the highest resolution at which we can look at our campaign before we start to see gaps and holes in the line, which is a very similar problem to when we modeled our campaigns with a start and end date: two definitive points in time with one big hole in the middle.
campaign_name | week | ad_spend
new shoes     | 1    | $85
new shoes     | 2    | $85
new shoes     | 3    | $85
...etc
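The "atomize" step described above, dividing a campaign's total evenly across the weeks it spans, can be sketched in Python (the function name and the even split are illustrative assumptions, not a GoodData feature):

```python
def atomize(campaign_name, total_spend, first_week, last_week):
    """Split one campaign row into one row per week, dividing spend evenly."""
    weeks = range(first_week, last_week + 1)
    per_week = round(total_spend / len(weeks), 2)
    return [
        {"campaign_name": campaign_name, "week": w, "ad_spend": per_week}
        for w in weeks
    ]

# "new shoes" ran roughly one quarter: $1,000 over 12 weeks ~ $83.33/week.
rows = atomize("new shoes", 1000, 1, 12)
print(len(rows), rows[0]["ad_spend"])  # 12 83.33
```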
With this approach, our campaigns start at the first data point, end at the last data point, and you answer questions about them by aggregating the facts from every data point that belongs to them. In addition, modeling the data this way allows you to answer questions about partial date ranges, e.g. “How much ad spend, across all campaigns, was there just in week 3?”

John Munson
11/14/2023, 4:42 PM

John Munson
11/14/2023, 5:03 PM

Ivana Gasparekova
11/16/2023, 10:46 AM