# gooddata-platform
j
Hi there. I have a question on how to implement ADDv2. Based on the LDM, which datasets require the x__client_id? Most of the metrics and data used in the insights are based on the data in the Answer, and then filtered or aggregated by the other datasets.
b
Hi Josefin, the x__client_id is required when you are incrementally loading data to multiple workspaces with the same model (it is done per dataset). It is used to decide which row goes to which workspace. So if you have a dataset where you will only load data once (i.e. some mapping or geography data), in theory it doesn't have to contain the x__client_id column, but any change will require some manual work (it might not be possible to do for all client workspaces in batch).
If you won't be using LCM and/or you will have just a small number of workspaces with different data models, the x__client_id does not have to be used at all (the data loads will have to be set up separately for each workspace).
j
Thanks. Just to clarify: we will indeed be using LCM and I wish to load incrementally and frequently. So yes, I will be using x__client_id. My question is specifically whether x__client_id needs to be in EVERY table that will be loaded or (as seems most logical) whether it can just be added to the Customer table, so that the whole dataset is then filtered by it due to the LDM setup.
Also, I am unable to load data into the master workspace, and get the error message "The Output Stage has the x__client_id column, but no Client Identifier was provided for the current workspace." Do the setup and data load have to be different for the master workspace than for the clients, and if so, how would you recommend I set that up? And how will that impact the LCM?
b
Hi Josefin, yes. If the table has x__client_id, only the data for the particular client will be loaded; if it does not, all data from the table will be loaded to the client workspaces. So you can indeed have the x__client_id only in the customer table and load all (new) values from the other tables regardless of client.
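To make that rule concrete, here is a toy Python model of it. This is only a sketch of the behaviour described above, not how the platform implements it; the function name, the workspace names and the sample rows are all made up for illustration.

```python
from collections import defaultdict


def distribute_rows(rows, client_workspaces):
    """Toy model of per-table row distribution (hypothetical helper).

    rows: list of dicts read from one Output Stage table.
    client_workspaces: dict mapping client id -> workspace name.
    """
    loaded = defaultdict(list)
    # The rule is applied per table: does this table carry x__client_id at all?
    has_client_column = any("x__client_id" in row for row in rows)
    for row in rows:
        if has_client_column:
            # Only the workspace whose client id matches receives the row.
            workspace = client_workspaces.get(row.get("x__client_id"))
            if workspace is not None:
                loaded[workspace].append(row)
        else:
            # No x__client_id column: every client workspace receives every row.
            for workspace in client_workspaces.values():
                loaded[workspace].append(row)
    return dict(loaded)


# Hypothetical sample data: only the customer table carries x__client_id.
workspaces = {"acme": "ws_acme", "globex": "ws_globex"}
customers = [
    {"customer_id": 1, "x__client_id": "acme"},
    {"customer_id": 2, "x__client_id": "globex"},
]
answers = [
    {"answer_id": 10, "customer_id": 1},
    {"answer_id": 11, "customer_id": 2},
]

customer_load = distribute_rows(customers, workspaces)
answer_load = distribute_rows(answers, workspaces)
```

In this model each workspace gets only its own customer row, while both answer rows land in every client workspace, because the answers table has no x__client_id column.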
j
Thanks, yes. Specifically now I am trying to work around (or understand) the following "The Output Stage has the x__client_id column, but no Client Identifier was provided for the current workspace.". What am I missing?
b
As for your second question: you will have to set up a separate data load for your master workspace, from a different table (or from the same table with some dummy client id), i.e. there will be an x__client_id column with the value master to mark the data you want to load to the master workspace. Basically, if the table contains the x__client_id column, you won't be allowed to trigger a data load without specifying it.
The other option is to have a separate set of tables and a separate data loading process for your master workspace. Since it loads from different tables, they don't have to contain x__client_id and you can load all the data.
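A small Python sketch of the two options, again only as a simplified model of the behaviour described here and not the platform's code. The exception class, the function and the sample rows are hypothetical; the error text is the one quoted earlier in this thread.

```python
class MissingClientIdError(Exception):
    """Hypothetical error type used only in this sketch."""


def rows_for_workspace(rows, client_id=None):
    """Toy model: pick the rows one workspace would receive from one table."""
    has_client_column = any("x__client_id" in row for row in rows)
    if has_client_column and client_id is None:
        # Mirrors the situation behind the error message quoted in the thread.
        raise MissingClientIdError(
            "The Output Stage has the x__client_id column, but no Client "
            "Identifier was provided for the current workspace."
        )
    if not has_client_column:
        # Separate tables without x__client_id: everything is loaded.
        return list(rows)
    return [row for row in rows if row.get("x__client_id") == client_id]


# Option 1: same table, rows for the master marked with a dummy client id.
shared_table = [
    {"product_id": 1, "x__client_id": "acme"},
    {"product_id": 1, "x__client_id": "master"},
]
master_rows = rows_for_workspace(shared_table, client_id="master")

# Loading from that table without a client id fails in this model.
try:
    rows_for_workspace(shared_table)
    error_seen = False
except MissingClientIdError:
    error_seen = True

# Option 2: a separate table without x__client_id loads in full.
master_only_table = [{"product_id": 1}, {"product_id": 2}]
all_rows = rows_for_workspace(master_only_table)
```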
j
How will that impact the LCM and the process with the release and rollout bricks if the data loads/tables are different for the master and the clients?
b
The LCM process is separate from data loads with one exception, which is custom mapping. So if you will be using custom mapping, it will be transferred from master to clients, therefore it should have the same values (but the tables can be in different schemas; the schema is not part of a custom mapping, you specify it on the data source level).
m
Hi @Josefin Gruvander - let me try to clarify how we understand “master” in the GoodData Platform in the context of LCM.
• The “Master workspace” in this context usually means a physical workspace, but it is used more like a template - it usually does not have any data in it and no users are accessing it (except maybe the admin to check something).
• When using the LCM bricks, a new Master workspace is also created with every new release. The original one also serves as a “backup” of the structure and dashboards before the release.
• So “LCM Master Workspace” is NOT “an overall workspace with all the data”, but rather a data-less template for all the client workspaces.
If you still need the “overall workspace with all the data” apart from the client workspaces, I see two options for how you could do it:
• Option one - make it one of the clients
◦ if you want to have exactly the same structure of the “overall” workspace as the client workspaces and keep it in sync etc., you can treat this “overall” workspace as a special client
◦ you assign it some client_id which is not used anywhere in your system (i.e. ALL_CLIENTS)
◦ you either physically duplicate the data in your system or, better, use views to show the same data both for each particular client_id and for ALL_CLIENTS
▪︎ something like:
CREATE VIEW out_gd_product AS
SELECT product_id, product_name, client_id as x__client_id FROM gd_product
UNION ALL 
SELECT product_id, product_name, 'ALL_CLIENTS' as x__client_id FROM gd_product
◦ put the workspace normally into your segment - you will still be using the x__client_id so everything remains the same both for clients and the “overall” workspace.
• Alternative option - separate data source
◦ in this case you do not have the “overall” workspace as one of the clients but separately, next to the LCM segment
◦ you create views on top of your original tables; those views have a different prefix so you can use another prefix in another data source. These views do not have the x__client_id in them (to overcome the limitation you mentioned above)
◦ you use this data source for your “overall workspace” and schedule the load in it (not in the service workspace in this case, because now the “overall” workspace is not part of the segment and has a slightly different configuration)
◦ this method allows you to have a somewhat different structure for the “overall” workspace but requires manual synchronization of the changes.
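Both options can be tried out quickly with a throwaway database. The sketch below uses Python's built-in sqlite3 purely as a stand-in for your real warehouse, so the syntax may need adjusting there. The gd_product table and the out_gd_product view come from the snippet above; the sample rows, the out_all_ prefix and the decision to keep client_id as an ordinary column in the second view are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gd_product (product_id INTEGER, product_name TEXT, client_id TEXT);
INSERT INTO gd_product VALUES (1, 'Widget', 'acme'), (2, 'Gadget', 'globex');

-- Option one: every row appears once for its own client and once for ALL_CLIENTS.
CREATE VIEW out_gd_product AS
SELECT product_id, product_name, client_id AS x__client_id FROM gd_product
UNION ALL
SELECT product_id, product_name, 'ALL_CLIENTS' AS x__client_id FROM gd_product;

-- Alternative option: a view under a different (hypothetical) prefix
-- that has no x__client_id column at all.
CREATE VIEW out_all_gd_product AS
SELECT product_id, product_name, client_id FROM gd_product;
""")

# Rows a single client would see through the option-one view.
acme_rows = conn.execute(
    "SELECT product_id FROM out_gd_product "
    "WHERE x__client_id = 'acme' ORDER BY product_id"
).fetchall()

# Rows the special ALL_CLIENTS client would see through the same view.
overall_rows = conn.execute(
    "SELECT product_id FROM out_gd_product "
    "WHERE x__client_id = 'ALL_CLIENTS' ORDER BY product_id"
).fetchall()

# Column names of the alternative view, to confirm x__client_id is absent.
cursor = conn.execute("SELECT * FROM out_all_gd_product")
alt_columns = [col[0] for col in cursor.description]
```

With the two sample rows, the acme client sees one product through out_gd_product, the ALL_CLIENTS client sees both, and the second view exposes no x__client_id column.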
BTW let me add a few notes about using the “overall workspace” based on my experience from many different GoodData implementations. While it might be tempting to do it this way as a “copy of the client workspace just with all data”, it is good to think about these questions:
• who will be using this and for what purposes? I’ve seen two main classes of use cases:
◦ Internal people working with the clients (i.e. Account Managers, Support) want to be able to see what the clients are seeing in their workspaces
▪︎ for this particular case it might be better to just give internal people access to the client workspaces - they can easily switch between them, and there will be no issues if an insight is custom and only present in the client workspace, or with whether the data synchronization in both workspaces has already finished or not.
◦ Internal people who care about data across different clients (i.e. product managers, C-level etc.)
▪︎ here the question is whether the reporting use cases (and appropriate visualizations and metrics) are the same as for a single-client workspace
Don’t you actually rather need an “overall” workspace with a slightly different data model and dashboards?
• i.e. will you need a “client” filter or column in the reports and dashboards, which is not present in the client workspaces?
• don’t you want different types of reports to compare across various clients than for a single specific client? (i.e. averages per client)
• do you need the same level of detail? With data of many, many clients in a single workspace, the computation and load performance might be affected as well as the price.
From my experience, quite often a combination of access to the client workspaces with a separate overall workspace with more aggregated data, and visualizations and dashboards tweaked to the needs of the users of this combined workspace, works very well.
👀 1
j
Hi @Michal Hauzírek, thanks for your reply! So a few follow-up questions on how to work with “masters” and the LCM. Mainly I am trying to understand what the setup and continuous development process would look like when implementing the LCM.
Currently, I have set up a development master workspace, which is the workspace I reference as “development_pid” in my release brick. When developing the dashboards and insights, I want to have access to all the data so I can verify that it looks the way I want. Given your answer on how the master workspace is a data-less template, does the same apply to the development workspace? If it shouldn’t contain any data, how would I then create the metrics, insights and dashboards?
On the topic of data access, I also want the clients to be able to benchmark their own results against the total average of all the clients. Any recommendations on how that can be achieved and implemented? Thanks again, in advance!
b
Hi Josefin, you are right, it makes sense to have some data in the development workspace in order to see what the dashboards/insights will look like. We would recommend some dummy data that is sufficient for this purpose, not data for all clients, as it would be necessary to do regular loads and the size might be substantial. Otherwise, the options mentioned by me and @Michal Hauzírek also apply to the development workspace.
👍 1
m
I see, @Josefin Gruvander let me clarify and distinguish the “development (master) workspace” and the “(segment) master workspace”, as these might be a bit confusing.
“development (master) workspace”
• the one where you are developing your dashboards
• it is not part of the LCM segment and does not have a client_id assigned from the LCM
• it needs to have some data loaded - preferably data for a single client
◦ typically you would load them from the particular workspace (deploy ADDv2 to that workspace and define the client_id in the load configuration)
◦ if you want, you can switch the client_id and reload data for another client if needed during the development
“(segment) master workspace”
• the one which is a data-less template
• is created automatically from the “development (master) workspace”
• does not have any client_id assigned
If the data for different clients differ a lot (i.e. one has views but does not have transactions and vice versa, while your model contains both), you can set up the LCM brick in a way that you also have a development segment where you create clients, load them with data, and check how different clients' data behave.
I hope this clarifies it a bit.
j
Hi @Michal Hauzírek. Thanks for the answers and support during yesterday's office hours session. You showed an illustration of the LCM architecture that was very good, could you please share that? Thanks!
🙏 1
m
Sure, here it is.
🙌 1
j
Perfect, thank you! 🙏🏻