Hi! I have a question regarding the data load proc...
# gooddata-platform
w
Hi! I have a question regarding the data load process in GoodData. I have a fully LCM-managed setup where data is pulled into GoodData nightly through an Automated Data Distribution (ADD) schedule. The data is pulled from Google BigQuery, where we have a schema set up with the views that GoodData is expecting. The views have x__client_id in them so that data can be partitioned into the various customer workspaces by that ID. My situation is that I also want to load the same data into a "global" all-customer model so that I can do overall analysis of customer trends, etc. So I have a separate workspace dedicated to this, I have loaded the same logical model JSON into the global workspace, and I have removed the x__client_id references from the global model. I also set up a separate data load for this workspace. I want to load this global workspace using the same views used for the customer-specific workspaces. I assumed that since the global model doesn't contain references to x__client_id, the loader would just ignore the column... but it doesn't. Is there a way to load my global workspace using the same views as the customer workspaces? Is there a way to tell the GoodData loader to ignore the x__client_id values in the data?
f
Hi Willie, if I understand this correctly, you would like to load the data from your BigQuery source into a workspace without separating the data by Client ID, correct? The article Use Automated Data Distribution explains how x__client_id works, in the Client ID section. Here is the relevant bit:
• If the x__client_id column is not present in the output stage table, no filtering is done and all data is loaded to the corresponding dataset.
• If the x__client_id column is present in the output stage table and no Client ID is specified, ADD fails when the data is being loaded to the corresponding dataset.
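To make the two rules concrete, here is a small Python model of them. It is only a sketch of the behaviour quoted above, not GoodData's loader code: the function name rows_to_load and the dictionary row format are made up for illustration, and the third branch (filtering rows to the matching Client ID) is an assumption based on the partitioning described earlier in the thread.

```python
def rows_to_load(columns, rows, client_id=None):
    """Hypothetical model of the quoted ADD rules (not GoodData's implementation)."""
    # Rule 1: no x__client_id column means no filtering, everything is loaded.
    if "x__client_id" not in columns:
        return list(rows)
    # Rule 2: the column is present but the workspace has no Client ID, so the load fails.
    if client_id is None:
        raise ValueError("x__client_id is present but no Client ID was specified")
    # Assumed per-client case: keep only the rows that match the workspace's Client ID.
    return [row for row in rows if row["x__client_id"] == client_id]


# Made-up sample rows for illustration.
rows = [
    {"x__client_id": "acme", "cp__tid": 1},
    {"x__client_id": "globex", "cp__tid": 2},
]
acme_only = rows_to_load(["x__client_id", "cp__tid"], rows, client_id="acme")
```

Under this model, the global workspace in the question hits the second rule: the views expose x__client_id, but the workspace has no Client ID to match against.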
Did your ADD run fail or run successfully? Did you try removing the client ID columns from the views for this workspace?
w
The ADD failed because x__client_id is present. But I don't want to use it for this particular data load.
And I don't want to create a completely duplicate set of output stage tables that are exactly the same but with x__client_id removed.
m
I am afraid that it is not possible to disable the client ID for a particular workspace. I think there are two possible options here:
• For the "all clients" workspace, use another set of views that do not have x__client_id but point to the same physical tables. That probably requires a different data source for it, keeping that workspace outside the segment, and therefore also a different loading process.
• Modify your existing views so that they return each row twice: once with the standard value of x__client_id and once with a special value that you use as the client ID for the all-clients workspace. Something like:
CREATE OR REPLACE VIEW out_transactions AS
SELECT t.client as x__client_id, t.transaction_id as cp__tid, ...
FROM transactions t
UNION ALL
SELECT 'ALL_CLIENTS' as x__client_id, t.transaction_id as cp__tid, ...
FROM transactions t
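As a quick way to see what the doubled view returns, the sketch below runs the same UNION ALL pattern against an in-memory SQLite database from Python. The sample rows, the two-column table layout, and the helper rows_for are assumptions for illustration only; SQLite has no CREATE OR REPLACE VIEW, so plain CREATE VIEW is used, and BigQuery syntax may differ in other details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (client TEXT, transaction_id INTEGER);
INSERT INTO transactions VALUES ('acme', 1), ('acme', 2), ('globex', 3);

-- Each source row appears twice: once under its own client, once under ALL_CLIENTS.
CREATE VIEW out_transactions AS
SELECT t.client AS x__client_id, t.transaction_id AS cp__tid
FROM transactions t
UNION ALL
SELECT 'ALL_CLIENTS' AS x__client_id, t.transaction_id AS cp__tid
FROM transactions t;
""")


def rows_for(client_id):
    """Return the transaction ids visible for one client ID value (hypothetical helper)."""
    cur = conn.execute(
        "SELECT cp__tid FROM out_transactions WHERE x__client_id = ? ORDER BY cp__tid",
        (client_id,),
    )
    return [r[0] for r in cur.fetchall()]


acme_rows = rows_for("acme")        # rows for a single customer workspace
all_rows = rows_for("ALL_CLIENTS")  # rows for the all-clients workspace
total_rows = conn.execute("SELECT COUNT(*) FROM out_transactions").fetchone()[0]
```

The trade-off is that the view returns twice as many rows as the underlying table, so every load reads double the data in exchange for keeping a single set of views.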
w
Thanks for the answers, guys. I am going to just replicate all my views.