F.A.Q.

ADD - Distributing Data to Multiple Workspaces

  • 26 January 2021

Introduction

Automated Data Distribution (ADD) is the primary tool used to load data from your data warehouse or data storage into the GoodData platform. One of its key features is quickly distributing data from a single data source across multiple GoodData workspaces. This is especially useful when you’re building customer-facing analytics, have a separate workspace for each of your customers, and want each workspace to contain only that customer's data.

 

Background

For the distribution to work, the workspaces need to be organized in a certain way. You can organize them manually using GoodData’s APIs and a lightweight UI called grey pages (available on any pricing tier), or use GoodData’s automated LifeCycle Management (available on higher pricing tiers).

The picture below shows the required structure.

A data product has to be defined and has to contain at least one segment. Each segment has one master workspace, which serves as a template for client workspaces. The master workspace is needed for automated workspace provisioning and change management when using LifeCycle Management (LCM); it matters little for data distribution itself, but it still needs to be defined. A segment can contain multiple so-called client workspaces with identical logical data models. These are the workspaces the data will be distributed to. Each client workspace has a unique Client ID associated with it as part of the setup. This Client ID determines which data is loaded to the client workspace. The data in an individual client workspace is specific to that client and does not contain any other client’s data.
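
The structure described above can be sketched as a small data model. This is only an illustration in Python, not GoodData's API: the class names, field names, and the validate() helper are hypothetical, and treating Client IDs as unique across the whole data product is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClientWorkspace:
    client_id: str     # unique Client ID that drives which data is loaded
    workspace_id: str  # hypothetical workspace identifier


@dataclass
class Segment:
    segment_id: str
    master_workspace_id: str  # template workspace; must be defined
    clients: List[ClientWorkspace] = field(default_factory=list)


@dataclass
class DataProduct:
    data_product_id: str
    segments: List[Segment] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of structural problems (empty if none were found)."""
        problems = []
        if not self.segments:
            problems.append("data product has no segment")
        seen_client_ids = set()
        for segment in self.segments:
            if not segment.master_workspace_id:
                problems.append(
                    f"segment {segment.segment_id} has no master workspace"
                )
            for client in segment.clients:
                # Assumption: Client IDs are unique across the data product.
                if client.client_id in seen_client_ids:
                    problems.append(f"duplicate Client ID {client.client_id}")
                seen_client_ids.add(client.client_id)
        return problems
```

For example, a data product with one segment, a master workspace, and two client workspaces with distinct Client IDs would pass validate() with an empty list of problems.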

 

For the data mapping to work, the data source must provide a way to filter the data by client.
 

Adding Client ID Column to your Source Data

The data source needs to provide a column named x__client_id in each source table where appropriate. The LDM design dictates which source tables need the x__client_id column. This column enables the GoodData platform to distribute the data to the right client workspaces.

 



 

If the x__client_id column is not present in the output table, all of the data in the output table will be loaded to each client workspace in the segment. In some cases, this may be the desired behavior. The LDM design will dictate whether an output table needs the x__client_id column or not.
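
The two behaviors (filtering by x__client_id when the column is present, loading everything to every client workspace when it is not) can be illustrated with a short simulation. This is a sketch of the described behavior in Python, not how ADD is implemented: the distribute() function and its arguments are hypothetical, and silently skipping rows whose Client ID matches no workspace is an assumption.

```python
CLIENT_ID_COLUMN = "x__client_id"


def distribute(columns, rows, client_ids):
    """Simulate distributing one table across client workspaces.

    columns:    column names of the table
    rows:       list of dicts, one per row
    client_ids: Client IDs of the client workspaces in the segment
    Returns a dict mapping each Client ID to the rows it would receive.
    """
    if CLIENT_ID_COLUMN not in columns:
        # No client column: every client workspace receives all rows.
        return {client_id: list(rows) for client_id in client_ids}

    result = {client_id: [] for client_id in client_ids}
    for row in rows:
        target = row[CLIENT_ID_COLUMN]
        # Assumption: rows for unknown Client IDs are skipped.
        if target in result:
            result[target].append(row)
    return result


# A table with the client column is split per client.
orders = [
    {"x__client_id": "acme", "amount": 10},
    {"x__client_id": "globex", "amount": 20},
]
per_client = distribute(["x__client_id", "amount"], orders, ["acme", "globex"])

# A shared table without the column goes to every client workspace.
currencies = [{"code": "USD"}, {"code": "EUR"}]
shared = distribute(["code"], currencies, ["acme", "globex"])
```

Here per_client["acme"] holds only the acme row, while shared["acme"] and shared["globex"] both hold the full currencies table.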

 


