# gd-beginners
g
I am new to GoodData and I am evaluating its use for our business use case. I have a few questions, and perhaps someone can provide some insight!
1. Is GoodData loading the data from the source, or is it pulling from the data warehouse (Snowflake)? I need to know for compliance reasons.
2. I have a multi-tenant SaaS application with companies in different time zones. In my case, one client workspace would be in GMT+4, another in GMT+0, and another in GMT-8. All data is stored in UTC, so the date filter needs to adjust based on the user/workspace time zone and query accordingly. Is this achievable? I couldn't find any documentation on this.
m
Hello GJ!
1. This depends on the version you are using. If you use the main Platform, you always need to load your data into GoodData. We do have options for storing data in specific locations, and what is currently available is listed here. GoodData.CN, however, provides a solution where customers keep the data on their side, but it also requires you to manage the whole application from your end. It's an analytics platform focused on the semantic layer, reusability of metrics, business-user self-service, and a multi-tenant environment. You can find the architecture described here. A list of supported data sources can be found in our documentation.
2. May I know which version of our platform you are using? This article may be useful for you: https://help.gooddata.com/pages/viewpage.action?pageId=86788948
g
@Moises Morales thank you kindly for the response. I am using the cloud version right now to trial the solution and understand the proof of concept.
On #1: if I am already using Snowflake, can't GoodData just use that? Is it really necessary to make a third copy of the data (original (Postgres) + data warehouse (Snowflake) + now GoodData)?
On #2: Thanks! I was looking for the same. To understand this correctly, would I set up each client on my side as a "user"? Or is there a different route (perhaps a parameter) for when we use embedding? I ask because it seems that when I onboard a client, I need to provision something over in GoodData.
j
On #2, the recommended setup would be to have a separate workspace for each of your tenants. If a tenant has multiple users, all of them would see the data in the same time zone. You can convert the UTC timestamps in Snowflake to the desired local time zone for each of your tenants (e.g. via a view where a client_id column identifies the tenant and dates are in the tenant's local time zone). Then you should set the same time zone in the workspace configuration on the GoodData side, so that date filters like "last 7 days" use the appropriate time zone. GD.CN does not support multiple time zones yet.
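A minimal Snowflake sketch of such a view. The table and column names (`events`, `created_at_utc`) are hypothetical, and `tenant_timezones` is an assumed lookup table mapping each client to its workspace time zone; only the client_id-identifies-tenant idea comes from the thread:

```sql
-- Assumed lookup table mapping each tenant to its workspace time zone,
-- e.g. ('company_a', 'Asia/Dubai'), ('company_b', 'America/Los_Angeles').
-- The events table and its created_at_utc TIMESTAMP_TZ column are also assumed.
CREATE OR REPLACE VIEW events_local AS
SELECT
    e.client_id,
    e.event_type,
    -- Shift the UTC timestamp into the tenant's local zone and keep only
    -- the date, so it matches the time zone configured on the workspace.
    CONVERT_TIMEZONE('UTC', t.tz_name, e.created_at_utc)::DATE AS event_date
FROM events e
JOIN tenant_timezones t
    ON t.client_id = e.client_id;
```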
g
Hi @Jakub Sterba, thank you for your quick response. Are you saying that each client can have their own workspace and the time zone can be scoped to that workspace? Or is there no native support for multiple time zones, so all client workspaces are essentially tied to the master workspace's time zone?
j
I assume your client is an organisation with many users. Each client has a dedicated workspace where the client's users share insights with each other, and all of them look at the same data in the same time zone. This time zone is configured for the workspace, and dates in the database must be provided in that time zone for that client. Each of your clients can have a different time zone configured for their workspace.
g
@Jakub Sterba thanks again. You are correct, a client is an organization with many users. In my case, at least at the moment, I don't need to scope the time zone to the user, just the org. For example:
• company A = GMT+4
• company B = GMT-8
All data is in a single database in UTC. So when you say "dates in the database must be provided in that time zone for that client", do you mean that the dates will auto-convert from UTC based on the time zone and query accordingly? Or should the data in the database already be in the time zone of the client?
j
The data is distributed to workspaces using the ADD v2 process. You can find how it deals with time zones here: https://help.gooddata.com/doc/enterprise/en/data-integration/data-preparation-and-dis[…]-zones-in-automated-data-distribution-v2-for-data-warehouses
You can probably split your workspaces into multiple segments, one per time zone. Each ADD v2 process would have a different timezone parameter and would convert data from UTC to a different time zone. Another option is to use the CONVERT_TIMEZONE function in the Snowflake data warehouse to convert data from a UTC column of TIMESTAMP_TZ type into a DATE in the given time zone for the workspace. The date dimension in GoodData does not carry any time zone information; as far as I understand, the workspace time zone setting influences only which date counts as the current date for filters like "last 7 days".
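For a single segment, the CONVERT_TIMEZONE route might look like this sketch. The `orders` table and its columns are assumptions for illustration, not part of the thread:

```sql
-- Convert a UTC TIMESTAMP_TZ column into a DATE in one tenant's local
-- zone (GMT-8 / America/Los_Angeles, like "company B" above).
SELECT
    order_id,
    CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', ordered_at)::DATE
        AS order_date_local
FROM orders;
```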
g
Hi @Jakub Sterba, I think I understand. To be sure I got it correct: data is loaded from my source (Snowflake), and then, as I create client workspaces, it is distributed to client-specific workspaces, each of which can be associated with a specific time zone. In that way each client workspace is tied to a specific time zone and the date filters will adjust accordingly. Correct?
j
yes
g
fantastic!
thank you!
But that does raise the question: is the data being duplicated?
j
In the hosted GD platform the data is copied to workspaces. Each workspace contains the data of only one client, which boosts the performance of that client's analytics. In the GoodData Cloud Native platform (GD.CN), which is not hosted by GoodData, the data is not copied, and each client's queries are executed in the data warehouse (Snowflake). GD.CN applies some caching of results but does not replicate the detailed data. That platform does not support multiple time zones yet; at the moment you would need to deploy it separately per time zone to support clients in multiple time zones, until we deliver functionality similar to the hosted GD platform.
g
Hi @Jakub Sterba, that means the only option is the GD platform, which will make copies of the data; the GD.CN platform is not a viable option as of yet, then.
Is it copying the entire data set, or just what is specific to that client as defined by the row-level security?
j
It is copying only the data defined by row-level security (based on the x__client_id column) into workspaces, where clients are well isolated. Only tables with common content, which do not have the x__client_id column, are replicated in full, if such tables exist.
The tables exposed as datasets (the output stage) are typically in a separate database schema, or can be identified by a shared table-name prefix. You do not need to copy the entire data warehouse; you can select only the data marts you want to share with your clients.
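Putting those two conventions together, a hypothetical output-stage table might be built like this. The schema names, the `out_` prefix, and the columns are illustrative assumptions; only the x__client_id convention comes from the thread:

```sql
-- Output-stage table in its own schema, marked with a shared prefix.
-- The x__client_id column drives row-level distribution: each client
-- workspace is loaded with only the rows matching its client id.
CREATE OR REPLACE TABLE output_stage.out_orders AS
SELECT
    client_id AS x__client_id,   -- distribution column
    order_id,
    amount,
    order_date_local
FROM analytics.orders_local;
```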
g
So it's not actually that concerning, because the data is effectively partitioned, which provides better security: there would be no chance of contamination unless a token was hijacked or there was some bug on the GD side during the partitioning process.
j
exactly