I was looking into the data usage of our workspace...
# gooddata-platform
t
I was looking into the data usage of our workspaces, and we’re currently pretty much over our limits. To analyse this more deeply, I’d love to understand:
• how is workspace size distributed across the different datasets?
• within a dataset, how much space does each fact / attribute column take up?
I can of course compute that directly on our Redshift data source, but it appears that space is calculated very differently on GoodData.
i
Hi Thomas, the size of a workspace may actually differ from the size of the uploaded data. Other factors can affect the actual size (cache, all the metadata, ...). Also, the size of the compressed data package may differ from the size of the uploaded data. I am afraid that we are not tracking size at the dataset or single-object level right now. But you can get some hints from the Manage section of your workspace: after choosing a dataset, you should see its Upload History.
t
Hi Ivana, thanks for the answer. I didn’t know about the history of upload sizes. Anyway, this doesn’t really help me much with improving / managing the workspace sizes, since I can’t actively monitor and spot problems. For now I’ll resort to data size on Redshift as a proxy. Maybe one additional question: which data size is the one used for the workspace size?
• size of the data in a CSV file (so, all text)
• size of the data in the data types from the data model (the labels)
• a compressed version of either of the above?
i
Sure thing, you are welcome. Additional info can also be retrieved from your GoodSuccess workspace; your Account Owner @Julie Mullen can help you navigate it. I believe it is the uncompressed data size plus the cache and all the metadata (Dashboards, Insights, Metrics, attributes and their labels, facts, …).
t
thanks!
m
Hi Thomas, let me step in here. Speaking of “Customer Data Size”, which is the metric your contract is based on: it is defined as the total size of uncompressed Customer Data stored in the workspace database. Technical artifacts (indexes, caches, …) are not included. (The definition is included in the “Product Specific Terms”.) Please note that date dimensions also count towards Customer Data Size.
👍 2
🙌 1
t
thanks for clarifying @Marek Zelc!
w
If you look in the log after your ADD runs, the section titled "====================== Downloading and integrating data ======================" lists the bytes for every workspace and every object. I have found this to be a decent indicator of what the success dashboard reports as your workspace size.
🙌 1
t
oh nice! That’s not super straightforward, but I could parse it out and actually have a dataset to work with!
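A rough sketch of what that parsing could look like, in Python. Only the section title "Downloading and integrating data" comes from the thread; the shape of the individual log lines (`workspace=... dataset=... bytes=...`) is purely an assumption for illustration, so `LINE_RE` would need to be adapted to whatever the real ADD log prints. The names `parse_add_log`, `HEADER_RE`, `LINE_RE`, and `SAMPLE_LOG` are all hypothetical.

```python
import re
from collections import defaultdict

# Section title as quoted in the thread.
SECTION_TITLE = "Downloading and integrating data"

# Matches banner lines such as "===== Some title =====".
HEADER_RE = re.compile(r"^=+\s*(?P<title>.*?)\s*=+$")

# ASSUMPTION: the real per-object line format is not known from the thread.
# This pattern is a placeholder and must be adjusted to the actual log.
LINE_RE = re.compile(
    r"workspace=(?P<workspace>\S+).*?dataset=(?P<dataset>\S+).*?bytes=(?P<bytes>\d+)"
)

# Made-up sample input that follows the assumed line format.
SAMPLE_LOG = """\
====================== Some other step ======================
workspace=ws1 dataset=ignored bytes=999
====================== Downloading and integrating data ======================
workspace=ws1 dataset=orders bytes=1200
workspace=ws1 dataset=customers bytes=300
workspace=ws2 dataset=orders bytes=50
workspace=ws1 dataset=orders bytes=800
====================== Done ======================
workspace=ws1 dataset=late bytes=1
"""


def parse_add_log(text):
    """Sum bytes per (workspace, dataset) inside the target log section."""
    totals = defaultdict(int)
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header and header["title"]:
            # A titled banner starts or ends the section of interest.
            in_section = header["title"] == SECTION_TITLE
            continue
        if not in_section:
            continue
        match = LINE_RE.search(line)
        if match:
            key = (match["workspace"], match["dataset"])
            totals[key] += int(match["bytes"])
    return dict(totals)


if __name__ == "__main__":
    # Print the largest objects first.
    for key, size in sorted(parse_add_log(SAMPLE_LOG).items(), key=lambda kv: -kv[1]):
        print(key, size)
```

Summing per `(workspace, dataset)` key means repeated loads of the same object within one run are added together; if the log reports cumulative sizes instead, keeping only the last value per key would be the better choice.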