I'm having a tough time getting a simple summation...
# gooddata-cloud
p
I'm having a tough time getting a simple summation query to run against a BigQuery-based dataset. I have a fact "size" that I'm summing and an attribute "key" (just a string value). "key" has about 50M unique values. Attempting to do such a sum and get back the expected 50M-ish rows (hopefully paged?) isn't returning anything at all -- it sits on "computing" for a long time, then finally fails and points me to my admin. 🙂 What should I expect when attempting to get back lots of rows? Is that not possible on the platform -- no paging or similar mechanism to make the data size tractable?
j
Hi Philip, 50M rows is quite a bit past the limit of 1M cells: https://www.gooddata.com/developers/cloud-native/doc/2.3/deploy-and-install/cloud-native/execution-limits/. You can review our limits in the linked doc, but I'm afraid this insight is just too large.
p
I see, so no paging occurs to allow for that large of a dataset -- my misunderstanding
j
Generally, we store result sets in pages in our caches and provide only the requested pages to UI apps. However, we have to download the whole result set to our caches to be able to sort/pivot on our side. Streaming 50M rows through the BigQuery REST API may be very slow, causing the SQL execution to time out. It is possible to add a TOP(x) filter and display the TOP(x) rows quickly; we push such filters down to data sources.
🙌 1
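To make the two points in the thread concrete, here is a minimal Python sketch. It first estimates how far the requested result exceeds the 1M-cell limit quoted above, then builds the kind of top-N query a pushed-down TOP(x) filter could roughly correspond to on the BigQuery side. Several things here are assumptions: counting cells as rows times columns, the helper names `cell_count` and `top_n_sql`, and the table name `my_project.my_dataset.my_table`. The generated SQL is only an illustration, not necessarily what GoodData actually emits.

```python
# Limit taken from the "1M cells" figure quoted in the thread.
CELL_LIMIT = 1_000_000


def cell_count(rows: int, columns: int) -> int:
    """Estimate result size, assuming cells = rows * columns."""
    return rows * columns


def top_n_sql(table: str, n: int) -> str:
    """Build a top-N aggregation query (hypothetical helper).

    ORDER BY ... LIMIT lets the data source return only n rows,
    so the full grouped result never has to be streamed out.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return (
        "SELECT `key`, SUM(`size`) AS total_size\n"
        f"FROM `{table}`\n"
        "GROUP BY `key`\n"
        "ORDER BY total_size DESC\n"
        f"LIMIT {n}"
    )


# 50M distinct keys, two columns (the attribute and the summed fact).
full_result_cells = cell_count(50_000_000, 2)
print(full_result_cells > CELL_LIMIT)

# Hypothetical table name; a top-100 result is only 200 cells.
print(top_n_sql("my_project.my_dataset.my_table", 100))
print(cell_count(100, 2) <= CELL_LIMIT)
```

Under these assumptions the full result is about 100M cells, well over the limit, while a top-100 result stays far below it, which is why limiting the row count at the data source is the workable route here.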