#gooddata-cn

Pete Lorenz

05/20/2025, 5:40 PM

Hello, we're on GoodData.CN 3.29. We're seeing this failure in the sqlExecutor logs which seems linked to certain insights not rendering:

action=nonRetryableCallFailed location=Location{uri=grpc://10.161.93.82:16001} flightAction=GetFlightInfo message="Call failed. Reason: Flight 'cache/raw_tmp/93ddf4b6-9448-45c5-af8d-a51bde47f676/21248fc8a6ff45a9b19445b64cbb76be/31eff337-aaf7-4738-89f2-f8008ff63009/2723cb2c-65d7-4fd8-b124-ad2ec64bfe2c/535c5a3c8308d2cfb5762ee85ce587a1' does not exist.. Detail: Failed"

To help us debug this issue, we'd like to know what causes this failure and how serious it is. Is there anything we can do to make our calls retryable (instead of non-retryable) so we can avoid these failures? 🙏

Robert Moucha

05/20/2025, 6:55 PM

Hello Pete, may I ask which durableStorageType you use for the quiver pods? https://www.gooddata.com/docs/cloud-native/3.35/install/installation-configuration/#storage We support AWS S3 or a file system. If you use the file system (FS), you need to attach a volume with the ReadWriteMany access mode (a single volume that can be mounted to multiple pods, e.g. NFS, AWS EFS, or similar). If you don't specify a storage class that supports ReadWriteMany, files will be kept locally in the pod that handled the request. When a subsequent request tries to access such a file, you may hit this issue if that request is handled by a different pod.

Pete Lorenz

05/20/2025, 6:57 PM

Hi Robert, thanks for getting back. We use "S3" as the durableStorageType.

Dan Homola

05/21/2025, 7:44 AM

Hi Pete, is this issue persistent or does it resolve on retry? Are you running any cache invalidations at the same time, by any chance? We have seen such errors occasionally when the uploadNotification API is called during a running report execution 😕

Pete Lorenz

05/21/2025, 4:09 PM

Hi Dan, no, we are triggering no active cache invalidations at the time when we see these errors. The error does seem to resolve on retry, but this is very disconcerting for our clients who are generating large batches of reports. When they find that several reports fail due to this exception, they then have to hunt through a long list of reports to identify which ones failed and re-run a smaller batch of failed reports. They do not know that we are using GoodData, so they complain to us that our system is not reliable.
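
Since the error appears to resolve on retry, one client-side stopgap is to have the batch job re-run only the reports that failed, so users do not have to hunt for them by hand. The following is a minimal sketch under stated assumptions, not a GoodData API: run_report is a hypothetical callable supplied by the caller that raises an exception when a report fails to render, and the attempt count and delays are arbitrary illustration values.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def run_batch_with_retries(
    report_ids: Iterable[str],
    run_report: Callable[[str], object],  # hypothetical: raises on failure
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Dict[str, object], Dict[str, Exception]]:
    """Run every report once, then retry only the ones that failed.

    Returns (results, errors): results maps report id to the value returned
    by run_report; errors maps the ids that still fail after max_attempts
    to their last exception.
    """
    results: Dict[str, object] = {}
    errors: Dict[str, Exception] = {}
    pending: List[str] = list(report_ids)

    for attempt in range(1, max_attempts + 1):
        if not pending:
            break
        still_failing: List[str] = []
        for report_id in pending:
            try:
                results[report_id] = run_report(report_id)
                errors.pop(report_id, None)  # succeeded on a retry
            except Exception as exc:
                errors[report_id] = exc
                still_failing.append(report_id)
        pending = still_failing
        # Exponential backoff between passes: base, 2*base, 4*base, ...
        if pending and attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))

    return results, errors
```

Anything left in errors after the last pass is a persistent failure and can be reported to the user as a short list instead of a full batch to re-check.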

Dan Homola

05/22/2025, 7:52 AM

I see 😕 We have had this issue crop up from time to time, especially when invalidations were in play at the same time (that is why I asked about that). However, since it was relatively rare and solvable by a refresh, it has not been prioritized yet, unfortunately.

Pete Lorenz

05/22/2025, 9:29 PM

This is not a rare or sporadic occurrence for us. We are seeing thousands of such events in our logs. We are seeing failures in reports where this is the only issue visible in the logs with the associated traceIds. We will follow up in a support request.

Radek Novacek

05/23/2025, 7:19 AM

Hi Pete, Radek from L2 Technical Support here - when you do open the support ticket, can you include any log examples of this happening you have, with the complete log context (following one traceId start to finish)? As Dan mentioned, since this is relatively rare, it's a little hard to catch - so the more we have, the easier it will be to get to the bottom of this. Many thanks! 🙂
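
To collect the complete log context for each affected trace, the log lines can be grouped by trace and filtered to the traces that contain the failure. This is a rough sketch, assuming each log line carries its trace as a traceId=<value> field (the exact field format in your logs may differ); the failure marker is taken from the log line quoted earlier in this thread.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumption: the trace appears as traceId=<value>, optionally quoted.
TRACE_RE = re.compile(r'\btraceId="?([0-9A-Za-z-]+)"?')
# Marker copied from the sqlExecutor log line quoted in this thread.
FAILURE_MARKER = "action=nonRetryableCallFailed"


def failing_traces(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group log lines by traceId and keep only traces that hit the failure.

    Lines are kept in their original order, so each returned list follows
    one trace from start to finish. Lines without a traceId are ignored.
    """
    by_trace: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            by_trace[match.group(1)].append(line.rstrip("\n"))
    return {
        trace_id: trace_lines
        for trace_id, trace_lines in by_trace.items()
        if any(FAILURE_MARKER in entry for entry in trace_lines)
    }
```

For example, passing an open log file (with open(path) as f: failing_traces(f)) returns one ordered list of lines per failing trace, which can be attached to the support ticket as-is.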

Pete Lorenz

05/23/2025, 4:32 PM

Hi Radek, absolutely. I will include the full logs with our request.
