Pete Lorenz
01/22/2024, 9:19 PM
{
"ts": "2024-01-22 20:14:55.724",
"level": "ERROR",
"logger": "com.gooddata.tiger.grpc.client.resultcache.ResultCacheClient",
"thread": "DefaultDispatcher-worker-1",
"traceId": "2098d9d948e8cfb5",
"spanId": "2098d9d948e8cfb5",
"userId": "james.lee",
"orgId": "eb6f13c6-4b73-4a09-af89-adfb572d972e",
"tokenId": "",
"msg": "getResultCacheResponse",
"action": "grpcClientCall",
"exc": "kotlinx.coroutines.JobCancellationException: MonoCoroutine was cancelled; job=MonoCoroutine{Cancelling}@77b61a6c\n"
}
Repro steps:
1. Log into a particular tenant with LDM referencing our Snowflake data source
2. Click the "Analyze" tab, observe "Untitled insight" and a list of fields on the left
3. Drag one of the columns with data to the "ROWS" field in the new insight. In the display area, observe the "Computing" progress indicator
4. The operation appears to time out with the above error in the logs and 404 response from the following endpoint:
https://[hostname]/api/v1/actions/workspaces/b922ec833042440287f8e8837607fb6b/execution/afm/execute/result/6a0c80d11e875a869be6dd2f3c8b727b05046bb4?offset=0%2C0&limit=100%2C1000
The 404 response has the following content:
{
"title": "Not Found",
"status": 404,
"detail": "Result not found in result cache",
"resultId": "6a0c80d11e875a869be6dd2f3c8b727b05046bb4",
"traceId": "76ad754f13aaf80a"
}
Note: this procedure works with the same data source on our GCP instance of the same tenant / organization, but times out on our AWS instance of this tenant / organization. The insights were migrated from v2.3.2 on GCP to v3.1.0 on AWS using the AIO method provided in the above thread. The timeout occurs at exactly 3 minutes.
I'm attaching the logs for afm-exec-api and result-cache for the relevant time period. Please let us know if any additional info would be helpful to debug this issue. Thanks for any help.
Pete Lorenz
01/22/2024, 9:30 PM
Jan Kos
01/23/2024, 1:53 PM
The result ID (6a0c80d11e875a869be6dd2f3c8b727b05046bb4 in this case) is generated by afm-exec-api during insight execution and registered in the result cache.
If a result is already in the result cache, it is picked up immediately and the insight is displayed in the UI. If the result is not in the result cache, SQL queries are generated and executed against the data source.
The attached logs show only the AFM execution registering the result ID in the result cache and then polling for the result, which ends in a 404 because the result still can't be found after the said 3 minutes.
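To see whether the 3-minute cutoff also shows up outside the UI, the result fetch described above can be timed directly. This is only a minimal sketch: the URL shape is copied from the failing request earlier in this thread, while `result_url`, `time_result_fetch`, and the `fetch` callable are hypothetical names introduced for illustration. `fetch` is assumed to wrap whatever HTTP client and credentials you already use and to return a `(status_code, body_text)` pair.

```python
import json
import time
from urllib.parse import urlencode


def result_url(host, workspace_id, result_id, offset=(0, 0), limit=(100, 1000)):
    """Build a result URL in the same shape as the failing request in this thread."""
    query = urlencode({
        "offset": ",".join(str(n) for n in offset),
        "limit": ",".join(str(n) for n in limit),
    })
    return (
        f"https://{host}/api/v1/actions/workspaces/{workspace_id}"
        f"/execution/afm/execute/result/{result_id}?{query}"
    )


def time_result_fetch(fetch, url, clock=time.monotonic):
    """Call fetch(url) once and report status, elapsed seconds, and error detail.

    `fetch` is a hypothetical callable returning (status_code, body_text).
    It should use a client-side timeout longer than the 3 minutes reported
    here, so that the server-side limit is what gets measured.
    """
    started = clock()
    status, body = fetch(url)
    elapsed = clock() - started
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, TypeError, AttributeError):
        detail = None  # body was missing or not a JSON object
    return {"status": status, "elapsed_s": elapsed, "detail": detail}
```

Running this against both the GCP and the AWS deployment with the same result ID would show whether the elapsed time before the 404 is the same fixed value on each, which helps separate a proxy or gateway timeout from a slow query.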
I would recommend the following:
• For the sake of completeness, clear the data source caches by calling the uploadNotification API.
• Try to execute the insight in Analytical Designer again (clearing caches often resolves various issues), and replace the /edit part of the URL with /debug. It will download a zip file with AFM explain resources, including an SQL file containing the query that is executed in the data source.
• You can try to execute the query directly in the data source to see how long it takes to finish, and/or compare it with the explain resource in your other deployment where it works.
• It might be worth checking the SQL_EXECUTOR pod logs for any timeouts on SQL queries.
Pete Lorenz
01/23/2024, 3:21 PM
Pete Lorenz
01/23/2024, 3:40 PM
2024/01/23 15:33:22 [error] 38#38: *13805101 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.101.3.89, server: stg1-gooddata-cn-zwi2zjezyz.clearwateranalytics.com, request: "GET /api/v1/actions/workspaces/b922ec833042440287f8e8837607fb6b/execution/afm/execute/result/d1392dfba7cd1a12615cfedc61e82e315108039d?offset=0%2C0&limit=100%2C1000 HTTP/1.1", upstream: "http://10.162.4.4:9000/api/v1/actions/workspaces/b922ec833042440287f8e8837607fb6b/execution/afm/execute/result/d1392dfba7cd1a12615cfedc61e82e315108039d?offset=0%2C0&limit=100%2C1000", host: "stg1-gooddata-cn-zwi2zjezyz.clearwateranalytics.com", referrer: "https://stg1-gooddata-cn-zwi2zjezyz.clearwateranalytics.com/analyze/"
Trying with /debug but not seeing where I can download a zip file, nothing is downloaded, just getting 404 ...
Pete Lorenz
01/23/2024, 4:55 PM
Pete Lorenz
01/23/2024, 5:15 PM
Jan Kos
01/23/2024, 5:32 PM
Pete Lorenz
01/23/2024, 5:36 PM
Pete Lorenz
01/23/2024, 5:44 PM
Pete Lorenz
04/01/2024, 3:53 PM
Pete Lorenz
04/01/2024, 4:45 PM
Pete Lorenz
04/01/2024, 10:26 PM
Jan Kos
04/02/2024, 12:22 PM