# gooddata-cn
v
Greetings... I have a condition on GoodData.CN community edition, that is really confusing me
I have a data model with several levels... org, project, repository, contributor, with the commit object being at the center. commit -> repo -> project -> org
I have a page that queries activity in a week... so it fires off potentially 7 column charts at once
this query works fine when filtering by org Id, but it fails miserably when filtering by project, repo, or contributor
I am filtering normally on all those attributes in probably 20 or more other places
But this chart fails, by locking up the UI
and these messages are given in the GD server log
skan-gd | ts="2022-08-20 13:57:38.756" level=WARN msg="Unable to send telemetry data" logger=com.gooddata.tiger.telemetry.MatomoTelemetryReporter thread=pool-5-thread-1 exc="java.util.concurrent.ExecutionException: java.net.ConnectException: Timeout connecting to [matomo.anywhere.gooddata.com/3.69.247.149:443]
skan-gd | 	at org.apache.http.concurrent.BasicFuture.getResult(BasicFuture.java:71)
skan-gd | 	at org.apache.http.concurrent.BasicFuture.get(BasicFuture.java:84)
skan-gd | 	at org.apache.http.impl.nio.client.FutureWrapper.get(FutureWrapper.java:70)
skan-gd | 	at com.gooddata.tiger.telemetry.MatomoTelemetryReporter.logError(MatomoTelemetryReporter.kt:262)
skan-gd | 	at com.gooddata.tiger.telemetry.MatomoTelemetryReporter.access$logError(MatomoTelemetryReporter.kt:25)
skan-gd | 	at com.gooddata.tiger.telemetry.MatomoTelemetryReporter$logError$1.invokeSuspend(MatomoTelemetryReporter.kt)
skan-gd | 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
skan-gd | 	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
skan-gd | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
skan-gd | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
skan-gd | 	at java.base/java.lang.Thread.run(Unknown Source)
skan-gd | Caused by: java.net.ConnectException: Timeout connecting to [matomo.anywhere.gooddata.com/3.69.247.149:443]
skan-gd | 	at org.apache.http.nio.pool.RouteSpecificPool.timeout(RouteSpecificPool.java:169)
skan-gd | 	at org.apache.http.nio.pool.AbstractNIOConnPool.requestTimeout(AbstractNIOConnPool.java:632)
skan-gd | 	at org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.timeout(AbstractNIOConnPool.java:898)
skan-gd | 	at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:198)
skan-gd | 	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:213)
skan-gd | 	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:158)
skan-gd | 	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351)
skan-gd | 	at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221)
skan-gd | 	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
skan-gd | 	... 1 more
skan-gd | 172.28.0.1 - - [20/Aug/2022:13:57:29 +0000] "OPTIONS /api/v1/actions/workspaces/app/execution/afm/execute/result/e23ed2beb1cf72a29c7b12ce3f54bdf5240f3384 HTTP/1.1" 204 0 "http://localhost:4200/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"
skan-gd | ts="2022-08-20 13:57:29.755" level=WARN msg="Unable to send telemetry data" logger=com.gooddata.tiger.telemetry.MatomoTelemetryReporter thread=pool-5-thread-1 exc="java.util.concurrent.ExecutionException: java.net.ConnectException: Timeout connecting to [matomo.anywhere.gooddata.com/3.69.247.149:443]
skan-gd | 	at org.apache.http.concurrent.BasicFuture.getResult(BasicFuture.java:71)
skan-gd | 	at org.apache.http.concurrent.BasicFuture.get(BasicFuture.java:84)
skan-gd | 	at org.apache.http.impl.nio.client.FutureWrapper.get(FutureWrapper.java:70)
skan-gd | 	at com.gooddata.tiger.telemetry.MatomoTelemetryReporter.logError(MatomoTelemetryReporter.kt:262)
skan-gd | 	at com.gooddata.tiger.telemetry.MatomoTelemetryReporter.access$logError(MatomoTelemetryReporter.kt:25)
skan-gd | 	at com.gooddata.tiger.telemetry.MatomoTelemetryReporter$logError$1.invokeSuspend(MatomoTelemetryReporter.kt)
skan-gd | 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
skan-gd | 	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
skan-gd | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
skan-gd | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
skan-gd | 	at java.base/java.lang.Thread.run(Unknown Source)
skan-gd | Caused by: java.net.ConnectException: Timeout connecting to [matomo.anywhere.gooddata.com/3.69.247.149:443]
skan-gd | 	at org.apache.http.nio.pool.RouteSpecificPool.timeout(RouteSpecificPool.java:169)
skan-gd | 	at org.apache.http.nio.pool.AbstractNIOConnPool.requestTimeout(AbstractNIOConnPool.java:632)
skan-gd | 	at org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.timeout(AbstractNIOConnPool.java:898)
skan-gd | 	at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:198)
skan-gd | 	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:213)
skan-gd | 	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:158)
skan-gd | 	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351)
skan-gd | 	at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221)
skan-gd | 	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
skan-gd | 	... 1 more
skan-gd | "
The messages in the server log scroll like there are too many requests made when requesting the project, repo, or contributor context, but it works fine with the org context
I am stumped
any insight would be appreciated
@hunger regnuh ^^
r
@Vincil Bishop Hi Vincil, the error you see is related to the telemetry component. Maintenance performed on our side a few hours ago made the telemetry service unavailable. It should be OK now, please try again. Sorry for the inconvenience.
On the other hand, the client-side telemetry component should fail fast when the telemetry server is not accessible. Unfortunately, this is not the case. I will report it to our dev team so this doesn't happen next time.
v
But my question would be...why does it work when filtering for one attribute, but not for the others?
I just did the query again and the same result...is this something I would need to restart the server to correct?
r
Honestly, I don't know. The backend sends telemetry data asynchronously, so it should not affect report computation if the telemetry service is unavailable. But there's also a telemetry collector running in the UI app (in the browser). I don't have much info about its implementation, sorry. I will ask the UI developers on Monday.
v
appreciate the weekend response... this is not super super blocking... I'd like to get it working, now more just to understand what I did to break it so I can understand more about the capabilities of the system
In a way I don't think it's due to an external component being down... as the query works like clockwork in the org context, but no others
it's not like this is holding up production or anything
super weird and not something easy to troubleshoot
a moment will come soon when all will be revealed, hahaha
we always figure it out eventually
h
can’t you enable log trace and have it show the sql query?
r
Yes it is possible
v
yikes...like on the... gd postgres instance?
h
yep, then just run it natively in pgsql and trace the problem that way - i’m thinking there’s some kind of recursive lookup happening
v
possibly
but I cleaned up the LDM to remove any duplicate paths
it's all pretty straightforward now
it's like it kicks off too many requests, then a backend mechanism gets overloaded...then the timeouts start
r
Either on postgres, or you can start the docker container with "-e APP_LOGLEVEL=INFO" (default is WARN) to emit the SQL queries generated by the Calcique component.
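With more logging enabled, the telemetry warnings quoted earlier can bury the lines worth reading. A minimal Python sketch for filtering them out follows. It assumes the extra lines use the same `ts=... level=... msg=... logger=...` layout as the WARN lines above, and that the input comes from something like `docker compose logs`. The function names are made up for illustration.

```python
import re

# Optional docker compose prefix such as "skan-gd | ".
PREFIX = re.compile(r'^[\w.-]+\s*\|\s')
# key=value pairs where the value is double-quoted or a bare token.
PAIR = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')


def parse_line(line):
    """Return the key=value fields of a log record, or None for other lines.

    Stack-trace frames and HTTP access-log lines do not start with ts=,
    so they are treated as non-records and return None.
    """
    body = PREFIX.sub("", line, count=1)
    if not body.startswith("ts="):
        return None
    fields = {}
    for match in PAIR.finditer(body):
        key, quoted, bare = match.groups()
        # An unterminated quoted value (the multi-line exc="...) falls back
        # to the bare form and keeps its leading quote.
        fields[key] = quoted if quoted is not None else bare
    return fields


def without_telemetry(lines):
    """Yield parsed records, skipping those from the Matomo telemetry logger."""
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if record.get("logger", "").endswith("MatomoTelemetryReporter"):
            continue
        yield record
```

Feeding it the captured log line by line leaves only the non-telemetry records, which makes it easier to see whether the failing filters produce anything else in the log.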
v
Got it... will try the enhanced log level
and this is on local machine BTW
in docker compose
r
v
but it also happens in the cloud
a few nice gems in the advanced config doc
r
We're going to write a detailed article dedicated to performance, but it will focus on the Helm chart version. gooddata-cn-ce (the single docker image) is a bit limited by the technology it uses.
v
sure
r
I plan to make some improvements w.r.t. memory tuning for specific microservices.
Currently you can only set JVM parameters for all services at once. But if you have plenty of RAM, you could give it a try.
v
I think it's model related
as it works on one context, organization
and not others... project, repo, contributor
r
I see, can you share the LDM screenshot? It looks like github stats or so
one of my colleagues just published an article on this topic: https://medium.com/gooddata-developers/how-to-build-a-modern-data-pipeline-cfdd9d14fbea
I can't identify any weak spot at the moment. When you increase the log level, you can grab the SQL from the log. Alternatively, there's a so-called "explainAFM" resource on our API: https://www.gooddata.com/developers/cloud-native/doc/cloud/api-and-sdk/api/api_reference_all/#/actions/explainAFM If you are using the Analyze tab, you can get the same details there by replacing the "edit" part of the URL with "debug". A zip file containing diagnostic information, including the SQL, will be downloaded.
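The "edit" to "debug" swap can be sketched as a small Python helper. The example URL in the usage note is hypothetical (the real Analyze URL layout may differ); the only thing taken from the message above is that the "edit" part of the URL is replaced with "debug".

```python
def to_debug_url(url):
    """Replace the last path segment that is exactly 'edit' with 'debug'."""
    parts = url.split("/")
    # Walk from the end so only the trailing 'edit' segment is changed.
    for i in range(len(parts) - 1, -1, -1):
        if parts[i] == "edit":
            parts[i] = "debug"
            return "/".join(parts)
    raise ValueError("no 'edit' segment found in URL")
```

For instance, a made-up URL such as `http://localhost:3000/analyze/#/app/abc123/edit` would become `http://localhost:3000/analyze/#/app/abc123/debug`.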
v
thanks a bunch... will give that a try and let you know