Daniel Chýlek
09/09/2021, 5:22 PM
Robert Moucha
09/10/2021, 7:22 AM
JAVA_OPTS="-Xmx2g -XX:+UseStringDeduplication -XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled"
In the upcoming release, there will be a configurable limit on the number of returned data cells to prevent possible OOM events.
Daniel Chýlek
09/10/2021, 11:35 AM
Daniel Chýlek
09/10/2021, 12:14 PM
172.30.0.1 - - [10/Sep/2021:12:03:10 +0000] "GET /api/entities/workspaces/0fabe453644a413182f16a78c98778fd?metaInclude=config HTTP/1.1" 200 314 "<http://localhost:3000/analyze/>" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0"
172.30.0.1 - - [10/Sep/2021:12:03:10 +0000] "GET /api/entities/workspaces/0fabe453644a413182f16a78c98778fd/metrics?size=250&tags=&page=0 HTTP/1.1" 200 269 "<http://localhost:3000/analyze/>" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0"
172.30.0.1 - - [10/Sep/2021:12:03:10 +0000] "GET /api/entities/workspaces/0fabe453644a413182f16a78c98778fd/facts?size=250&tags=&page=0 HTTP/1.1" 200 532 "<http://localhost:3000/analyze/>" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0"
172.30.0.1 - - [10/Sep/2021:12:03:10 +0000] "GET /api/entities/workspaces/0fabe453644a413182f16a78c98778fd/attributes?size=250&include=labels&tags=&page=0 HTTP/1.1" 200 15346 "<http://localhost:3000/analyze/>" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0"
172.30.0.1 - - [10/Sep/2021:12:03:10 +0000] "GET /api/entities/workspaces/0fabe453644a413182f16a78c98778fd/attributes?size=250&include=labels%2Cdatasets&tags=&page=0 HTTP/1.1" 200 18635 "<http://localhost:3000/analyze/>" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0"
172.30.0.1 - - [10/Sep/2021:12:03:10 +0000] "GET /api/entities/workspaces/0fabe453644a413182f16a78c98778fd/attributes?size=250&include=labels%2Cdatasets&tags=&page=0 HTTP/1.1" 200 18635 "<http://localhost:3000/analyze/>" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0"
Warning: Nashorn engine is planned to be removed from a future JDK release
ts="2021-09-10 12:03:13.562" level=ERROR msg="Bad Request" logger=com.gooddata.tiger.web.exception.BaseExceptionHandling thread=DefaultDispatcher-worker-2 orgId=default spanId=7fece6104f81626d traceId=7fece6104f81626d userId=demo exc="errorType=com.gooddata.tiger.afm.tools.ResultCacheResponseError, message=An error has occurred during the listing of label elements
at com.gooddata.tiger.afm.service.ElementsProcessor.process$suspendImpl(ElementsProcessor.kt:67)
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
|_ checkpoint ⇢ Handler com.gooddata.tiger.afm.controller.ElementsController#processElementsRequest(String, String, ElementsOrder, boolean, boolean, String, int, int, float, boolean, ServerHttpRequest, Continuation) [DispatcherHandler]
Stack trace:
at com.gooddata.tiger.afm.service.ElementsProcessor.process$suspendImpl(ElementsProcessor.kt:67)
at com.gooddata.tiger.afm.service.ElementsProcessor$process$1.invokeSuspend(ElementsProcessor.kt)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTaskKt.resume(DispatchedTask.kt:175)
at kotlinx.coroutines.DispatchedTaskKt.resumeUnconfined(DispatchedTask.kt:137)
at kotlinx.coroutines.DispatchedTaskKt.dispatch(DispatchedTask.kt:108)
at kotlinx.coroutines.CancellableContinuationImpl.dispatchResume(CancellableContinuationImpl.kt:308)
at kotlinx.coroutines.CancellableContinuationImpl.resumeImpl(CancellableContinuationImpl.kt:318)
at kotlinx.coroutines.CancellableContinuationImpl.resumeWith(CancellableContinuationImpl.kt:250)
at kotlinx.coroutines.channels.AbstractChannel$ReceiveElement.resumeReceiveClosed(AbstractChannel.kt:877)
at kotlinx.coroutines.channels.AbstractSendChannel.helpClose(AbstractChannel.kt:312)
at kotlinx.coroutines.channels.AbstractSendChannel.close(AbstractChannel.kt:241)
at kotlinx.coroutines.channels.SendChannel$DefaultImpls.close$default(Channel.kt:102)
at kotlinx.coroutines.channels.ProducerCoroutine.onCompleted(Produce.kt:137)
at kotlinx.coroutines.channels.ProducerCoroutine.onCompleted(Produce.kt:130)
at kotlinx.coroutines.AbstractCoroutine.onCompletionInternal(AbstractCoroutine.kt:104)
at kotlinx.coroutines.JobSupport.tryFinalizeSimpleState(JobSupport.kt:294)
at kotlinx.coroutines.JobSupport.tryMakeCompleting(JobSupport.kt:853)
at kotlinx.coroutines.JobSupport.makeCompletingOnce$kotlinx_coroutines_core(JobSupport.kt:825)
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:111)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at kotlinx.coroutines.internal.ScopeCoroutine.afterResume(Scopes.kt:32)
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:113)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:738)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
"
There are no other exceptions between "All services of GoodData.CN are ready" and this point.
The logical table model is generated just by scanning my testing database. The booking/value type tables have at most 3 rows each, and their labels are not loading. The product table has about 500 rows, and its labels load fine. The sales table has ~1M rows, which is 1-2 orders of magnitude fewer than we usually expect it to have, and it can load labels on one of its attributes.
Obviously we have no issue querying a table with 3 rows in it from pgAdmin, so I don't believe the problem is on the side of the database. Is there a way to enable more verbose logging to figure out what's going on, or anything else that can help diagnose the issue?
Robert Moucha
09/12/2021, 10:52 PM
docker run -it --rm -v gd-volume:/data busybox
And then remove the directory /data/pulsar/standalone in this container.
rm -rf /data/pulsar/standalone/
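The two manual steps above can probably be collapsed into a single non-interactive command. This is only a sketch: the function name `reset_pulsar_state` and the `GD_VOLUME` and `DOCKER` variables are made up for illustration, and it assumes the GoodData.CN CE container is stopped and the volume is named gd-volume as in the command above.

```shell
#!/bin/sh
# Sketch only: one-shot variant of the manual cleanup described above.
# GD_VOLUME, DOCKER and reset_pulsar_state are illustrative names, not part of GoodData.CN.

# Name of the Docker volume holding the GoodData.CN CE data (defaults to gd-volume).
VOLUME="${GD_VOLUME:-gd-volume}"
# Docker binary to call; overridable, e.g. DOCKER=echo to preview the command.
DOCKER="${DOCKER:-docker}"

reset_pulsar_state() {
    # Mount the volume at /data in a throwaway busybox container and
    # delete the Pulsar standalone directory; --rm discards the container afterwards.
    "$DOCKER" run --rm -v "$VOLUME:/data" busybox rm -rf /data/pulsar/standalone
}
```

Calling `reset_pulsar_state` after sourcing the snippet performs the cleanup. Because the busybox container exits on its own once `rm` finishes, there is no interactive shell to leave.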
Exit the ephemeral container and start the GoodData.CN CE again, with this "fixed" volume. This is a temporary fix, as the OOM can happen again. The next release will make it more resilient.
Daniel Chýlek
10/04/2021, 7:56 AM
Robert Moucha
10/08/2021, 8:55 AM