# gooddata-cn
Kshirod Mohanty
Hi All, we are doing load testing with GoodData-cn, version 1.7.2, using GCP Redis (Memorystore). I see loads of the error below in the CalciqueGrpcService log. The error first happened when we tried a higher load (2x), but now it is happening even with the smaller load (x) and is very frequent. What is the reason for this error, and what actions should we take to resolve it? I don't see any issues on the Redis side, though.

```
{"exc":"org.springframework.dao.QueryTimeoutException: Redis command timed out; nested exception is io.lettuce.core.RedisCommandTimeoutException: Command timed out after 10 second(s)
    at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:70)
    at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
    at org.springframework.data.redis.connection.lettuce.LettuceReactiveRedisConnection.lambda$translateException$0(LettuceReactiveRedisConnection.java:293)
    at reactor.core.publisher.Flux.lambda$onErrorMap$28(Flux.java:6911)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)
    at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onError(MonoFlatMapMany.java:255)
    at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onError(ScopePassingSpanSubscriber.java:95)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onError(FluxMapFuseable.java:140)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onError(FluxMapFuseable.java:140)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)
    at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onError(ScopePassingSpanSubscriber.java:95)
    at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)
    at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)
    at io.lettuce.core.RedisPublisher$ImmediateSubscriber.onError(RedisPublisher.java:891)
    at io.lettuce.core.RedisPublisher$State.onError(RedisPublisher.java:712)
    at io.lettuce.core.RedisPublisher$RedisSubscription.onError(RedisPublisher.java:357)
    at io.lettuce.core.RedisPublisher$SubscriptionCommand.onError(RedisPublisher.java:797)
    at io.lettuce.core.RedisPublisher$SubscriptionCommand.doOnError(RedisPublisher.java:793)
    at io.lettuce.core.protocol.CommandWrapper.completeExceptionally(CommandWrapper.java:128)
    at io.lettuce.core.protocol.CommandExpiryWriter.lambda$potentiallyExpire$0(CommandExpiryWriter.java:175)
    at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
    at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 10 second(s)
    at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59)
    at io.lettuce.core.protocol.CommandExpiryWriter.lambda$potentiallyExpire$0(CommandExpiryWriter.java:176)
    ... 7 more",
 "level":"ERROR", "logger":"com.gooddata.tiger.calcique.service.CalciqueGrpcService", "msg":"gRPC server call", "orgId":"xxxxxxxxxx", "spanId":"f47287a4344b2f3d", "thread":"DefaultDispatcher-worker-6", "traceId":"4c3ca0481448f410", "ts":"2022-10-14 031313.869", "userId":"admin"}
```
Is there any TTL for the keys written to Redis?
This is the error in the gooddata-cn-result-cache service:

```
{"id":"878c2233f1a9fa09340d4652f5fe4e29", "level":"ERROR", "logger":"com.gooddata.tiger.cache.result.raw.service.RawCacheStore", "msg":"Store cache - unknown state of cache", "orgId":"<undefined>", "spanId":"784bd2542e41e8a1", "state":"NOT_FOUND", "thread":"DefaultDispatcher-worker-1", "traceId":"167288484439983d", "ts":"2022-10-14 055134.023", "userId":"<undefined>"}
```
j
@Ondrej Stumpf, please help.
Ondrej Stumpf
hi @Kshirod Mohanty, the fact that you see Redis exceptions in multiple services (calcique, result-cache, ...) suggests there is a problem with Redis connectivity. The `unknown state of cache` message might also suggest that Redis is overloaded and is evicting records, which can cause inconsistency issues. Can you please double-check that Redis has enough memory? There is indeed a TTL set for the caches, but the actual value differs per key type, so there is no generic answer.
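If you want to see what is actually happening on the Redis side, a quick look at key TTLs and the memory/eviction counters usually tells you whether it is memory pressure rather than pure connectivity. A minimal Lettuce sketch is below; the endpoint and the sample key name are assumptions for your environment, and `redis-cli TTL <key>` / `redis-cli INFO memory` give the same information.

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

public class RedisCacheInspection {
    public static void main(String[] args) {
        // Assumed Memorystore endpoint - adjust to your setup.
        RedisClient client = RedisClient.create("redis://10.0.0.3:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisCommands<String, String> redis = conn.sync();

            // TTL of a sample cache key: -1 means no expiry, -2 means the key does not exist.
            String sampleKey = "someCacheKey"; // hypothetical key name, pick one from your instance
            System.out.println("TTL of " + sampleKey + ": " + redis.ttl(sampleKey) + "s");

            // Memory usage and eviction statistics; a growing evicted_keys counter
            // under load points to memory pressure.
            System.out.println(redis.info("memory"));
            System.out.println(redis.info("stats"));
        } finally {
            client.shutdown();
        }
    }
}
```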
r
Regarding `Redis command timed out`: Redis responses are usually sub-millisecond. If Redis didn't respond within 10 seconds, it suggests there are connectivity issues. Please check whether Redis is accessible from the cluster.
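One way to verify reachability from inside the cluster is to run a PING with a short client timeout from a pod in the same network. The sketch below uses Lettuce (the same client the services use); the host, port, and the 2-second timeout are assumptions, and `redis-cli -h <host> ping` from a debug pod works equally well.

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.api.StatefulRedisConnection;

import java.time.Duration;

public class RedisPingCheck {
    public static void main(String[] args) {
        // Assumed Memorystore endpoint; run this from a pod inside the cluster/VPC.
        RedisURI uri = RedisURI.builder()
                .withHost("10.0.0.3")
                .withPort(6379)
                .withTimeout(Duration.ofSeconds(2)) // fail fast instead of waiting 10 s
                .build();
        RedisClient client = RedisClient.create(uri);
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            long start = System.nanoTime();
            String pong = conn.sync().ping();
            double millis = (System.nanoTime() - start) / 1_000_000.0;
            // A healthy in-VPC round trip should be on the order of a millisecond or less.
            System.out.printf("%s in %.3f ms%n", pong, millis);
        } finally {
            client.shutdown();
        }
    }
}
```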
Kshirod Mohanty
We upgraded the GCP Memorystore (Redis) memory from 5 GB to 50 GB and flushed all the keys. We are still seeing the errors for higher loads; for smaller loads everything looks fine.
r
It seems we are missing an important piece of information in our documentation about the Redis configuration. The Redis instance should have the `maxmemory-policy=allkeys-lru` config key set. Please update your Redis configuration - with this policy, older keys get evicted automatically when memory is full. I will add this setting to the documentation.
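For reference, here is a sketch of how the effective policy can be checked and changed via a client. Note that GCP Memorystore typically restricts the `CONFIG` command, so there the policy is usually checked and changed through the instance configuration instead (e.g. `gcloud redis instances update <instance> --region=<region> --update-redis-config maxmemory-policy=allkeys-lru`); the snippet below applies mainly to self-managed Redis, and the endpoint is an assumption.

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public class MaxMemoryPolicyCheck {
    public static void main(String[] args) {
        // Assumed Redis endpoint - adjust to your environment.
        RedisClient client = RedisClient.create("redis://10.0.0.3:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisCommands<String, String> redis = conn.sync();

            // Read the currently effective eviction policy.
            Map<String, String> current = redis.configGet("maxmemory-policy");
            System.out.println("maxmemory-policy = " + current.get("maxmemory-policy"));

            // On self-managed Redis the policy can be switched at runtime;
            // Memorystore rejects this, so use the instance config (gcloud) there.
            // redis.configSet("maxmemory-policy", "allkeys-lru");
        } finally {
            client.shutdown();
        }
    }
}
```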