# gooddata-cn
n
Hi, we are facing one more issue with GoodData version 3.24.1. After upgrading the Kubernetes node, 2 out of 3 etcd pods keep restarting. The log is attached below. Please let us know how to address this issue.
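(For anyone reproducing the diagnosis, a minimal sketch of how the pod status and the attached log can be collected; the namespace and pod name are placeholders, and the label selector is an assumption based on the commands used later in this thread.)

```sh
# List the etcd pods and their restart counts
kubectl -n YOUR-NAMESPACE get pods -l app.kubernetes.io/component=etcd

# Logs of the previously crashed container of one restarting pod
kubectl -n YOUR-NAMESPACE logs gooddata-cn-etcd-1 --previous
```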
m
Hi Noushad, hopefully my colleague @Robert Moucha has some insight and can help with this 🙂
👍 1
r
This may occasionally happen; older etcd chart versions are susceptible to this issue. Etcd is designed to keep data as long as a majority of pods (2 of 3, in your case) are running. It cannot survive the loss of 2 or more pods. If that has already happened, please do the following:
1. Make sure the env variable ETCD_INITIAL_CLUSTER_STATE is set to `new` in the gooddata-cn-etcd StatefulSet. If it is set to `existing`, patch the StatefulSet and set it to `new` (see the sketch after this list).
2. Delete all 3 PVCs belonging to etcd: `kubectl -n YOUR-NAMESPACE delete pvc -l app.kubernetes.io/component=etcd --wait=false`. The argument `--wait=true` is important.
3. Delete all etcd pods at once: `kubectl -n YOUR-NAMESPACE delete pod -l app.kubernetes.io/component=etcd`. Pods will be deleted along with their PVCs and then recreated, and the resulting cluster will be operational.
4. Patch the etcd StatefulSet and set ETCD_INITIAL_CLUSTER_STATE back to `existing`.
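For reference, a minimal sketch of the whole sequence, assuming the StatefulSet is named gooddata-cn-etcd and the pods carry the app.kubernetes.io/component=etcd label used above (adjust the namespace, names, and env handling to your installation; the chart may manage this variable differently):

```sh
NS=YOUR-NAMESPACE

# Step 1: make sure the cluster bootstraps as "new"
kubectl -n "$NS" set env statefulset/gooddata-cn-etcd ETCD_INITIAL_CLUSTER_STATE=new

# Step 2: delete the etcd PVCs without waiting
# (they stay Terminating until the pods that use them are gone)
kubectl -n "$NS" delete pvc -l app.kubernetes.io/component=etcd --wait=false

# Step 3: delete all etcd pods at once; the StatefulSet recreates pods and PVCs
kubectl -n "$NS" delete pod -l app.kubernetes.io/component=etcd

# Wait until all replicas are back up
kubectl -n "$NS" rollout status statefulset/gooddata-cn-etcd

# Step 4: switch the bootstrap state back to "existing"
kubectl -n "$NS" set env statefulset/gooddata-cn-etcd ETCD_INITIAL_CLUSTER_STATE=existing
```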
🙏 1
The new gooddata-cn Helm chart (since version 3.37.0) uses a newer etcd chart with better cluster-membership handling, and it is more stable than the old one. Consider upgrading (sketch below).
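(A sketch only; the release name, namespace, repo alias, and values file are assumptions — check the GoodData.CN upgrade documentation for your installation before running anything.)

```sh
helm repo update
helm -n YOUR-NAMESPACE upgrade gooddata-cn gooddata/gooddata-cn \
  --version 3.37.0 -f your-values.yaml
```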
n
Thank you @Robert Moucha 🙂 we will try it and update you
In step 2, are we supposed to use `--wait=false` or `--wait=true`?
r
Sorry, `--wait=false` is correct. The PVCs cannot actually be removed while the pods still use them, so the delete command should return immediately; the PVCs stay in Terminating and get cleaned up once the pods are deleted in step 3.
👍 1