Our etcd statefulset is not starting up after a re...
# gooddata-cn
p
Our etcd StatefulSet is not starting up after a recent cluster upgrade. The logs contain messages such as "Cluster not responding" and "cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs". The pods then enter the CrashLoopBackOff state and keep restarting. We tried restarting the StatefulSet with the command "kubectl rollout restart statefulset/gooddata-cn-etcd -n gooddata-cn", but this did not help. The issue prevents insights from rendering in the GoodData UI. I'm attaching the logs from one of the three failing pods. How can we get etcd back to a stable state?
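For anyone trying to reproduce this, the pod state and logs described above can be gathered with something like the sketch below. It is only a sketch: the namespace and StatefulSet name are taken from the restart command above, the pod name `gooddata-cn-etcd-0` is an assumption based on the usual StatefulSet ordinal naming, and the `collect_etcd_diagnostics` function and the `KUBECTL` override are hypothetical helpers added for illustration.

```shell
#!/bin/sh
# Sketch only. NS and STS come from the restart command in this thread;
# the "-0" pod suffix assumes standard StatefulSet ordinal naming.
KUBECTL="${KUBECTL:-kubectl}"   # hypothetical override, e.g. KUBECTL=echo for a dry run
NS="gooddata-cn"
STS="gooddata-cn-etcd"

collect_etcd_diagnostics() {
  # Desired vs. ready replica counts for the StatefulSet
  "$KUBECTL" get statefulset "$STS" -n "$NS"
  # Pod list with status (e.g. CrashLoopBackOff) and restart counts
  "$KUBECTL" get pods -n "$NS"
  # Events and last termination reason for the first replica
  "$KUBECTL" describe pod "$STS-0" -n "$NS"
  # Logs from the previous (crashed) container instance
  "$KUBECTL" logs "$STS-0" -n "$NS" --previous
}
```

Calling `collect_etcd_diagnostics` runs the four read-only commands in order; none of them changes cluster state. Setting `KUBECTL=echo` first prints the commands instead of running them.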
r
Hi Pete, Radek from the Technical Support team here. It's great to hear that you managed to work around the issue, but as you said, there may be a more refined way to go about this. I'll check and let you know!
Hi again! I've looked into this, and since the issue involves a third-party component bundled with GD.CN, I'm not able to give more specific advice on handling situations like this. However, I've submitted feedback to our engineering team on both your issue and the steps you took to resolve it, so that we can investigate how to make bundled components more resilient to similar situations in the future.
p
Thank you, Radek. This makes sense.