# gooddata-cn
p
Hello, we're currently on GoodData 3.21.0. We've noticed a cluster of error messages in the logs for the etcd component and would like to know the root cause and what we can do to resolve this issue, as it corresponds to an incident we encountered with etcd pods going down. The sequence of error messages is:
removed remote peer
lost TCP streaming connection with remote peer
stopped stream reader with remote peer
lost TCP streaming connection with remote peer
rejected stream from remote peer because it was removed
rejected stream from remote peer because it was removed
rejected stream from remote peer because it was removed
rejected stream from remote peer because it was removed
stopped TCP streaming connection with remote peer
closed TCP streaming connection with remote peer
stopping remote peer
stopped stream reader with remote peer
stopped HTTP pipelining with remote peer
stopped TCP streaming connection with remote peer
closed TCP streaming connection with remote peer
failed to process Raft message
These errors correspond to an etcd outage that brought down many GoodData pods. Please let us know what these errors mean and what we can do to mitigate this issue. Thank you 🙏
m
Hi Pete, our team will look into this for you and get back to you with more details as soon as possible.
p
@Michael Ullock can you please raise a support request for this? The ticket portal does not appear to be working: https://support.gooddata.com/hc/en-us/requests/new?ticket_form_id=582387
I'm attaching the etcd logs that show unhealthiness starting around 2:22AM and etcd becoming unavailable at 3:08AM, bringing down other GoodData services.
I'm attaching a graph illustrating the course of the incident in OpenSearch:
j
Hello, I just wanted to confirm that a ticket is now open for this issue. We have not noticed any problem with the Support portal (other cases were flowing in, and we tested the form), but in future feel free to simply send an email to support@gooddata.com, or give us a call if it is severity 1 for you. Thanks.