# gooddata-cn
b
Hi all, I am trying to upgrade to GD.CN 2.3.2 from GD.CN 2.2.0 in GCP. I am using Kubernetes 1.25 and the helm upgrade command throws this error:
Copy code
helm upgrade  --install -n gooddata-cn gooddata-cn cwan/gooddata-cn $VALUES

20:35:32  Error: UPGRADE FAILED: failed to create patch: unable to find api field in struct Probe for the json field "grpc" && failed to create patch: unable to find api field in struct Probe for the json field "grpc"
How do I proceed with the installation? Thanks much
f
Hi Balamurali, if you don’t mind, would you kindly confirm that you have followed the steps outlined in the GoodData.CN Update Guide? Did you use the command outlined there for the upgrade?
b
Hi Francisco, yes we have followed the GD update guide and in the past we have successfully upgraded to GD 1.7.2, 2.1.0, 2.1.1 and 2.2.0.
While upgrading from 2.2.0 to 2.3.2 we encountered this error for the very first time. Does something specific need to be done for the 2.3.2 upgrade?
f
Have you tried updating from v2.2.0 to v2.3.0 first, and then from v2.3.0 > v2.3.1 > v2.3.2? 2.3.2 is a patch release with only a few fixes, so it’s not possible to jump to it directly from an older minor version (2.2).
It’s explained in this part of the upgrade guide. It was good that you brought it up, because we are looking into improving it to make it a bit clearer! 🙂
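For example, something along these lines - just a rough sketch, assuming the cwan repo alias and the same $VALUES from your command above:
```
# upgrade through the intermediate releases one at a time
helm repo update
helm upgrade --install -n gooddata-cn gooddata-cn cwan/gooddata-cn --version 2.3.0 $VALUES
helm upgrade --install -n gooddata-cn gooddata-cn cwan/gooddata-cn --version 2.3.1 $VALUES
helm upgrade --install -n gooddata-cn gooddata-cn cwan/gooddata-cn --version 2.3.2 $VALUES
```
(Check the upgrade guide for each version in case there are extra steps before each hop.)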
b
Ah, that's a good point. Let me try updating to 2.3.0 first. Thanks for the pointer!
🙏 1
f
I noticed you also sent us an email about this, I’ll send you a reply there but I’m monitoring both things, so let me know how it goes!
b
Great! Thanks Francisco.
Hi Francisco, I am getting the same error even while updating from 2.2.0 to 2.3.0. I added more details to the support ticket.
r
The root cause is using
Copy code
livenessProbe:
  grpc:
    port: 6889
but grpc probes have only been available in the Kubernetes API since 1.23. Can you please check whether your cluster really supports this field?
Copy code
kubectl explain pod.spec.containers.livenessProbe.grpc
It should return a brief description from the JSON API schema downloaded from your cluster. If it returns something like
error: field "grpc" does not exist
then your cluster doesn't support this attribute.
Please share the output of `kubectl version -o yaml` (notably the `serverVersion` part) and also `helm version`.
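If it's easier, a one-liner like this (just a sketch, assuming you have jq installed) pulls out the server version alone:
```
kubectl version -o json | jq -r '.serverVersion.gitVersion'
```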
b
Thanks @Robert Moucha for your responses. Here are the relevant outputs:
Copy code
% kubectl explain pod.spec.containers.livenessProbe.grpc
KIND:     Pod
VERSION:  v1

RESOURCE: grpc <Object>

DESCRIPTION:
     GRPC specifies an action involving a GRPC port. This is a beta field and
     requires enabling GRPCContainerProbe feature gate.

FIELDS:
   port	<integer> -required-
     Port number of the gRPC service. Number must be in the range 1 to 65535.

   service	<string>
     Service is the name of the service to place in the gRPC HealthCheckRequest
     (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md).

     If this is not specified, the default behavior is defined by gRPC.
Copy code
% kubectl version -o yaml
clientVersion:
  buildDate: "2021-07-15T21:04:39Z"
  compiler: gc
  gitCommit: ca643a4d1f7bfe34773c74f79527be4afd95bf39
  gitTreeState: clean
  gitVersion: v1.21.3
  goVersion: go1.16.6
  major: "1"
  minor: "21"
  platform: darwin/amd64
serverVersion:
  buildDate: "2023-03-30T14:01:58Z"
  compiler: gc
  gitCommit: 4ac1389d4c3eefaabbd9fa31782fcbfd72e6e6e6
  gitTreeState: clean
  gitVersion: v1.25.8-gke.1000
  goVersion: go1.19.7 X:boringcrypto
  major: "1"
  minor: "25"
  platform: linux/amd64

WARNING: version difference between client (1.21) and server (1.25) exceeds the supported minor version skew of +/-1
Copy code
% helm version
version.BuildInfo{Version:"v3.10.2", GitCommit:"50f003e5ee8704ec937a756c646870227d7c8b58", GitTreeState:"clean", GoVersion:"go1.19.3"}
r
That's really strange - the cluster is really v1.25.8 and helm is also quite recent. Your `kubectl` is outdated - it may cause trouble when using it to access 1.25 clusters, but it is not the cause of your issue. The Kubernetes API reports it understands the `livenessProbe.grpc` objects (as expected, on 1.25).
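If you want to bring your kubectl in line with the cluster, one option - just a sketch for a darwin/amd64 workstation like the one in your output; adjust OS/arch as needed - is to grab a matching binary from the official release mirror:
```
curl -LO "https://dl.k8s.io/release/v1.25.8/bin/darwin/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/kubectl
```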
b
Thanks @Robert Moucha, do you have any suggestions on how to move forward with this?
r
I will try to reproduce your issue, because I don't see any obvious reason why the upgrade should fail. If you want to unblock your upgrade in the meantime, you can download and modify the helm chart until I come up with a better solution:
1. Download the chart using `helm pull gooddata/gooddata-cn --version 2.3.2 --untar -d gdcn-chart`
2. Edit the file `gdcn-chart/gooddata-cn/templates/pdf-stapler-service/deployment.yaml`
3. There are two conditions that check whether the Kubernetes minor version is greater than or equal to 24: `{{- if ge (int .Capabilities.KubeVersion.Minor) 24 }}`
4. Raise the value 24 to some higher number, like 30, so these two lines read: `{{- if ge (int .Capabilities.KubeVersion.Minor) 30 }}`
5. These changes bypass the offending grpc probe method and fall back to the older, yet still working, exec method.
6. Upgrade your helm release using this extracted chart:
Copy code
helm upgrade  --install -n gooddata-cn gooddata-cn ./gdcn-chart/gooddata-cn $VALUES
(`./gdcn-chart` is the directory where you extracted the helm chart). Sorry for the inconvenience, we'll need to investigate this problem more deeply.
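To make sure you catch every place where the chart does this version check, a quick grep over the extracted templates should list all occurrences:
```
grep -rn 'Capabilities.KubeVersion.Minor' gdcn-chart/gooddata-cn/templates/
```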
b
Thanks Robert. I see something similar in `charts/gooddata-cn-2.3.0/templates/tabular-exporter/deployment.yaml`. Do I change it there as well?
Actually, I updated tabular-exporter as well... but I am getting the same error related to grpc again. Is there another place where I have to raise the k8s minor version check to 30?
r
Good point, the `grpc:` method is also used in tabular-exporter. I didn't take this into account, because tabular-exporter didn't change between 2.2.0 and 2.3.0, so there's no reason why it should suddenly stop working. Can you please perform the following:
helm upgrade --install -n gooddata-cn gooddata-cn ./gdcn-chart/gooddata-cn --dry-run $VALUES > /tmp/manifests.yaml
Then examine the output file /tmp/manifests.yaml and search for `grpc:`. There will be one occurrence in the Secret `gooddata-cn-dex` - this is ok, it's part of the Dex config file. I'm particularly curious whether there are any other `grpc:` methods in `livenessProbe` and `readinessProbe` objects, and in which deployments.
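Something like this should make the search easier - it prints each match with a few lines of context above it, so you can tell which probe and deployment it belongs to:
```
grep -n -B 5 'grpc:' /tmp/manifests.yaml
```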
b
Hi Robert, yes, I see grpc-health-probe under both livenessProbe and readinessProbe
Copy code
containers:
        - name: pdf-stapler-service
          securityContext:
            runAsUser: 1000
          image: "gooddata/pdf-stapler-service:2.3.0"
          imagePullPolicy: Always
          ports:
            - name: grpc
              containerPort: 6889
              protocol: TCP
            - name: actuator
              containerPort: 8287
              protocol: TCP
          livenessProbe:
            exec:
              command: ["/usr/local/bin/grpc-health-probe", "-addr=:6889"]
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe:
            exec:
              command: ["/usr/local/bin/grpc-health-probe", "-addr=:6889"]
I see these under the `pdf-stapler-service` and `tabular-exporter` containers.
Could these be causing the installation errors?
r
Copy code
exec:
              command: ["/usr/local/bin/grpc-health-probe", "-addr=:6889"]
These probes are of the `exec:` type, not the `grpc:` type. So the `helm upgrade` is actually trying to set `exec` and not `grpc`. I really don't know where the `grpc` method comes from. Can you check the `pdf-stapler-service` Deployment directly on your cluster to see what probe method is currently set? Chances are that if there's already `grpc:` there, the generated patch fails to replace this method with `exec:` for some unknown reason, and the resulting object would have both exec and grpc methods set at once, which is disallowed by the k8s api. Do the following only in case these live deployments contain `grpc:` in some of the probe definitions. You may fix it by manually editing these deployments:
Copy code
kubectl -n gooddata-cn edit deployment gooddata-cn-pdf-stapler-service  # ( or gooddata-cn-tabular-exporter)
A text editor opens up with the deployment manifest. Find the `livenessProbe` and `readinessProbe` sections and delete them completely, including all nested fields. Save the file and exit the editor. The modified deployment will restart its pods. Then, retry the `helm upgrade`. Another option, in case the previous attempt fails, is to add `--force` to the helm upgrade command line. This flag makes the whole deployment get recreated by PUTting the full final manifest instead of simply PATCHing the supposed changes.
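For reference, a quick way to see which probe method is currently set on the live Deployment without opening the editor - a sketch assuming the application container is the first one in the pod spec - would be:
```
kubectl -n gooddata-cn get deployment gooddata-cn-pdf-stapler-service \
  -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}{"\n"}{.spec.template.spec.containers[0].readinessProbe}{"\n"}'
```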
@Balamurali Ananthan Did that help? Please let me know if you managed to upgrade gooddata-cn.
b
Will update you in a couple of hours @Robert Moucha
@Robert Moucha, I live-edited the deployments as you suggested, deleted the `livenessProbe` and `readinessProbe` sections, and saved them. The modified deployments restarted their pods and that worked like a charm. I have updated GoodData from 2.2.0 to 2.3.2 successfully in GCP. Thank you very much for all the detailed suggestions.
Consider writing and publishing a blog post on GoodData about this issue... 🙂 It might be helpful for other folks encountering the same problem.
r
I'm glad it helped! I've never faced a similar issue, so I don't know how common it actually is.
👍 1