# gooddata-cn
b
Hi all, I am trying to upgrade to GD.CN 2.3.2 from GD.CN 2.2.0 in GCP. I am using Kubernetes 1.25 and the helm upgrade command throws this error:
Copy code
helm upgrade  --install -n gooddata-cn gooddata-cn cwan/gooddata-cn $VALUES

20:35:32  Error: UPGRADE FAILED: failed to create patch: unable to find api field in struct Probe for the json field "grpc" && failed to create patch: unable to find api field in struct Probe for the json field "grpc"
How do I proceed with the installation? Thanks much
f
Hi Balamurali, if you don’t mind, would you kindly confirm that you have followed the steps outlined in the GoodData.CN Update Guide? Did you use the command outlined there for the upgrade?
b
Hi Francisco, yes we have followed the GD update guide and in the past we have successfully upgraded to GD 1.7.2, 2.1.0, 2.1.1 and 2.2.0.
While upgrading from 2.2.0 to 2.3.2 we encountered this error for the very first time. Does something specific need to be done for the 2.3.2 upgrade?
f
Have you tried updating from v2.2.0 to v2.3.0 first, and then from v2.3.0 > v2.3.1 > v2.3.2? 2.3.2 is a patch release with only a few fixes, so it’s not possible to jump to it directly from an older minor version (2.2).
It’s explained in this part of the upgrade guide. It was good that you brought it up, because we are looking into improving it to make it a bit clearer! 🙂
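For example, something along these lines - just a rough sketch, assuming the cwan repo alias and the same $VALUES from your command above:
```
# upgrade through the intermediate releases one at a time
helm repo update
helm upgrade --install -n gooddata-cn gooddata-cn cwan/gooddata-cn --version 2.3.0 $VALUES
helm upgrade --install -n gooddata-cn gooddata-cn cwan/gooddata-cn --version 2.3.1 $VALUES
helm upgrade --install -n gooddata-cn gooddata-cn cwan/gooddata-cn --version 2.3.2 $VALUES
```
(Check the upgrade guide for each version in case there are extra steps before each hop.)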
b
Ah, that's a good point. Let me try updating to 2.3.0 first. Thanks for the pointer!
🙏 1
f
I noticed you also sent us an email about this, I’ll send you a reply there but I’m monitoring both things, so let me know how it goes!
b
Great! Thanks Francisco.
Hi Francisco, I am getting the same error even while updating from 2.2.0 to 2.3.0. I added more details to the support ticket.
r
The root cause is using
Copy code
livenessProbe:
  grpc:
    port: 6889
but grpc probes have only been available in the Kubernetes API since 1.23. Can you please check whether your cluster really supports this field?
Copy code
kubectl explain pod.spec.containers.livenessProbe.grpc
It should return a brief description from the JSON API schema downloaded from your cluster. If it returns something like
error: field "grpc" does not exist
then your cluster doesn't support this attribute.
Please share the output of `kubectl version -o yaml` (notably the `serverVersion` part) and also `helm version`.
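If it's easier, a one-liner like this (just a sketch, assuming you have jq installed) pulls out the server version alone:
```
kubectl version -o json | jq -r '.serverVersion.gitVersion'
```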
b
Thanks @Robert Moucha for your responses. Here are the relevant outputs:
Copy code
% kubectl explain pod.spec.containers.livenessProbe.grpc
KIND:     Pod
VERSION:  v1

RESOURCE: grpc <Object>

DESCRIPTION:
     GRPC specifies an action involving a GRPC port. This is a beta field and
     requires enabling GRPCContainerProbe feature gate.

FIELDS:
   port	<integer> -required-
     Port number of the gRPC service. Number must be in the range 1 to 65535.

   service	<string>
     Service is the name of the service to place in the gRPC HealthCheckRequest
     (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md).

     If this is not specified, the default behavior is defined by gRPC.
Copy code
% kubectl version -o yaml
clientVersion:
  buildDate: "2021-07-15T21:04:39Z"
  compiler: gc
  gitCommit: ca643a4d1f7bfe34773c74f79527be4afd95bf39
  gitTreeState: clean
  gitVersion: v1.21.3
  goVersion: go1.16.6
  major: "1"
  minor: "21"
  platform: darwin/amd64
serverVersion:
  buildDate: "2023-03-30T14:01:58Z"
  compiler: gc
  gitCommit: 4ac1389d4c3eefaabbd9fa31782fcbfd72e6e6e6
  gitTreeState: clean
  gitVersion: v1.25.8-gke.1000
  goVersion: go1.19.7 X:boringcrypto
  major: "1"
  minor: "25"
  platform: linux/amd64

WARNING: version difference between client (1.21) and server (1.25) exceeds the supported minor version skew of +/-1
Copy code
% helm version
version.BuildInfo{Version:"v3.10.2", GitCommit:"50f003e5ee8704ec937a756c646870227d7c8b58", GitTreeState:"clean", GoVersion:"go1.19.3"}
r
That's really strange - the cluster is really v1.25.8 and helm is also quite recent. Your `kubectl` is outdated - it may cause trouble when using it to access 1.25 clusters, but it is not the cause of your issue. The Kubernetes API reports it understands the `livenessProbe.grpc` objects (as expected, on 1.25).
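If you want to bring your kubectl in line with the cluster, one option - just a sketch for a darwin/amd64 workstation like the one in your output; adjust OS/arch as needed - is to grab a matching binary from the official release mirror:
```
curl -LO "https://dl.k8s.io/release/v1.25.8/bin/darwin/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/kubectl
```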
b
Thanks @Robert Moucha, do you have any suggestions on how to move forward with this?
r
I will try to reproduce your issue, because I don't see any obvious reason why the upgrade should fail. If you want to unblock your upgrade in the meantime, you can download and modify the helm chart until I come up with a better solution:
1. Download the chart using `helm pull gooddata/gooddata-cn --version 2.3.2 --untar -d gdcn-chart`
2. Edit the file `gdcn-chart/gooddata-cn/templates/pdf-stapler-service/deployment.yaml`
3. There are two conditions that check whether the Kubernetes minor version is greater than or equal to 24: `{{- if ge (int .Capabilities.KubeVersion.Minor) 24 }}`
4. Raise the value 24 to some higher number, like 30, so these two lines read: `{{- if ge (int .Capabilities.KubeVersion.Minor) 30 }}`
5. These changes bypass the offending grpc probe method and fall back to the older, yet still working, exec method.
6. Upgrade your helm release using this extracted chart:
Copy code
helm upgrade  --install -n gooddata-cn gooddata-cn ./gdcn-chart/gooddata-cn $VALUES
(`./gdcn-chart` is the directory where you extracted the helm chart). Sorry for the inconvenience, we'll need to investigate this problem more deeply.
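To make sure you catch every place where the chart does this version check, a quick grep over the extracted templates should list all occurrences:
```
grep -rn 'Capabilities.KubeVersion.Minor' gdcn-chart/gooddata-cn/templates/
```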
b
Thanks Robert. I see something similar in `charts/gooddata-cn-2.3.0/templates/tabular-exporter/deployment.yaml`. Do I change it there as well?
Actually, I updated tabular-exporter as well... but I am getting the same error related to grpc again. Is there another place where I have to raise the k8s minor version check to 30?
r
Good point, the `grpc:` method is also used in tabular-exporter. I didn't take this into account, because tabular-exporter didn't change between 2.2.0 and 2.3.0, so there's no reason why it should suddenly stop working. Can you please perform the following:
helm upgrade --install -n gooddata-cn gooddata-cn ./gdcn-chart/gooddata-cn --dry-run $VALUES > /tmp/manifests.yaml
Then examine the output file /tmp/manifests.yaml and search for `grpc:`. There will be one occurrence in the Secret `gooddata-cn-dex` - this is ok, it's part of the Dex config file. I'm particularly curious whether there are any other `grpc:` methods in `livenessProbe` and `readinessProbe` objects, and in which deployments.
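Something like this should make the search easier - it prints each match with a few lines of context above it, so you can tell which probe and deployment it belongs to:
```
grep -n -B 5 'grpc:' /tmp/manifests.yaml
```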
b
Hi Robert, yes, I see grpc-health-probe under both livenessProbe and readinessProbe
Copy code
containers:
        - name: pdf-stapler-service
          securityContext:
            runAsUser: 1000
          image: "gooddata/pdf-stapler-service:2.3.0"
          imagePullPolicy: Always
          ports:
            - name: grpc
              containerPort: 6889
              protocol: TCP
            - name: actuator
              containerPort: 8287
              protocol: TCP
          livenessProbe:
            exec:
              command: ["/usr/local/bin/grpc-health-probe", "-addr=:6889"]
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe:
            exec:
              command: ["/usr/local/bin/grpc-health-probe", "-addr=:6889"]
I see these under the `pdf-stapler-service` and `tabular-exporter` containers.
Could these be causing the installation errors?
r
Copy code
exec:
              command: ["/usr/local/bin/grpc-health-probe", "-addr=:6889"]
These probes are of the `exec:` type, not the `grpc:` type. So the `helm upgrade` is actually trying to set `exec` and not `grpc`. I really don't know where the `grpc` method comes from. Can you check the `pdf-stapler-service` Deployment directly on your cluster to see what probe method is currently set? Chances are that if there's already `grpc:` there, the generated patch fails to replace this method with `exec:` for some unknown reason, and the resulting object would have both exec and grpc methods set at once, which is disallowed by the k8s api. Do the following only in case these live deployments contain `grpc:` in some of the probe definitions. You may fix it by manually editing these deployments:
Copy code
kubectl -n gooddata-cn edit deployment gooddata-cn-pdf-stapler-service  # ( or gooddata-cn-tabular-exporter)
A text editor opens up with the deployment manifest. Find the `livenessProbe` and `readinessProbe` sections and delete them completely, including all nested fields. Save the file and exit the editor. The modified deployment will restart its pods. Then, retry the `helm upgrade`. Another option, in case the previous attempt fails, is to add `--force` to the helm upgrade command line. This flag makes the whole deployment get recreated by PUTting the full final manifest instead of simply PATCHing the supposed changes.
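For reference, a quick way to see which probe method is currently set on the live Deployment without opening the editor - a sketch assuming the application container is the first one in the pod spec - would be:
```
kubectl -n gooddata-cn get deployment gooddata-cn-pdf-stapler-service \
  -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}{"\n"}{.spec.template.spec.containers[0].readinessProbe}{"\n"}'
```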
@Balamurali Ananthan Did that help? Please let me know if you managed to upgrade gooddata-cn.
b
Will update you in a couple of hours @Robert Moucha
@Robert Moucha, I live-edited the deployments as you suggested, deleted the `livenessProbe` and `readinessProbe` sections, and saved them. The modified deployments restarted their pods and that worked like a charm. I have updated GoodData from 2.2.0 to 2.3.2 successfully in GCP. Thank you very much for all the detailed suggestions.
Consider writing and publishing a blog post on GoodData about this issue... 🙂 It might be helpful for other folks encountering the same problem.
r
I'm glad it helped! I've never faced a similar issue, so I don't know how common it actually is.
👍 1