# gooddata-cn
d
Hi, I'm trying to install Kubernetes and GD on a single machine with a fresh installation of Ubuntu Server. I'm using Multipass to run the workers, but if there is a better alternative for running on a single machine, or a way to run GD with no additional servers, please let me know. Is there any beginner-friendly guide that walks me through the whole installation process and the configuration needed for both the OS and Kubernetes? I tried following some guides, but I'm constantly running into problems the guides don't cover... I need something that starts with a fresh OS installation, goes step by step, and actually works in the end. Thanks.
m
Hi Daniel, I will refer you to @Milan Sladký here. He is our deployment specialist.
m
Hi Daniel, we do not provide a guide on how to set up the OS and Kubernetes, as it is a fairly complicated and complex process. However, if you want to get Kubernetes running easily on a single machine, you can go with https://k3d.io/. You just need Docker.
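For illustration, a minimal k3d setup along these lines might look as follows; the cluster name and agent count are just examples, not an official GoodData recommendation:
Copy code
# Assumes Docker and k3d are already installed (see https://k3d.io)
# Create a small cluster named "gooddata" with one server and three agents
k3d cluster create gooddata --agents 3

# Verify that kubectl talks to the new cluster and the nodes are Ready
kubectl get nodes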
d
Hi, I managed to get k3d running, but now when I install GD.CN I get an error without enough details to understand why it's failing.
Copy code
chylek@ubuntu:~$ helm install --version 1.4.0 --namespace gooddata-cn --wait --debug -f customized-values-gooddata-cn.yaml gooddata-cn gooddata/gooddata-cn
install.go:178: [debug] Original chart version: "1.4.0"
install.go:199: [debug] CHART PATH: /home/chylek/.cache/helm/repository/gooddata-cn-1.4.0.tgz

client.go:128: [debug] creating 1 resource(s)
client.go:128: [debug] creating 1 resource(s)
install.go:165: [debug] Clearing discovery cache
wait.go:48: [debug] beginning wait for 2 resources with timeout of 1m0s
W1028 22:30:06.465847   19267 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
client.go:299: [debug] Starting delete for "gooddata-cn-create-namespace" Job
client.go:328: [debug] jobs.batch "gooddata-cn-create-namespace" not found
client.go:128: [debug] creating 1 resource(s)
client.go:528: [debug] Watching for changes to Job gooddata-cn-create-namespace with timeout of 5m0s
client.go:556: [debug] Add/Modify event for gooddata-cn-create-namespace: ADDED
client.go:595: [debug] gooddata-cn-create-namespace: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:556: [debug] Add/Modify event for gooddata-cn-create-namespace: MODIFIED
client.go:595: [debug] gooddata-cn-create-namespace: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:88: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.2.1/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.2.1/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.2.1/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:87
runtime.main
	runtime/proc.go:225
runtime.goexit
	runtime/asm_amd64.s:1371
chylek@ubuntu:~$ kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
k3d-gooddata-agent-0    Ready    <none>                 59m   v1.21.5+k3s2
k3d-gooddata-agent-2    Ready    <none>                 59m   v1.21.5+k3s2
k3d-gooddata-agent-1    Ready    <none>                 59m   v1.21.5+k3s2
k3d-gooddata-server-0   Ready    control-plane,master   59m   v1.21.5+k3s2
It's very likely I missed some part of the installation process, but even with --debug it doesn't say what it's waiting for, so I don't know what's missing.
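One way to see what the pre-install hook is actually stuck on is to inspect the hook Job and its pod while the install is still waiting; a generic kubectl sketch, using the job name from the log above:
Copy code
# List the hook job and its pods in the release namespace
kubectl -n gooddata-cn get jobs,pods

# The Events section usually explains why the job never completed
kubectl -n gooddata-cn describe job gooddata-cn-create-namespace

# Logs of the pod backing the job (pod name will differ)
kubectl -n gooddata-cn logs job/gooddata-cn-create-namespace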
a
@Milan Sladký Could you please consult with Daniel?
m
d
I installed Pulsar with these values, with the storageClass set according to:
Copy code
~$ kubectl get storageclass
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  4d10h
I tried reinstalling it just in case:
Copy code
chylek@ubuntu:~$ helm upgrade --install --namespace pulsar --version 2.7.2     -f customized-values-pulsar.yaml --set initialize=true     pulsar apache/pulsar
W1102 08:22:35.920845   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.922540   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.928066   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.929502   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.931054   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.933140   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.934550   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.935975   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.938279   21249 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1102 08:22:35.961397   21249 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
W1102 08:22:35.962992   21249 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
W1102 08:22:35.964974   21249 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
Release "pulsar" has been upgraded. Happy Helming!
NAME: pulsar
LAST DEPLOYED: Tue Nov  2 08:22:35 2021
NAMESPACE: pulsar
STATUS: deployed
REVISION: 2
TEST SUITE: None
If I revert to a snapshot before installing GD, helm says pulsar was deployed:
Copy code
chylek@ubuntu:~$ helm list --all-namespaces
NAME         	NAMESPACE    	REVISION	UPDATED                                	STATUS  	CHART              	APP VERSION
ingress-nginx	ingress-nginx	1       	2021-10-28 22:24:57.604897327 +0000 UTC	deployed	ingress-nginx-4.0.6	1.0.4      
pulsar       	pulsar       	1       	2021-10-28 22:27:47.400520567 +0000 UTC	deployed	pulsar-2.7.2       	2.7.2      
traefik      	kube-system  	1       	2021-10-28 21:44:02.503732749 +0000 UTC	deployed	traefik-9.18.2     	2.4.8      
traefik-crd  	kube-system  	1       	2021-10-28 21:44:01.540122164 +0000 UTC	deployed	traefik-crd-9.18.2
m
OK, can you please send the output of kubectl get pods -A here?
d
Copy code
chylek@ubuntu:~$ kubectl get pods -A
NAMESPACE       NAME                                        READY   STATUS      RESTARTS   AGE
kube-system     helm-install-traefik-crd-6m762              0/1     Completed   0          4d14h
kube-system     helm-install-traefik-ltdsg                  0/1     Completed   1          4d14h
ingress-nginx   svclb-ingress-nginx-controller-99jmv        0/2     Pending     0          4d14h
ingress-nginx   svclb-ingress-nginx-controller-pphnj        0/2     Pending     0          4d14h
ingress-nginx   svclb-ingress-nginx-controller-vdr6q        0/2     Pending     0          4d14h
ingress-nginx   svclb-ingress-nginx-controller-r9lmt        0/2     Pending     0          4d14h
pulsar          pulsar-bookie-init-hkjm4                    0/1     Completed   0          4d14h
pulsar          pulsar-pulsar-init-sqmwt                    0/1     Completed   0          4d14h
kube-system     svclb-traefik-vqd9b                         2/2     Running     4          4d14h
kube-system     local-path-provisioner-5ff76fc89d-grqwm     1/1     Running     6          4d14h
kube-system     svclb-traefik-vddkf                         2/2     Running     4          4d14h
kube-system     svclb-traefik-f6bm7                         2/2     Running     4          4d14h
kube-system     metrics-server-86cbb8457f-bsgc2             1/1     Running     2          4d14h
kube-system     coredns-7448499f4d-6p875                    1/1     Running     2          4d14h
kube-system     svclb-traefik-9jl5x                         2/2     Running     4          4d14h
ingress-nginx   ingress-nginx-controller-5c8d66c76d-zdld5   1/1     Running     1          4d14h
pulsar          pulsar-zookeeper-2                          1/1     Running     1          3h37m
kube-system     traefik-97b44b794-rjwl6                     1/1     Running     2          4d14h
pulsar          pulsar-recovery-0                           1/1     Running     1          4d14h
ingress-nginx   ingress-nginx-controller-5c8d66c76d-4qckr   1/1     Running     1          4d14h
pulsar          pulsar-zookeeper-1                          1/1     Running     1          3h37m
pulsar          pulsar-zookeeper-0                          1/1     Running     1          4d14h
pulsar          pulsar-bookie-1                             1/1     Running     1          4d14h
pulsar          pulsar-bookie-0                             1/1     Running     1          4d14h
pulsar          pulsar-broker-0                             1/1     Running     1          4d14h
pulsar          pulsar-broker-1                             1/1     Running     1          4d14h
pulsar          pulsar-bookie-2                             1/1     Running     1          4d14h
This is again from the snapshot before installing GD, I will try installing it again and post if anything has changed.
Latest installation attempt:
Copy code
chylek@ubuntu:~$ helm install --version 1.4.0 --namespace gooddata-cn --wait \
>  --debug  -f customized-values-gooddata-cn.yaml gooddata-cn gooddata/gooddata-cn
install.go:178: [debug] Original chart version: "1.4.0"
install.go:199: [debug] CHART PATH: /home/chylek/.cache/helm/repository/gooddata-cn-1.4.0.tgz

client.go:128: [debug] creating 1 resource(s)
client.go:128: [debug] creating 1 resource(s)
install.go:165: [debug] Clearing discovery cache
wait.go:48: [debug] beginning wait for 2 resources with timeout of 1m0s
W1102 12:34:33.481433   19089 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
client.go:299: [debug] Starting delete for "gooddata-cn-create-namespace" Job
client.go:328: [debug] jobs.batch "gooddata-cn-create-namespace" not found
client.go:128: [debug] creating 1 resource(s)
client.go:528: [debug] Watching for changes to Job gooddata-cn-create-namespace with timeout of 5m0s
client.go:556: [debug] Add/Modify event for gooddata-cn-create-namespace: ADDED
client.go:595: [debug] gooddata-cn-create-namespace: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:556: [debug] Add/Modify event for gooddata-cn-create-namespace: MODIFIED
client.go:595: [debug] gooddata-cn-create-namespace: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:556: [debug] Add/Modify event for gooddata-cn-create-namespace: MODIFIED
client.go:299: [debug] Starting delete for "gooddata-cn-create-namespace" Job
client.go:128: [debug] creating 62 resource(s)
W1102 12:34:43.123427   19089 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
wait.go:48: [debug] beginning wait for 62 resources with timeout of 5m0s
ready.go:277: [debug] Deployment is not ready: gooddata-cn/gooddata-cn-db-pgpool. 0 out of 2 expected pods are ready

(the last line repeated many times)

I1102 12:38:13.249341   19089 request.go:665] Waited for 10.126477116s due to client-side throttling, not priority and fairness, request: GET:https://0.0.0.0:33355/api/v1/namespaces/gooddata-cn/services/gooddata-cn-metadata-api
ready.go:277: [debug] Deployment is not ready: gooddata-cn/gooddata-cn-db-pgpool. 0 out of 2 expected pods are ready

(more repeats)

Error: INSTALLATION FAILED: timed out waiting for the condition
helm.go:88: [debug] timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.2.1/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.2.1/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.2.1/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:87
runtime.main
	runtime/proc.go:225
runtime.goexit
	runtime/asm_amd64.s:1371
Copy code
chylek@ubuntu:~$ kubectl get pods -A
NAMESPACE       NAME                                                   READY   STATUS                  RESTARTS   AGE
kube-system     helm-install-traefik-crd-6m762                         0/1     Completed               0          4d15h
kube-system     helm-install-traefik-ltdsg                             0/1     Completed               1          4d15h
ingress-nginx   svclb-ingress-nginx-controller-99jmv                   0/2     Pending                 0          4d14h
ingress-nginx   svclb-ingress-nginx-controller-pphnj                   0/2     Pending                 0          4d14h
ingress-nginx   svclb-ingress-nginx-controller-vdr6q                   0/2     Pending                 0          4d14h
ingress-nginx   svclb-ingress-nginx-controller-r9lmt                   0/2     Pending                 0          4d14h
pulsar          pulsar-bookie-init-hkjm4                               0/1     Completed               0          4d14h
pulsar          pulsar-pulsar-init-sqmwt                               0/1     Completed               0          4d14h
kube-system     svclb-traefik-vqd9b                                    2/2     Running                 4          4d15h
kube-system     local-path-provisioner-5ff76fc89d-grqwm                1/1     Running                 6          4d15h
kube-system     svclb-traefik-vddkf                                    2/2     Running                 4          4d15h
kube-system     svclb-traefik-f6bm7                                    2/2     Running                 4          4d15h
kube-system     metrics-server-86cbb8457f-bsgc2                        1/1     Running                 2          4d15h
kube-system     coredns-7448499f4d-6p875                               1/1     Running                 2          4d15h
kube-system     svclb-traefik-9jl5x                                    2/2     Running                 4          4d15h
ingress-nginx   ingress-nginx-controller-5c8d66c76d-zdld5              1/1     Running                 1          4d14h
pulsar          pulsar-zookeeper-2                                     1/1     Running                 1          3h54m
kube-system     traefik-97b44b794-rjwl6                                1/1     Running                 2          4d15h
pulsar          pulsar-recovery-0                                      1/1     Running                 1          4d14h
ingress-nginx   ingress-nginx-controller-5c8d66c76d-4qckr              1/1     Running                 1          4d14h
pulsar          pulsar-zookeeper-1                                     1/1     Running                 1          3h55m
pulsar          pulsar-zookeeper-0                                     1/1     Running                 1          4d14h
pulsar          pulsar-bookie-1                                        1/1     Running                 1          4d14h
pulsar          pulsar-bookie-0                                        1/1     Running                 1          4d14h
pulsar          pulsar-broker-0                                        1/1     Running                 1          4d14h
pulsar          pulsar-broker-1                                        1/1     Running                 1          4d14h
pulsar          pulsar-bookie-2                                        1/1     Running                 1          4d14h
gooddata-cn     gooddata-cn-result-cache-85558b84fb-rjws8              0/1     ContainerCreating       0          15m
gooddata-cn     gooddata-cn-metadata-api-7b94f9778d-2w66d              0/1     Init:ImagePullBackOff   0          15m
gooddata-cn     gooddata-cn-metadata-api-7b94f9778d-gn8w8              0/1     Init:ImagePullBackOff   0          15m
gooddata-cn     gooddata-cn-sql-executor-69fd9f559f-42g6n              0/1     Init:ImagePullBackOff   0          15m
gooddata-cn     gooddata-cn-scan-model-6dc5cfb9dc-b8tk7                0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-dex-5c985fbf98-hxt9c                       0/1     Init:ImagePullBackOff   0          15m
gooddata-cn     gooddata-cn-scan-model-6dc5cfb9dc-v4pph                0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-auth-service-6f487cfbff-qgngc              0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-sql-executor-69fd9f559f-nxqms              0/1     Init:ImagePullBackOff   0          15m
gooddata-cn     gooddata-cn-dex-5c985fbf98-kb6m6                       0/1     Init:ImagePullBackOff   0          15m
gooddata-cn     gooddata-cn-afm-exec-api-f4dc5dbfd-78xkh               0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-measure-editor-7c4f79d5d-47bpj             1/1     Running                 0          15m
gooddata-cn     gooddata-cn-measure-editor-7c4f79d5d-qk4tn             1/1     Running                 0          15m
gooddata-cn     gooddata-cn-auth-service-6f487cfbff-j4mfs              0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-redis-ha-server-0                          3/3     Running                 0          15m
gooddata-cn     gooddata-cn-afm-exec-api-f4dc5dbfd-8gmxr               0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-apidocs-67686f7694-st76j                   1/1     Running                 0          15m
gooddata-cn     gooddata-cn-ldm-modeler-5b778b998d-ckh8m               1/1     Running                 0          15m
gooddata-cn     gooddata-cn-dashboards-57d48bbc84-292b6                1/1     Running                 0          15m
gooddata-cn     gooddata-cn-dashboards-57d48bbc84-zscdl                1/1     Running                 0          15m
gooddata-cn     gooddata-cn-ldm-modeler-5b778b998d-h4dhv               1/1     Running                 0          15m
gooddata-cn     gooddata-cn-home-ui-759fcf7d4-5g868                    1/1     Running                 0          15m
gooddata-cn     gooddata-cn-home-ui-759fcf7d4-g2l92                    1/1     Running                 0          15m
gooddata-cn     gooddata-cn-aqe-5d9b68f586-kk2t7                       1/1     Running                 0          15m
gooddata-cn     gooddata-cn-analytical-designer-54646887df-s88n8       1/1     Running                 0          15m
gooddata-cn     gooddata-cn-apidocs-67686f7694-dkqlb                   1/1     Running                 0          15m
gooddata-cn     gooddata-cn-result-cache-85558b84fb-hpczg              0/1     ImagePullBackOff        0          15m
gooddata-cn     gooddata-cn-redis-ha-server-1                          3/3     Running                 0          9m39s
gooddata-cn     gooddata-cn-db-postgresql-1                            0/2     PodInitializing         0          15m
gooddata-cn     gooddata-cn-db-postgresql-0                            0/2     PodInitializing         0          15m
gooddata-cn     gooddata-cn-organization-controller-67c7d99d55-8znr9   1/1     Running                 0          15m
gooddata-cn     gooddata-cn-organization-controller-67c7d99d55-ktn8l   1/1     Running                 0          15m
gooddata-cn     gooddata-cn-analytical-designer-54646887df-5d9pl       1/1     Running                 0          15m
gooddata-cn     gooddata-cn-redis-ha-server-2                          3/3     Running                 0          5m49s
gooddata-cn     gooddata-cn-aqe-5d9b68f586-brtnc                       1/1     Running                 0          15m
gooddata-cn     gooddata-cn-tools-7dd9c565d9-7srq2                     1/1     Running                 0          15m
gooddata-cn     gooddata-cn-db-pgpool-84fc646558-jzshj                 0/1     Running                 4          15m
gooddata-cn     gooddata-cn-db-pgpool-84fc646558-dnsnj                 1/1     Running                 4          15m
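The ImagePullBackOff pods above are what keeps the release from becoming ready; the underlying pull error can be read from the pod events with plain kubectl (the pod name is just one example taken from the listing):
Copy code
# The Events section at the end shows the exact image pull failure
kubectl -n gooddata-cn describe pod gooddata-cn-metadata-api-7b94f9778d-2w66d

# Or list all recent events in the namespace, oldest first
kubectl -n gooddata-cn get events --sort-by=.metadata.creationTimestamp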
r
@Daniel Chýlek I prepared a simple script that should install gooddata.cn in k3d: https://github.com/mouchar/gooddata-cn-tools/tree/master/k3d
👏 3
Please let me know if it works for you. It's been developed and tested on Ubuntu, but it should work on other Linux distros as well. Not sure about macOS or Windows/WSL.
d
Thank you. It appears that in k3d 5.0 there is no --k3s-server-arg. Changing it to --k3s-arg gave me another error:
Copy code
FATA[0000] K3sExtraArg '--no-deploy=traefik' lacks a node filter, but there's more than one node
I ended up removing the argument altogether, which is probably a bad idea. (*) The script ended with
Copy code
Running: helm -n gooddata upgrade --install gooddata-cn --wait --timeout 7m --values /tmp/values-gooddata-cn.yaml --version 1.4.0 gooddata/gooddata-cn
Release "gooddata-cn" does not exist. Installing it now.
W1102 19:33:54.366190   93421 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
W1102 19:34:04.959222   93421 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
Error: context deadline exceeded
It seems most services are running, but both :80 and :443 respond with a 404 error.
Copy code
chylek@ubuntu:~$ kubectl get pods -A
NAMESPACE      NAME                                                  READY   STATUS              RESTARTS   AGE
kube-system    helm-install-traefik-crd-wjwd4                        0/1     Completed           0          72m
kube-system    helm-install-cert-manager-9c8zw                       0/1     Completed           0          72m
kube-system    helm-install-traefik-2lln4                            0/1     Completed           0          72m
kube-system    svclb-ingress-nginx-controller-r8tq2                  0/2     Pending             0          68m
kube-system    svclb-ingress-nginx-controller-qb9hh                  0/2     Pending             0          68m
kube-system    svclb-ingress-nginx-controller-z2k4r                  0/2     Pending             0          68m
kube-system    helm-install-ingress-nginx-qggz7                      0/1     Completed           0          72m
pulsar         pulsar-bookie-init-jt59f                              0/1     Completed           0          33m
pulsar         pulsar-pulsar-init-scbsw                              0/1     Completed           0          33m
gooddata       gooddata-cn-apidocs-594d6dbf44-h2dp6                  1/1     Running             1          26m
kube-system    svclb-traefik-9tw5p                                   2/2     Running             2          69m
kube-system    ingress-nginx-controller-6d64b8fb47-wm2jq             1/1     Running             1          68m
kube-system    metrics-server-86cbb8457f-k7cmk                       1/1     Running             2          72m
pulsar         pulsar-zookeeper-2                                    1/1     Running             1          27m
gooddata       gooddata-cn-redis-ha-server-0                         3/3     Running             3          26m
gooddata       gooddata-cn-auth-service-8654cbff5d-t54xj             1/1     Running             1          26m
gooddata       gooddata-cn-db-postgresql-0                           2/2     Running             2          26m
gooddata       gooddata-cn-tools-67fbf64b9f-qltt5                    1/1     Running             1          25m
gooddata       gooddata-cn-redis-ha-server-2                         3/3     Running             3          22m
gooddata       gooddata-cn-redis-ha-server-1                         3/3     Running             3          23m
pulsar         pulsar-zookeeper-1                                    1/1     Running             1          28m
gooddata       gooddata-cn-afm-exec-api-7bf559b487-4x2rv             1/1     Running             1          26m
gooddata       gooddata-cn-dex-647658ccd-rzbs5                       1/1     Running             1          26m
gooddata       gooddata-cn-dex-647658ccd-25bmm                       1/1     Running             1          26m
pulsar         pulsar-zookeeper-0                                    1/1     Running             1          33m
gooddata       gooddata-cn-analytical-designer-794497c4f-4xrbj       1/1     Running             1          25m
pulsar         pulsar-recovery-0                                     1/1     Running             1          33m
cert-manager   cert-manager-cainjector-86bc6dc648-tvjjx              1/1     Running             3          69m
gooddata       gooddata-cn-dashboards-84c85df6db-zkfvq               1/1     Running             1          26m
kube-system    coredns-7448499f4d-c6ssc                              1/1     Running             1          72m
gooddata       gooddata-cn-ldm-modeler-59ddbdc696-d6hkf              1/1     Running             1          26m
cert-manager   cert-manager-bf6c77cbc-svcgn                          1/1     Running             1          69m
gooddata       gooddata-cn-apidocs-594d6dbf44-btvlj                  1/1     Running             1          26m
kube-system    svclb-traefik-2lmks                                   2/2     Running             2          69m
gooddata       gooddata-cn-aqe-84998b8596-bxlnk                      1/1     Running             1          26m
gooddata       gooddata-cn-measure-editor-595d576df4-gtccc           1/1     Running             1          26m
gooddata       gooddata-cn-home-ui-68d6766f84-85hgn                  1/1     Running             1          26m
kube-system    local-path-provisioner-5ff76fc89d-d6k7v               1/1     Running             3          72m
gooddata       gooddata-cn-scan-model-58878777bd-x2hxs               1/1     Running             1          26m
gooddata       gooddata-cn-organization-controller-fcd74d885-d7b87   1/1     Running             1          26m
cert-manager   cert-manager-webhook-78b6f5dfcc-xs66l                 1/1     Running             1          69m
kube-system    svclb-traefik-slxtx                                   2/2     Running             2          69m
kube-system    traefik-97b44b794-bmzgd                               1/1     Running             1          69m
gooddata       gooddata-cn-result-cache-fc48f7bf8-lkfcw              1/1     Running             1          26m
gooddata       gooddata-cn-db-postgresql-1                           2/2     Running             2          26m
gooddata       gooddata-cn-db-pgpool-c8c4b9878-d89pd                 1/1     Running             3          26m
gooddata       gooddata-cn-db-pgpool-c8c4b9878-79jc9                 1/1     Running             2          26m
gooddata       gooddata-cn-sql-executor-7d58bc465c-cwd5f             1/1     Running             1          26m
gooddata       gooddata-cn-metadata-api-6f7fdf5557-9zdj2             1/1     Running             1          26m
pulsar         pulsar-bookie-2                                       0/1     CrashLoopBackOff    6          33m
pulsar         pulsar-bookie-1                                       0/1     CrashLoopBackOff    6          33m
pulsar         pulsar-bookie-0                                       0/1     CrashLoopBackOff    6          33m
pulsar         pulsar-broker-1                                       1/1     Running             3          33m
pulsar         pulsar-broker-0                                       1/1     Running             3          33m
gooddata       gooddata-cn-cache-gc-27264720-v84m2                   0/1     ContainerCreating   0          6s
(*) EDIT: I'm looking into it now that I have more time, trying to figure out the correct new format. EDIT2: I cannot find good documentation on the new format; the official usage docs aren't even consistent... I tried a few things and they didn't work, so I'm downgrading k3d to 4.x and trying again.
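For reference, the k3d v5 syntax appears to require a node filter on --k3s-arg; keeping the script's original k3s flag, the equivalent is likely something close to the following (unverified here, based on the error message above):
Copy code
# k3d v4 style used by the script:
#   --k3s-server-arg '--no-deploy=traefik'
# likely k3d v5 equivalent, with an explicit node filter
# (newer k3s versions prefer --disable=traefik over --no-deploy=traefik):
k3d cluster create default --agents 3 --k3s-arg '--no-deploy=traefik@server:*'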
After downgrading and reinstalling from scratch, now even Pulsar is failing with the timeout message 😕 I guess I'll run the script again; maybe something is just taking too long. Looks like my computer ran out of disk space, too bad there's not a better error message for that :D
Unfortunately downgrading k3d to 4.x also didn't help, still getting a context deadline error:
Copy code
Running: helm -n pulsar upgrade --wait --timeout 7m --install pulsar --values /tmp/values-pulsar-k3d.yaml --set initialize=true --version 2.7.2 https://github.com/apache/pulsar-helm-chart/releases/download/pulsar-2.7.2/pulsar-2.7.2.tgz
Release "pulsar" does not exist. Installing it now.
W1103 07:57:27.103143  188603 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1103 07:57:27.108070  188603 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1103 07:57:27.111029  188603 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1103 07:57:27.128958  188603 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
W1103 07:57:27.197557  188603 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1103 07:57:27.201892  188603 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1103 07:57:27.203011  188603 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W1103 07:57:27.247392  188603 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
NAME: pulsar
LAST DEPLOYED: Wed Nov  3 07:57:25 2021
NAMESPACE: pulsar
STATUS: deployed
REVISION: 1
TEST SUITE: None
Running: helm -n gooddata upgrade --install gooddata-cn --wait --timeout 7m --values /tmp/values-gooddata-cn.yaml --version 1.4.0 gooddata/gooddata-cn
Release "gooddata-cn" does not exist. Installing it now.
W1103 08:01:33.318907  204576 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
W1103 08:01:46.093635  204576 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
Error: rate: Wait(n=1) would exceed context deadline
Copy code
chylek@ubuntu:~/gd/k3d$ kubectl get pods -A
NAMESPACE      NAME                                                  READY   STATUS             RESTARTS   AGE
kube-system    local-path-provisioner-5ff76fc89d-n4z5g               1/1     Running            0          28m
kube-system    metrics-server-86cbb8457f-prjqm                       1/1     Running            0          28m
kube-system    coredns-7448499f4d-zxdrw                              1/1     Running            0          28m
kube-system    helm-install-cert-manager-bgh48                       0/1     Completed          0          28m
cert-manager   cert-manager-webhook-78b6f5dfcc-crwv2                 1/1     Running            0          27m
cert-manager   cert-manager-bf6c77cbc-wg4fx                          1/1     Running            0          27m
kube-system    helm-install-ingress-nginx-7wbwh                      0/1     Completed          0          28m
kube-system    svclb-ingress-nginx-controller-57544                  2/2     Running            0          27m
kube-system    svclb-ingress-nginx-controller-mbjvc                  2/2     Running            0          27m
kube-system    svclb-ingress-nginx-controller-jbgkn                  2/2     Running            0          27m
kube-system    ingress-nginx-controller-6d64b8fb47-t9bs7             1/1     Running            0          27m
pulsar         pulsar-zookeeper-0                                    1/1     Running            0          27m
pulsar         pulsar-zookeeper-1                                    1/1     Running            0          25m
pulsar         pulsar-zookeeper-2                                    1/1     Running            0          24m
pulsar         pulsar-bookie-init-xfww2                              0/1     Completed          0          27m
pulsar         pulsar-recovery-0                                     1/1     Running            0          27m
pulsar         pulsar-pulsar-init-9p695                              0/1     Completed          0          27m
pulsar         pulsar-bookie-1                                       1/1     Running            0          27m
pulsar         pulsar-bookie-0                                       1/1     Running            0          27m
pulsar         pulsar-bookie-2                                       1/1     Running            0          27m
pulsar         pulsar-broker-0                                       1/1     Running            0          27m
pulsar         pulsar-broker-1                                       1/1     Running            0          27m
gooddata       gooddata-cn-measure-editor-595d576df4-rkwhf           1/1     Running            0          22m
gooddata       gooddata-cn-ldm-modeler-59ddbdc696-knmpn              1/1     Running            0          22m
gooddata       gooddata-cn-apidocs-594d6dbf44-4zjdj                  1/1     Running            0          22m
cert-manager   cert-manager-cainjector-86bc6dc648-pqx9g              1/1     Running            1          27m
gooddata       gooddata-cn-tools-67fbf64b9f-gxjzn                    1/1     Running            0          22m
gooddata       gooddata-cn-metadata-api-6f7fdf5557-6gr7n             0/1     Init:0/1           0          22m
gooddata       gooddata-cn-home-ui-68d6766f84-lct57                  1/1     Running            0          22m
gooddata       gooddata-cn-dashboards-84c85df6db-5fw96               1/1     Running            0          22m
gooddata       gooddata-cn-aqe-84998b8596-7hfc6                      1/1     Running            0          22m
gooddata       gooddata-cn-analytical-designer-794497c4f-p5sgn       1/1     Running            0          22m
gooddata       gooddata-cn-apidocs-594d6dbf44-24fh8                  1/1     Running            0          22m
gooddata       gooddata-cn-organization-controller-fcd74d885-rgwmg   1/1     Running            0          22m
gooddata       gooddata-cn-sql-executor-7d58bc465c-hpswt             0/1     Init:0/1           0          22m
gooddata       gooddata-cn-dex-647658ccd-rcch2                       0/1     Init:0/1           0          22m
gooddata       gooddata-cn-dex-647658ccd-vxprl                       0/1     Init:0/1           0          22m
gooddata       gooddata-cn-result-cache-fc48f7bf8-7d7h4              1/1     Running            0          22m
gooddata       gooddata-cn-auth-service-8654cbff5d-ct57l             1/1     Running            0          22m
gooddata       gooddata-cn-afm-exec-api-7bf559b487-r2lgx             1/1     Running            0          22m
gooddata       gooddata-cn-scan-model-58878777bd-ww2v2               1/1     Running            0          22m
gooddata       gooddata-cn-db-postgresql-0                           1/2     CrashLoopBackOff   8          22m
gooddata       gooddata-cn-db-pgpool-c8c4b9878-nb2h5                 0/1     CrashLoopBackOff   9          22m
gooddata       gooddata-cn-db-pgpool-c8c4b9878-gkkdl                 0/1     CrashLoopBackOff   9          22m
gooddata       gooddata-cn-redis-ha-server-0                         0/3     Init:Error         9          22m
gooddata       gooddata-cn-db-postgresql-1                           1/2     CrashLoopBackOff   9          22m
r
Can you please share information about the HW where you're running the script? I'm mainly interested in the number of CPU cores and the memory size.
The CrashLoopBackOff and Init:Error from pods with volumes suggest there is some issue with data persistence.
d
I'm running Ubuntu in VirtualBox, with 16 GB RAM, 6 cores, and 150 GB disk. I can give it up to 48 GB RAM if needed, but I'm maxed out on cores and the disk usage is reported at 42 GB so there should not be an issue there.
r
Well, this should be enough. Maybe there were some leftovers from the previous attempt with k3d 5.x. Please clean up the environment:
Copy code
docker network disconnect k3d-default k3d-registry
k3d cluster delete default
docker rm -f k3d-registry
docker volume rm registry-data
docker system prune -a -f --volumes
And then run the script again. It will preserve the CA cert, but the rest will be recreated. I can try it locally with VBox; I haven't tested it that way yet.
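If it helps, a quick sanity check that the environment really is clean before re-running the script (generic k3d and Docker commands):
Copy code
# No clusters should be listed
k3d cluster list

# No leftover k3d containers, volumes, or networks
docker ps -a --filter name=k3d
docker volume ls
docker network ls --filter name=k3d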
What Ubuntu version are you using? And how do you start it - directly in VBox, or using multipass?
d
I'll try, thanks. I start it directly from VBox, using Ubuntu 20.04.3 LTS. The host machine is an Intel Mac.
Various things keep timing out; once it was Pulsar, once it was cert-manager... is there any info on what exactly is timing out?
Copy code
Waiting for cert-manager to come up
deployment.apps/cert-manager condition met
timed out waiting for the condition on deployments/cert-manager-webhook
timed out waiting for the condition on deployments/cert-manager-cainjector
r
The script tests all the Deployments that are part of the cert-manager helm chart. It's really surprising that it times out at such an early stage.
This is what the script does:
Copy code
kubectl -n cert-manager wait deployment --for=condition=available --selector=app.kubernetes.io/instance=cert-manager
It's basically a script-friendly version of kubectl -n cert-manager get deployment. The default timeout is 30s, which should be enough for such a simple application as cert-manager.
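On a slow VM it may simply need more time; kubectl wait accepts a longer timeout, so a possible local tweak of that check would be (300s is an arbitrary example):
Copy code
kubectl -n cert-manager wait deployment \
  --for=condition=available \
  --selector=app.kubernetes.io/instance=cert-manager \
  --timeout=300s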
d
If I run kubectl get pods -A, it shows that cert-manager-webhook and cert-manager-cainjector are both running, so I don't know if they took longer than expected, or if whatever is checking the timeout errored out and couldn't get the correct state of the pods...
r
In the meantime, I'm trying to simulate your env (VBox with Ubuntu, 6 vCPU, 10 GB RAM), and the VM overhead is way too high, so everything runs much slower than I expected.
d
Perhaps it would run better if I installed it in an Ubuntu Docker container instead of a VM?
r
Do you have Docker installed directly on your MacBook? Apple's virtualization might be more efficient than VirtualBox.
I do not recommend running Ubuntu in Docker, where you would start Docker with Kubernetes that would then run yet another Docker layer. docker-in-docker-in-docker 😕
If you were running Docker Desktop directly on macOS, you could allocate sufficient resources to the Docker VM (16 GB RAM, 6 cores, the same as you gave VirtualBox) and run the script directly in the terminal.
d
I wanted to avoid running K8s directly on my machine, I'd like to be able to wipe it out completely and start over if something goes wrong without having to hunt down all the places it touched.
r
It will NOT run directly on your machine; k3d runs within Docker.
The only artifact (besides the four docker containers) that will remain on your host is the $HOME/.kube/config file. Nothing else.
d
ok
r
sorry, five containers:
Copy code
ubuntu@ubuntu2004:~/gooddata-cn-tools/k3d$ docker ps
CONTAINER ID   IMAGE                      COMMAND                  CREATED       STATUS       
b25f05bd623f   rancher/k3d-proxy:4.4.8    "/bin/sh -c nginx-pr…"   2 hours ago   Up 2 hours   
4d85b08a51bb   rancher/k3s:v1.21.3-k3s1   "/bin/k3s agent"         2 hours ago   Up 2 hours   
5cba8cd5bc3d   rancher/k3s:v1.21.3-k3s1   "/bin/k3s agent"         2 hours ago   Up 2 hours   
f41cb0f08b10   rancher/k3s:v1.21.3-k3s1   "/bin/k3s server --n…"   2 hours ago   Up 2 hours   
12af2571cacc   registry:2                 "/entrypoint.sh /etc…"   2 hours ago   Up 29 minutes
And you may always simply wipe the whole cluster with k3d cluster delete default, as described above. The registry remains intentionally, because it holds cached images for faster future deployments.
d
Port 5000 is not a good default for the registry on a Mac; that port is already used by the OS (same on Windows, apparently).
r
It should not be an issue to change.
The worse thing is that k3d 4.4.8 has a bug (actually in k3s) that breaks some non-root containers (like postgres or redis) because of a volume permission issue. The fix is fairly simple.
I will also remap port 5000 to somewhere else. Will 5050 work for you?
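For context, remapping only changes the host-side port of the cache registry container; with a plain Docker registry it would look roughly like this (the container name is illustrative, the script may create it differently):
Copy code
# Publish the registry on host port 5050 while it still listens on 5000 inside the container
docker run -d --name k3d-registry -p 5050:5000 registry:2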
d
I have disabled AirPlay, which was using the port, just to avoid any complications, but for the future 5050 sounds good. Please let me know about the volume permission issue, since I installed 4.4.8; unfortunately, I'm still running into this context error:
Copy code
Release "gooddata-cn" does not exist. Installing it now.
W1103 16:30:31.797746   26278 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
W1103 16:30:40.318584   26278 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
Error: rate: Wait(n=1) would exceed context deadline
These are my Docker Desktop settings:
r
I updated the script in the repo. Now I'm running a test in my VBox.
Pulsar timed out, but it was deployed properly. Now it installs gooddata-cn; so far no errors, and the volumes were configured properly.
So --timeout 7m for the Pulsar deployment is not sufficient for VirtualBox.
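A simple mitigation on slow VirtualBox setups would be to raise that timeout in the script's helm call, e.g. reusing the exact command from the log above with a longer value (15m is just an example):
Copy code
helm -n pulsar upgrade --install pulsar \
  --wait --timeout 15m \
  --values /tmp/values-pulsar-k3d.yaml \
  --set initialize=true --version 2.7.2 \
  https://github.com/apache/pulsar-helm-chart/releases/download/pulsar-2.7.2/pulsar-2.7.2.tgz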
d
This is the state of pods after I ran your script on my Mac:
Copy code
chylek@Daniel-iMac k3d % kubectl get pods -A
NAMESPACE      NAME                                                  READY   STATUS                  RESTARTS   AGE
kube-system    coredns-7448499f4d-k4fxk                              1/1     Running                 0          60m
kube-system    local-path-provisioner-5ff76fc89d-rj987               1/1     Running                 0          60m
kube-system    metrics-server-86cbb8457f-6q767                       1/1     Running                 0          60m
kube-system    helm-install-cert-manager-5s5rv                       0/1     Completed               0          60m
kube-system    helm-install-ingress-nginx-6txq9                      0/1     Completed               0          60m
cert-manager   cert-manager-cainjector-86bc6dc648-6pxfb              1/1     Running                 0          59m
kube-system    svclb-ingress-nginx-controller-49z5k                  2/2     Running                 0          59m
kube-system    svclb-ingress-nginx-controller-l2h7q                  2/2     Running                 0          59m
kube-system    svclb-ingress-nginx-controller-gld4j                  2/2     Running                 0          59m
cert-manager   cert-manager-webhook-78b6f5dfcc-jzj9g                 1/1     Running                 0          59m
cert-manager   cert-manager-bf6c77cbc-2zlq7                          1/1     Running                 0          59m
kube-system    ingress-nginx-controller-6d64b8fb47-mxh7n             1/1     Running                 0          59m
pulsar         pulsar-zookeeper-0                                    1/1     Running                 0          51m
pulsar         pulsar-zookeeper-1                                    1/1     Running                 0          49m
pulsar         pulsar-bookie-init-ckz4s                              0/1     Completed               0          51m
pulsar         pulsar-recovery-0                                     1/1     Running                 0          51m
pulsar         pulsar-pulsar-init-6m7r6                              0/1     Completed               0          51m
pulsar         pulsar-zookeeper-2                                    1/1     Running                 0          48m
pulsar         pulsar-bookie-0                                       1/1     Running                 0          51m
pulsar         pulsar-bookie-1                                       1/1     Running                 0          51m
pulsar         pulsar-bookie-2                                       1/1     Running                 0          51m
pulsar         pulsar-broker-0                                       1/1     Running                 0          51m
pulsar         pulsar-broker-1                                       1/1     Running                 0          51m
gooddata       gooddata-cn-measure-editor-595d576df4-qmgzv           1/1     Running                 0          47m
gooddata       gooddata-cn-aqe-84998b8596-f494v                      1/1     Running                 0          47m
gooddata       gooddata-cn-home-ui-68d6766f84-v8fxk                  1/1     Running                 0          47m
gooddata       gooddata-cn-analytical-designer-794497c4f-jlpvm       1/1     Running                 0          47m
gooddata       gooddata-cn-ldm-modeler-59ddbdc696-6xtzr              1/1     Running                 0          47m
gooddata       gooddata-cn-apidocs-594d6dbf44-6g6w9                  1/1     Running                 0          47m
gooddata       gooddata-cn-apidocs-594d6dbf44-4cvgm                  1/1     Running                 0          47m
gooddata       gooddata-cn-dashboards-84c85df6db-l9vp8               1/1     Running                 0          47m
gooddata       gooddata-cn-organization-controller-fcd74d885-jqqkm   1/1     Running                 0          47m
gooddata       gooddata-cn-tools-67fbf64b9f-jspmd                    1/1     Running                 0          47m
gooddata       gooddata-cn-dex-647658ccd-s2tk5                       0/1     Init:0/1                0          47m
gooddata       gooddata-cn-sql-executor-7d58bc465c-zxn66             0/1     Init:0/1                0          47m
gooddata       gooddata-cn-scan-model-58878777bd-rpfqv               1/1     Running                 0          47m
gooddata       gooddata-cn-auth-service-8654cbff5d-sbb4x             1/1     Running                 0          47m
gooddata       gooddata-cn-metadata-api-6f7fdf5557-qhltg             0/1     Init:0/1                0          47m
gooddata       gooddata-cn-dex-647658ccd-z29qx                       0/1     Init:0/1                0          47m
gooddata       gooddata-cn-afm-exec-api-7bf559b487-gbjpf             1/1     Running                 0          47m
gooddata       gooddata-cn-result-cache-fc48f7bf8-j8z2x              1/1     Running                 0          47m
gooddata       gooddata-cn-cache-gc-27265920-kvvzj                   0/1     Completed               0          18m
gooddata       gooddata-cn-db-postgresql-1                           1/2     CrashLoopBackOff        13         47m
gooddata       gooddata-cn-db-postgresql-0                           1/2     CrashLoopBackOff        13         47m
gooddata       gooddata-cn-db-pgpool-c8c4b9878-hvrj5                 0/1     CrashLoopBackOff        15         47m
gooddata       gooddata-cn-redis-ha-server-0                         0/3     Init:CrashLoopBackOff   14         47m
gooddata       gooddata-cn-db-pgpool-c8c4b9878-9v5sd                 0/1     Running                 16         47m
Just like in the VM, connecting to localhost shows me a 404.
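As a side note, a 404 from ingress-nginx typically just means no Ingress rule matched the requested host or path; the configured routes can be checked with generic kubectl commands:
Copy code
# Which hosts and paths are actually routed
kubectl get ingress -A

# Details of the gooddata ingresses, including expected hostnames
kubectl -n gooddata describe ingress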
r
Did you use the updated script? What version of the k3s image is reported when you run docker ps? It should be rancher/k3s:v1.21.4-k3s1. If it is older (like rancher/k3s:v1.21.3-k3s1), please update the script from the repo.
d
It is .3, I will update and try again.
r
I apologize for the complications - I run an older k3d 4.4.4 locally and it works like a charm. Recent updates in k3s and k3d introduced these errors that I was not aware of.
Before you start the updated script, please run k3d cluster delete default.
d
The updated script is missing the 'p' argument in getopts:
Copy code
- while getopts "cH:" o; do
+ while getopts "cH:p:" o; do
No worries about the complications, I appreciate your help
r
Ha, thanks! I always forget to update the getopts line when I add a new argument 🙂
FYI, I successfully installed gooddata.cn (with just a single timeout on the Pulsar helm install), created an org, and the UI seems to be working:
d
Mine has also finished installing successfully, so it looks like the main problem was the k3d version. Thanks for the help!
👏 1
Hi, we're having an issue trying to use the same installation process with your scripts on our testing server. For some reason the kernel keeps killing nginx processes, so the installation never finishes:
Copy code
Nov 10 14:29:45 testing kernel: [1549711.046901] Memory cgroup out of memory: Kill process 2301795 (nginx) score 1277 or sacrifice child
Nov 10 14:29:45 testing kernel: [1549711.046907] Killed process 2303262 (nginx) total-vm:10948kB, anon-rss:1300kB, file-rss:1988kB, shmem-rss:0kB
Nov 10 14:29:45 testing kernel: [1549711.055984] oom_reaper: reaped process 2303262 (nginx), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Nov 10 14:29:45 testing kernel: [1549711.198409] nginx invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=999
Nov 10 14:29:45 testing kernel: [1549711.198411] nginx cpuset=0648fca50b349420d4104fa001615ee60034c1d18518ebd907f94f7e4829c40b mems_allowed=0-1

(...)

Nov 10 14:29:45 testing kernel: [1549711.198484] Task in /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17/0648fca50b349420d4104fa001615ee60034c1d18518ebd907f94f7e4829c40b killed as a result of limit of /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17
Nov 10 14:29:45 testing kernel: [1549711.198492] memory: usage 20480kB, limit 20480kB, failcnt 30832
Nov 10 14:29:45 testing kernel: [1549711.198493] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
Nov 10 14:29:45 testing kernel: [1549711.198494] kmem: usage 17820kB, limit 9007199254740988kB, failcnt 0
Nov 10 14:29:45 testing kernel: [1549711.198495] Memory cgroup stats for /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Nov 10 14:29:45 testing kernel: [1549711.198501] Memory cgroup stats for /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17/5f517b59efb059ee85a5946d70ba12167613275c425de78378dcfd4c7caeaf3b: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:44KB inactive_file:0KB active_file:0KB unevictable:0KB
Nov 10 14:29:45 testing kernel: [1549711.198507] Memory cgroup stats for /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17/0e0838ef9bef7ed29e5e59de946c2afa1354210f097f65defc5f36ba12359b4d: cache:0KB rss:124KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Nov 10 14:29:45 testing kernel: [1549711.198513] Memory cgroup stats for /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17/77c0aa7d56011f59b60b50b5a663cc96b7bee931d1586ebc22408527df5f2d2f: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Nov 10 14:29:45 testing kernel: [1549711.198519] Memory cgroup stats for /docker/becbdc9b7521b71a28f4cc8ec213513ffad37421009ae25240a41c43f397c5e2/kubepods/burstable/pod078e9da9-37aa-4216-b9b0-20072792ac17/0648fca50b349420d4104fa001615ee60034c1d18518ebd907f94f7e4829c40b: cache:0KB rss:2292KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:2504KB inactive_file:36KB active_file:0KB unevictable:0KB
When checking docker stats, all k3d containers report the memory limit as over 700 GB, so we're definitely not hitting that. Checking /sys/fs/cgroup/memory/memory.limit_in_bytes reports the 64-bit max value, so there's no limit set there either. Checking /sys/fs/cgroup/memory/docker/<hash>/kubepods/burstable/<hash> shows the 20480 kB limit that is being hit, but we don't know why that limit even exists on the testing server, or how to get rid of it. Any ideas?
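One generic way to check whether the 20480 kB limit comes from the pod specs themselves rather than from Docker is to print the configured memory limits (the namespace is taken from the listings above):
Copy code
# Memory limits configured on each gooddata pod's containers
kubectl -n gooddata get pods \
  -o custom-columns=NAME:.metadata.name,MEM_LIMIT:.spec.containers[*].resources.limits.memory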
r
Hi Dan, what Linux distro, kernel version, and Docker version are you using? There were some recent changes in cgroup v2 that could be incompatible with your setup.
OK, after more investigation, I have found that three of our services have resources.limits.memory set to 20Mi (dashboards, home-ui and measure-editor). It's really surprising they won't fit into 20MB on your machine. Usually they consume less than 5MB:
Copy code
kubectl top pod -n gooddata 
NAME                                                  CPU(cores)   MEMORY(bytes)   
gooddata-cn-dashboards-84c85df6db-w2c5q               1m           3Mi             
gooddata-cn-home-ui-68d6766f84-p2cc9                  1m           4Mi             
gooddata-cn-measure-editor-595d576df4-nsvc5           1m           3Mi
You may increase resource limits by adding the following yaml snippet to `values-gooddata-cn.yaml`:
Copy code
measureEditor:
  resources:
    limits:
      memory: 30Mi
homeUi:
  resources:
    limits:
      memory: 30Mi
dashboards:
  resources:
    limits:
      memory: 30Mi
(you may experiment with the limit size, 30Mi is just an example)
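Presumably the snippet is then applied with an ordinary helm upgrade of the existing release; roughly like this, with the release name, namespace, and chart version taken from earlier in the thread, and the values file path depending on where your copy lives:
Copy code
helm -n gooddata upgrade --install gooddata-cn \
  --wait --timeout 7m \
  --values values-gooddata-cn.yaml \
  --version 1.4.0 gooddata/gooddata-cn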
d
this is what we're seeing
Copy code
NAME                                          CPU(cores)   MEMORY(bytes)
gooddata-cn-dashboards-84c85df6db-96624       5314m        17Mi
gooddata-cn-measure-editor-595d576df4-4rd6h   0m           17Mi
gooddata-cn-home-ui-68d6766f84-8fsv2          1m           17Mi
which is strange considering it's a completely fresh installation
Copy code
Ubuntu 20.04.3 LTS (GNU/Linux 4.15.0-161-generic x86_64)
Docker version 20.10.7, build 20.10.7-0ubuntu5~20.04.1
it looked like it started with the higher limits, but now kubectl died with "context deadline exceeded" 😕
r
Dashboards consuming 5 CPUs is really suspicious, considering the fact that it's just an unprivileged nginx image serving a bunch of static files. Did all the remaining pods start, and are they healthy? There might be some pod restarts during the deployment phase (but the number of restarts should be less than 3 and must not grow). "context deadline exceeded" is Go's ugly name for a timeout. It seems the Kubernetes API is overloaded.
Does the situation shown above (the CPU-hogging dashboards pod) still persist?
d
It looks like GD has loaded, but it doesn't have the correct port; we set LBPORT and LBSSLPORT to numbers in the 10000 range, but when we access the site, it redirects to localhost/dex/auth with no port. EDIT: Realized there is a parameter for this; assuming that it's supposed to have the port, it'd be nice if it defaulted to
localhost:${LBSSLPORT}
unless there is some problem with that?
So, apparently gooddata-cn-dex does not support a custom port in authHost, and I cannot find any documentation for the format of the dex section in yaml
r
You're right, I didn't consider this option; I was always happy with the load balancer running on 443 😉 I haven't tested it yet, but this could fix it:
Copy code
--- a/k3d/k3d.sh
+++ b/k3d/k3d.sh
@@ -278,6 +278,7 @@ cookiePolicy: None
 dex:
   ingress:
     authHost: $authHost
+    lbPort: ${LBSSLPORT:+:$LBSSLPORT}
     annotations:
       cert-manager.io/issuer: ca-issuer
     tls:
I will check later
d
Unfortunately it's still redirecting to localhost with no port
r
ah, my bad, the value is invalid.
${LBSSLPORT:+$LBSSLPORT}
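(In case it helps, this is plain bash parameter expansion; a quick demo:)
Copy code
# ${VAR:+word} expands to "word" only when VAR is set and non-empty
LBSSLPORT=8443
echo "lbPort: ${LBSSLPORT:+:$LBSSLPORT}"   # lbPort: :8443  -- the leading colon made the value invalid
echo "lbPort: ${LBSSLPORT:+$LBSSLPORT}"    # lbPort: 8443
unset LBSSLPORT
echo "lbPort: ${LBSSLPORT:+$LBSSLPORT}"    # lbPort:        -- expands to nothing when unset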
d
I forgot to mention, I also tried just
${LBSSLPORT}
since I wasn't familiar with this particular bash syntax, but both
${LBSSLPORT}
and
${LBSSLPORT:+$LBSSLPORT}
are still redirecting to localhost with no port; also tried with browser cache disabled
r
Are you talking about a fresh deployment with the updated script, or did you use the script to update an existing deployment (i.e. without the
-c
parameter)?
If the organization already exists and you performed a helm upgrade with the given
lbPort
value, chances are the internal state of the organization is keeping the wrong values. You may try to delete the organization (kubectl delete org XYZ) and recreate it, e.g.:
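(The name and manifest file below are placeholders for your own organization:)
Copy code
kubectl delete org my-org
kubectl apply -f org-my-org.yaml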
But anyway, I finally have a few minutes to test the fixed script.
d
I didn't rerun with -c, I will try that. Thanks.
r
I performed some tests
It doesn't work, really
this is a limitation of k3d. The k3d loadbalancer (the docker container k3d-default-serverlb running nginx) has a mapped port, e.g.
0.0.0.0:8443->443/tcp
(assuming you have LBSSLPORT="8443"). But this nginx has no idea about the mapping itself, so it doesn't set a proper X-Forwarded-Port header,
therefore applications within the cluster don't know about the external port, because they rely on this header (and it contains 443, because k3d-default-serverlb actually listens on 443).
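(You can see the mapping on the docker side, e.g.:)
Copy code
# assuming the default k3d cluster name and LBSSLPORT=8443
docker port k3d-default-serverlb
# e.g. 443/tcp -> 0.0.0.0:8443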
maybe I can fix it.
d
meanwhile I can't tell if I misconfigured something or if docker decided it wasn't going to work today 😂 it's getting stuck on starting the first agent node
r
nah, it looks more like you accidentally pressed CTRL+\
d
it was stuck for minutes, I quit it manually
r
anyway, I managed to convince k3d and the ingress controller to work together with a non-default TLS port, but there are still some configuration issues.
d
ctrl+\ instead of ctrl+c was an accident but I figured the stack traces might be useful
r
Yeah, good old SIGQUIT, I think it works with java as well.
d
had the sysadmin restart docker, it's working again now
a bit worried what could've possibly happened, considering the previous weird behaviors
r
who knows 🤷 The great benefit of docker is that you can always throw everything away and start from scratch, without risking corruption of your main system.
d
yep, well this time it needed a restart of the whole docker service because I threw the containers away and they didn't recreate and start anymore 😂
hmm, would it be possible to skip the load balancer and directly expose a port to the backend? we ended up coming up with a possibly somewhat insane system which allows us to run GD CN fully locally - hidden behind our own proxy that authenticates with GD, so we get completely seamless integration of GD UI widgets - so all we would need is
localhost:port
where we can call GD APIs
r
The k3d LB is a plain L4 TCP loadbalancer, so there's currently no possibility to manage HTTP headers. It means that ingress-nginx doesn't receive information about the upstream port of the incoming request: [client] -> (:8443)[k3d TCP LB] -> (:8443)[Ingress svc] -> (:443)[Ingress pod] -> (:appport)[backend svc] The only option to resolve it would be to make the Ingress controller (both svc and container) listen on the same port as the k3d LB. It is possible to do - we need to generate ingress-nginx.yaml from some template and fill in the helm values according to the setup specified in k3d.sh:
Copy code
apiVersion: helm.cattle.io/v1
kind: HelmChart
...
  valuesContent: |-
    controller:
      containerPort:
        https: $LBSSLPORT
      service:
        ports:
          https: $LBSSLPORT
    ...
Currently these two values are set to the default 443, which works perfectly with the default k3d LB port 443.
d
I assume that the ingress-nginx.yaml in your scripts is not sufficient then; do you have a recommended template that should work?
r
Not yet, I need to rework my script so the ingress-nginx.yaml will be generated.
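(Just to illustrate one possible approach - keep the manifest as a template and render it before applying; the file names here are made up:)
Copy code
# ingress-nginx.yaml.tmpl contains the HelmChart manifest above, with $LBSSLPORT
# left as a placeholder in controller.containerPort.https and controller.service.ports.https
export LBSSLPORT=8443
envsubst '$LBSSLPORT' < ingress-nginx.yaml.tmpl > ingress-nginx.yaml
kubectl apply -f ingress-nginx.yaml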
a
Hi @Daniel Chýlek, could you please let us know what's driving your need to run the service on a port different from the default 443?
d
Hi, our goal is to have seamless integration of GD UI widgets in our existing website. The plan is for GD UI to communicate with a reverse proxy, which directly communicates with a GD CN instance on the backend. We find that a reverse proxy is the easiest way for us to integrate GD, since the reverse proxy can use our own customer database to choose the correct GD instance (internal port), workspace, and automatically authenticate the user through an API token without exposing any of it on the frontend. I know it's not in the spirit of kubernetes, but ultimately we want to run GD on the same server as our other services (at least until we start running into major performance issues), with a port that is not publicly exposed and does not conflict with our existing services; so any standard http/https ports are not available.
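(To illustrate, the proxy's upstream call would look roughly like this - the hostname, port, token and endpoint here are placeholders for our setup:)
Copy code
# the Host header selects the GD organization, the bearer token authenticates the call
curl -H "Host: gd.example.internal" \
     -H "Authorization: Bearer $GDCN_API_TOKEN" \
     "http://localhost:10443/api/entities/workspaces"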
r
OK, now I understand your use-case.
The purpose of the k3d setup is development or evaluation. Running such a complex system on a single docker host in production is strongly discouraged. As far as the custom port is concerned, I have found a bug in ingress-nginx that prevents the custom port setup from working. The problem is that they have the hard-coded value
443
in one Lua script and pass this value to upstream servers in the X-Forwarded-Port header, instead of the real port where the ingress-nginx controller operates.
d
I understand, but we have a small number of high-powered servers rather than a highly scalable environment, so we keep all services relevant to a customer (website, database, etc.) on the same machine. Our testing server, where we're trying to get GD running now, is a single machine, which is going to be a worst-case scenario, but we can use it to judge performance and decide whether we actually need to do something different to deploy GD to production.
r
Dan, I have finally updated my k3d deployment script in https://github.com/mouchar/gooddata-cn-tools There are several changes that you might be interested in: • support of k3d 5.x • ability to completely disable built-in TLS support (
-n
), making the cluster expose only the HTTP port (no cert-manager stuff, etc.) • It should work behind a reverse proxy on a non-default port, but the proxy MUST pass the
X-Forwarded-*
headers (a rough example of such a request is below).
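(Roughly, a request coming from the proxy should carry headers like these - the host, port and target URL are just placeholders:)
Copy code
# simulate the reverse proxy by sending the headers it is expected to pass
curl -H "Host: gd.example.internal" \
     -H "X-Forwarded-Proto: https" \
     -H "X-Forwarded-Host: gd.example.internal" \
     -H "X-Forwarded-Port: 10443" \
     "http://localhost:${LBPORT}/"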
d
Hi, the reverse proxy works 🎉 but it's having issues with the organization hostname. For testing, we set up the reverse proxy behind a subdomain that's only accessible locally. I created an organization that has the hostname equal to the subdomain and ran
kubectl apply -f org-quant-test.yml
, but GD cannot find it:
I realized I forgot to call the /organizations/ API endpoint. I assumed the error was talking about the kubernetes organization, but maybe that's not the case.
unfortunately the /api/entities/admin/organizations/quant-gd-test endpoint is returning a default Tomcat 404 error
r
The organizations endpoint really refers to the same Organization as the k8s custom resource. Please check whether the organization resource exists and what its status is (
kubectl describe org quant-gd-test
) The config looks valid. It's possible the record for the organization was not propagated to the database. Also check the logs of the organization-controller pod; a quick checklist is below.
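(The resource and deployment names below are guesses - adjust them to what kubectl get pods shows in your cluster:)
Copy code
# does the Organization resource exist, and what do its status and events say?
kubectl get organizations
kubectl describe org quant-gd-test

# logs of the controller that syncs Organization resources into the metadata database;
# the deployment name is a guess -- look up the real one with kubectl get pods first
kubectl -n gooddata-cn logs deploy/gooddata-cn-organization-controller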
d
sorry I haven't been active in the past few months, I've been really busy with things and we still don't have several dependencies (like postgres) running in production; I'll try to get the new docker setup working in a few weeks