# gd-beginners
m
Hello, I'm currently installing the GoodData.CN Free version using the Kubernetes embedded in Docker Desktop. I've already pulled the required Docker images, but now I'm stuck on the step where I need to install the Helm chart for Pulsar, if I'm not mistaken. My question is: what's the next step to deploy GoodData.CN on Kubernetes from Docker Desktop?
m
Hello Michael. Thank you for the question. The Free version is a production version meant to run on a K8S cluster. We have a different version for Docker Desktop: the Community Edition. The Community Edition is for evaluation purposes and for developers building on top of the production version. It is a single Docker image you can run on your desktop using Docker Desktop, and it contains all the functionality of the production version. Running the Free version locally is theoretically possible, but it is not meant to run on your local machine.
Here is the link for downloading the Community Edition: https://www.gooddata.com/developers/cloud-native/
m
Is there any difference between the production version and the Community Edition? And if I want to install the production version, must I provide the Kubernetes environment through a cloud provider?
m
From a feature point of view there is no difference.
Correct: if you decide to use GoodData.CN in production, you will install the production version on your K8S cluster, either via a cloud provider or on a K8S cluster in your private cloud.
If your use case right now is to evaluate GoodData.CN from a feature point of view, go with the Community Edition. If you want to evaluate the production deployment, you need a K8S cluster, either your own or e.g. in AWS.
Installation of the Community Edition is quick and easy, just `docker pull` and `docker run`:
https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/aio/
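For reference, the pull/run sequence looks roughly like this (the image name is the official `gooddata/gooddata-cn-ce` one; the exact ports and volume flags are best confirmed against the linked AIO docs):
```
# pull the all-in-one Community Edition image
docker pull gooddata/gooddata-cn-ce:latest
# run it, exposing the web UI (3000) and PostgreSQL (5432); the volume keeps data across restarts
docker run -i -t -p 3000:3000 -p 5432:5432 \
  -v gd-volume:/data gooddata/gooddata-cn-ce:latest
```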
I hope it is more clear now.
m
For the Community Edition, I understand, because I had already explored it before. But in this case I want to install the production version on my local computer and explore it first before deploying to a cloud service.
https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/k8s/environment/on-premise/ because what I see here is that I can deploy on on-premise infrastructure.
On Docker Desktop I can enable Kubernetes and pull the required images, but I'm currently stuck on this step: https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/k8s/helm-chart-installation/
m
Yes, GoodData.CN Free can be run on-premise, but that means on an on-premise K8S cluster in your private cloud.
I know that our developers use K3S to run it locally, but it is very complicated to get it running. We did not try it on Docker Desktop, and I have a feeling we discussed it and it is hardly possible. Let me check with our developers.
@Michael Ardhyanto I am sorry, but with Docker Desktop you do not have a load balancer and volumes, so you won't be able to get GoodData.CN Free working. You may try k3s or k3d locally. It should work, but we do not have it documented; it is not a use case we currently cover.
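If you try the k3d route, a minimal sketch (assuming k3d v4+; the cluster name and port mappings are illustrative, not from the GoodData docs):
```
# 3 agents so Pulsar's 3-replica components can spread out;
# publish 80/443 through k3d's built-in load balancer
k3d cluster create gooddata --agents 3 \
  -p "80:80@loadbalancer" -p "443:443@loadbalancer"
kubectl get nodes   # verify the cluster is up
```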
m
For deployment on AWS, I saw in the docs that the K8S cluster needs Aurora RDS. Does that mean Aurora RDS for PostgreSQL?
r
Hi Michael, yes, select the PostgreSQL type of Aurora RDS.
m
Can you provide step-by-step instructions for installing GoodData.CN on Kubernetes in GCP? I'm still kinda new to this kind of thing.
m
I'm kinda stuck on setting up the load balancer and ingress controller.
r
The ingress controller is installed with the following commands:
```
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx

helm -n ingress-nginx install ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=2 --create-namespace
```
It should automatically create a load balancer in your GCP account.
The public IP of the load balancer can be found with:
```
kubectl -n ingress-nginx get svc ingress-nginx-controller
```
(see the `EXTERNAL-IP` value)
Or you can get this value programmatically using:
```
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```
Then you'll need to add DNS records for the hostnames that will be used in your deployment. We recommend installing and configuring `external-dns` in your cluster, but you can also manage DNS records manually.
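As a hedged sketch only (the chart source and values here are assumptions; adapt the provider, credentials, and domain filter to your setup), an external-dns install for Google Cloud DNS could look like:
```
helm repo add external-dns https://kubernetes-sigs.github.io/external-dns/
helm -n external-dns install external-dns external-dns/external-dns \
  --create-namespace \
  --set provider=google \
  --set policy=upsert-only \
  --set "domainFilters[0]=example.com"   # replace with your real domain
```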
m
I want to ask where I can get the file `customized-values-pulsar.yaml`.
r
Store this file locally on the machine where you will run the `helm` command from, typically your personal computer.
You will need to write this file manually.
Don't forget to update the file contents according to your setup, e.g. set the `storageClassName` values to the storage class you want to use; for GCP it may be `standard-rwo`.
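For illustration only (the key names follow the upstream Apache Pulsar Helm chart; verify them against the GoodData docs for your chart version), the storage-class part of `customized-values-pulsar.yaml` might look like:
```
bookkeeper:
  volumes:
    journal:
      storageClassName: standard-rwo
    ledgers:
      storageClassName: standard-rwo
zookeeper:
  volumes:
    data:
      storageClassName: standard-rwo
```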
m
Where can I set up the Dex identity provider settings?
r
There's not much to be set. You only need to set the hostname that will be used for Dex's Ingress, and also a TLS secret name. If you are using cert-manager for automated certificate provisioning, don't forget to add cert-manager's annotation. All these things should be set during the installation of the GoodData.CN Helm chart, as explained in https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/k8s/helm-chart-installation/#dex-identity-provider-settings
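In the customized values for the gooddata-cn chart, that section looks roughly like this (hostname, issuer name, and secret name below are placeholders):
```
dex:
  ingress:
    authHost: auth.example.com
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod   # only when using cert-manager
    tls:
      authSecretName: gooddata-cn-auth-secret
```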
m
Hello there, I made installation progress up to this step:
```
helm install --version 1.1.1 --namespace gooddata-cn --wait \
  -f customized-values-gooddata-cn.yaml gooddata-cn gooddata/gooddata-cn
```
and I'm getting this error: `Error: failed pre-install: timed out waiting for the condition`
r
This error means the installation fails to connect to Apache Pulsar. Please check that the Pulsar deployment made in the previous step is in good shape (all pods are running). Also check the logs in the init container and main container of the job `gooddata-cn-create-namespace`; this job is responsible for configuring the Pulsar namespace for the GoodData.CN deployment.
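For example (the `pulsar` namespace is an assumption based on this thread; container names can be read from `kubectl describe`):
```
kubectl -n pulsar get pods                 # all Pulsar pods should be Running
kubectl -n gooddata-cn describe job gooddata-cn-create-namespace
kubectl -n gooddata-cn logs job/gooddata-cn-create-namespace --all-containers
```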
m
When I run it, the job seems to be created, but when I check it, it returns empty.
Could it be related to this issue? When I check the cluster workloads on GCP, I see some workloads with the status "does not have minimum availability".
r
There are several issues:
1. It cannot connect to Pulsar for some reason. Please check that the Pulsar deployment is OK and that you have properly set the Pulsar-related values in gooddata-cn (`service.pulsar.host`, `service.pulsar.namespace`).
2. You probably have a misconfigured or inaccessible external PostgreSQL database.
m
How much CPU does pulsar-bookie need? I changed the requested value from 0.2 to 1.0 but still get insufficient CPU.
r
Not much. We're running a small cluster with:
```
bookkeeper:
  configData:
    PULSAR_MEM: >
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 128Mi
```
m
Okay, I just configured what you said about the Postgres database. Now when I run the helm command I get this error:
```
helm.go:81 [debug] template: gooddata-cn/templates/tools/deployment.yaml:35:16: executing "gooddata-cn/templates/tools/deployment.yaml" at <include "gooddata-cn.redisEnv" .>: error calling include: template: gooddata-cn/templates/_redis.tpl:4:84: executing "gooddata-cn.redisEnv" at <fail "Multiple redis hosts defined in non-clustered mode.">: error calling fail: Multiple redis hosts defined in non-clustered mode.
```
r
This error means you have an incorrectly configured Redis connection. May I ask how you installed the external Redis cache? I assume you followed the steps in https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/k8s/environment/gcp/. What Redis-related settings do you have in your custom values.yaml for the gooddata-cn Helm chart? There should be something like:
```
service:
  redis:
    hosts:
      - IP.AD.RE.SS
    port: 6379
    clusterMode: false
```
The `IP.AD.RE.SS` should point to your Redis endpoint. Note that there should be only one `hosts` record if you don't run Redis in cluster mode.
m
I installed Redis and Postgres through GCP: I searched for Memorystore for Redis, and PostgreSQL. When the instance was created, I filled in the host with the IP address from Redis.
@Robert Moucha this is my YAML file, and this is from the GCP platform
r
The YAML value of `hosts` has the wrong type: you entered it as a string, but it needs to be a list of strings. In your particular case it should look like:
```
service:
  redis:
    hosts:
      - 10.251.96.115
    port: 6379
    clusterMode: false
```
m
@Robert Moucha thank you for the answers, the command is now making progress, which leaves figuring out the timeout error. This is the screenshot from the pulsar-bookie logs
r
Unschedulable pods suggest there are some unmet requirements:
• You need at least a 3-node K8S cluster (some Pulsar components, like bookie and zookeeper, must run with 3 replicas, and there's an anti-affinity rule so each replica runs on a different K8S worker node).
• Check that you have a storage class configured and available for use (`kubectl get sc`).
• Make sure this storageClassName is properly set in the customized values for Pulsar (see https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/k8s/helm-chart-installation/#use-customized-valuesyaml-for-pulsar).
Also check the details in the GCP console to identify why the pods are unschedulable.
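To get the scheduler's reason directly from kubectl (the pod name is an example from this thread):
```
kubectl get sc                                  # storage classes available in the cluster
kubectl -n pulsar describe pod pulsar-bookie-0  # the Events section shows why it is unschedulable
```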
m
• The first image shows my cluster already has 3 nodes • The second image shows that I used the storage class standard-rwo • The third image shows the config from the customized values for Pulsar
r
It looks good, except that e2-medium is too small: it has just 1 vCPU (Google says it's 2 vCPUs, but each core gives you just 50% of the time, so it's actually 2 × 50% = 100% = 1 vCPU). I'd recommend switching to `e2-standard-2` or better. Did you check the reasons why the pods in the pulsar-bookie statefulset are unschedulable?
As there are just 107 millicores of CPU capacity remaining on the third node, my wild guess is that some pods don't fit the cluster.
m
@Robert Moucha the reason pulsar-bookie is unschedulable is insufficient CPU. I will try to resize the cluster so it can fit.
👍 1
Do I need to delete the e2-medium node pool and start over from scratch?
I just created a new node pool with e2-standard-2.
r
You could simply uninstall both Helm charts, add the new nodes, and remove the e2-medium nodes.
m
Even with e2-standard there is still the insufficient CPU issue.
r
Strange. I'm just deploying a 3-node cluster as similar to yours as possible.
Well, I think I know the cause. Please update your `customized-values-gooddata-cn.yaml` file with the correct username for the database user. In the documentation we have `postgres@gooddata-cn-pg`, but the correct value is just `postgres`. Sorry for that; `postgres@gooddata-cn-pg` is valid only for the Azure cloud. GCP needs just `postgres`.
image.png
m
That was strange; I checked the debug output and the error is still the same ("out of 2 expected nodes").
r
What do you mean? That fixing the username in customized-values-gooddata-cn.yaml didn't help?
m
@Robert Moucha wait, let me check the configuration first
@Robert Moucha the error still occurs
r
The username in the 2nd screenshot is wrong: it should be just `postgres` and not `postgres@postgresqlgooddata`.
m
@Robert Moucha I just found the root cause for that:
```
helm install --version 1.1.1 --namespace gooddata-cn --wait \
  -f customized-values-gooddata-cn.yaml gooddata-cn gooddata/gooddata-cn
```
It must be used without `--wait`. But when I check via `helm list`, it doesn't show the release.
It seems like Dex cannot connect to the PostgreSQL DB.
r
`helm list` will only show releases in your current namespace. If you don't specify one, it's the `default` namespace. So you need to use `helm list -n gooddata-cn` (or you can run `helm list -A` to print all releases in all namespaces).
As far as the inaccessible PostgreSQL server is concerned, your server seems to have only a public IP address. You need to have the server accessible from your K8S cluster, i.e. from the internal network.
Please modify your Cloud SQL instance and allow access from the private IP network where your K8S cluster is deployed.
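If you prefer the CLI over the console, something like this should work (instance and project names are placeholders; private services access must already be configured on the VPC):
```
gcloud sql instances patch INSTANCE_NAME \
  --network=projects/PROJECT_ID/global/networks/default
```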
m
@Robert Moucha this one?
r
Yes. Set this checkbox, choose the network and save. It may take some time to apply. Then you'll see a private IP address available.
Then replace the current (public) IP with the new (private) IP in your custom values YAML and run:
```
helm -n gooddata-cn upgrade -f path/to/your/custom-values.yaml gooddata-cn gooddata/gooddata-cn
```
(please update this command according to your setup: namespace, path, release name, and Helm repo name)
m
@Robert Moucha okay, I will try to change it
@Robert Moucha thank you for your previous guidance, the Helm chart installation is now complete and I have already set up the organization, up to creating a bootstrap token to hit the API, but it returns "connection timed out"
r
Hi, the `connection timed out` suggests there's a network issue and your request cannot reach the load balancer for some reason. Your organization hostname now resolves to 34.101.200.187. Check whether this address is the same as the `EXTERNAL-IP` of the `ingress-nginx-controller` Service resource.
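A quick way to compare the two (the hostname is the one from this thread):
```
kubectl -n ingress-nginx get svc ingress-nginx-controller \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
dig +short staging-dashboard.logicnesia.com   # should print the same IP
```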
m
Currently there are 2 nginx ingress controllers in my services:
`ingress-nginx-controller`, the one I made using
```
helm -n ingress-nginx install ingress-nginx ingress-nginx/ingress-nginx \
  --set controller.replicaCount=2
```
and `nginx`, the one I made from the external-dns tutorial. But I saw that the one from the tutorial only listens on port 80.
So I tried to move the DNS from 34.101.200.187 to 34.101.188.165, since that ingress listens on ports 80 and 443.
It seems the domain has already moved to the new IP, but I can't curl because of an SSL error.
@Robert Moucha I got this error when trying to create a user from Dex.
r
The ingress controller made from the external-dns tutorial (called `nginx`) will not be used; you can safely delete it and save some $$$.
The certificate issue means you didn't provide a TLS certificate for your organization.
There's a self-signed certificate directly in the ingress controller that works as a last-resort option for ingresses without their own certificate.
However, you might want to integrate the Let's Encrypt service via cert-manager and get publicly trusted TLS certificates for free.
m
In my `organization.yaml` I'm not using TLS, because I thought I was using a wildcard certificate.
r
There's also your own self-signed certificate on the auth.logicnesia.com endpoint. I recommend following https://www.gooddata.com/developers/cloud-native/doc/1.1/installation/k8s/considerations/ingress-cert-manager/ and integrating with Let's Encrypt, as it is a very convenient and future-proof method. If you want to use your own certificate (you can), then you must set the tls section in your organization:
```
spec:
  tls:
    secretName: secret-with-your-wildcard-cert-and-key
```
The mentioned secret must exist before modifying the organization. It will be passed to the K8S API when provisioning the organization's Ingress.
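Such a secret can be created from an existing wildcard certificate and key, for example:
```
kubectl -n gooddata-cn create secret tls secret-with-your-wildcard-cert-and-key \
  --cert=wildcard.crt --key=wildcard.key
```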
m
So after I add cert-manager, I must update my organization YAML to add the tls section, and then run `kubectl apply -f organization.yaml`?
r
First, you need to create a `ClusterIssuer`, as described in the documentation above (see the sketch below). Then update your organization.yaml so it contains:
```
spec:
  tls:
    secretName: secret-name-that-cert-manager-will-use
    issuerName: letsencrypt-prod  # update depending on your CertIssuer name
    issuerType: ClusterIssuer
```
Then apply the organization.yaml using kubectl.
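For the first step, a Let's Encrypt ClusterIssuer is a standard cert-manager v1 resource; a sketch (e-mail and names are placeholders):
```
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com                 # placeholder
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
```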
When you have cert-manager integrated and the CertIssuer in place, you may also want to replace your current self-signed certificate for the authHost endpoint with a certificate automatically managed by cert-manager. To do so, add the following Helm values to the proper place in your customized-values YAML file:
```
dex:
  ingress:
    authHost: auth.logicnesia.com                       # you already have this
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod  # update depending on your CertIssuer name
    tls:
      authSecretName: gooddata-cn-auth-secret           # or use any other name, it's up to you
```
and upgrade the Helm chart release using the `helm upgrade ...` command to reconfigure the auth Ingress and let cert-manager request a new certificate. Then your browser should not complain about untrusted TLS certificates.
m
How do I do the first step?
@Robert Moucha currently I have already set up the CAA record on my domain (which is on cPanel), created the ClusterIssuer, updated my `organization.yaml`, changed the annotations in `customized-values-gooddata-cn.yaml`, and executed `helm upgrade`, but the error is still the same.
r
A single CAA record should be set at the domain level (`logicnesia.com.` in your case) and not on every single hostname within that domain, but that is a minor issue. More importantly, you are using the same secret name both for dex.tls.authSecretName in values.yaml and for spec.tls.secretName in your Organization. Please use a unique name for every organization, because each org will be provisioned with a different TLS certificate. And one more thing: your license.key is too short. Check the e-mail where you received your license key; the `license.key` is the long string starting with the text `key/`.
m
@Robert Moucha just to be clear:
• tls for the file `customized-values-gooddata-cn.yaml` with a secret using the authHost private key and crt, i.e. for auth.logicnesia.com
• tls for the file `organization.yaml` with a secret using the host private key and crt, which is staging-dashboard.logicnesia.com
r
Exactly.
The secret for the organization's TLS will be created by cert-manager, but the name must be given on the Organization resource. And it definitely must not be the same name as you already used for the Dex TLS; these are different secrets.
m
This is my customized values for gooddata-cn; the next one is the organization.yaml.
r
Ok, the names of secrets differ, it looks better.
m
Still the same error when curling to create a user.
r
Would you mind performing the following steps and sending me the support bundle (directly to me)? See https://gist.github.com/mouchar/c9e53714ef8cd95a08fdb5d234bf1898
So, I briefly checked the support bundle and identified the following issues:
1. auth.logicnesia.com and staging-dashboard.logicnesia.com have different IP addresses. My guess is that the auth... hostname has the wrong IP, pointing to some cPanel default backend. This needs to be fixed first.
2. That's why cert-manager didn't issue the certificate: the HTTP01 solver running for auth.logicnesia.com doesn't work, so authenticity cannot be proven. This should fix itself automatically, but it may take some time (I don't know how often Let's Encrypt revalidates sites).
3. The managed-logicnesia ingress doesn't exist, for some reason I don't know. The easiest fix will be to delete the organization (`kubectl -n gooddata-cn delete org logicnesia`) and create it again using `kubectl apply -f organization.yaml`.
4. The domain filter set on external-dns points to the bogus domain `--domain-filter=test-logicnesia.com`, so it will not work and you must manage hostnames manually. It probably relates to issue 1.
m
I just saw a new detail on my load balancer.
@Robert Moucha so where must the auth.logicnesia.com IP point? The same place as staging-dashboard.logicnesia.com?
r
I wonder how you created this LB. When I install `ingress-nginx` on my fresh GKE cluster using the command:
```
helm -n ingress-nginx install ingress-nginx ingress-nginx/ingress-nginx \
    --set controller.replicaCount=2 --create-namespace
```
the created LB is of TCP type, not HTTP:
m
@Robert Moucha I didn't create the HTTP LB; when I check its details, it routes to auth.logicnesia.com.
The one I made using that command was here, and its type is already TCP.
r
For some reason, the Service `logicnesia-gooddata-cn-dex` is annotated with the `cloud.google.com/neg` annotation. That causes the service to be exposed via GLBC (Google's LB controller).
m
I just set the DNS auth domain at the hosting service according to `kubectl -n gooddata-cn get ing`.
So the auth and staging-dashboard IPs must be the same?
And I already rerouted the domain filter set on external-dns.
r
Yes, now it looks better: the auth endpoint is managed by the proper (TCP) LB and it has a valid Let's Encrypt certificate.
The 2nd Ingress (staging-dashboard) doesn't exist (as I wrote before). In order to fix it, delete the Organization resource and create it again by applying your `organization.yaml` file. That should create the ingress as well.
auth and all organization ingresses (incl. staging-dashboard) should have the same IP address, because they are handled by the same ingress controller and the same TCP load balancer.
m
When I tried to delete the org, the process was so slow that the console got disconnected.
r
The deletion should be fast. If it was stuck, it means something went wrong during removal. Check whether the organization still exists (`kubectl -n gooddata-cn describe org logicnesia`). If yes, I will tell you how to delete it the hard way.
m
It still exists.
r
could you please send me the "Status" section from the describe output?
m
image.png
r
OK, in order to delete this faulty organization, first run:
```
kubectl -n gooddata-cn patch organization logicnesia --type json --patch='[{"op":"remove","path":"/metadata/finalizers"}]'
```
and then the organization can be deleted with `kubectl delete`.
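i.e., using the same resource name as before:
```
kubectl -n gooddata-cn delete org logicnesia
```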
m
This means I can apply the org file again, right?
r
yes
m
The ingress is still not created.
r
I created a bug report in order to make this component more resilient. There's some internal state incoherence between the database and the K8S state. It's fixable, but as you still don't have any data in your deployment, the easiest way would be to delete gooddata-cn and start from scratch. So:
• delete the organization (the kubectl patch may be necessary again, if the kubectl delete takes more than a few seconds)
• delete the helm release
• connect to the postgres database and perform (see the psql sketch below):
```
drop database dex;
drop database md;
```
• install the helm chart (with the previous customized values)
• wait until all pods come up
• create the organization using your organization.yaml
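The database step can be done with psql, for example (the host is a placeholder for your Cloud SQL private IP):
```
psql -h <postgres-private-ip> -U postgres \
  -c 'DROP DATABASE dex;' -c 'DROP DATABASE md;'
```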
m
I can't delete this ingress for Dex.
r
this ingress will be deleted together with the helm release uninstall
m
I have already uninstalled pulsar, ingress-nginx, and gooddata-cn, but this ingress still remains.
r
You didn't have to uninstall pulsar or ingress-nginx. Sorry I didn't mention it specifically, but uninstalling just gooddata-cn would have been fine.
Manual removal doesn't work either? `kubectl -n gooddata-cn delete ingress logicnesia-gooddata-cn-dex` This ingress has some strange IP address.
m
The IP redirects to the new ingress-nginx controller IP, since I reinstalled ingress-nginx.
@Robert Moucha without that ingress uninstalled, I can't install gooddata-cn.
This is the result when I try to install gooddata-cn.
r
of course you can't, because there would be conflicts
you need to clean the previous state
m
Okay, I will try to uninstall the ingress for Dex first.
r
There will probably be pending finalizers from the LB:
```
kubectl -n gooddata-cn patch ingress logicnesia-gooddata-cn-dex --type json --patch='[{"op":"remove","path":"/metadata/finalizers"}]'
```
Before you start installing everything again, make sure there's no other LB attached. If you have already installed ingress-nginx, you should see just one TCP LB.
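To check from the CLI (the GCP console works just as well):
```
gcloud compute forwarding-rules list   # only one TCP forwarding rule should remain
```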
m
Okay, there's only the 1 LB that I reinstalled.
r
Is the screenshot from after the reinstallation of everything? pulsar, nginx, gooddata?
Your IP addresses for auth and staging-dashboard still point to the old, nonexistent LB.
But you have the upsert-only policy set in your external-dns, so it might be OK.
m
Yes, I will reconfigure the IP at the hosting service.
r
Checking the external-dns again: the domain filter points to staging-dashboard.logicnesia.com instead of just logicnesia.com, so it won't automatically manage your IP addresses anyway.
m
change the domain filter to logicnesia.com?
r
Are you sure your DNS hosting is even supported by external-dns? According to the SOA record of the logicnesia.com domain, it's not provided by Google DNS but by an Indonesian provider, Niagahoster. So I believe this will not work anyway, unless Niagahoster has some API supported by external-dns.
So you will need to update your DNS records manually, I'm afraid.
m
Yes, the domain is from Niagahoster. Right now I'm manually configuring the IPs in their DNS management.
This is their DNS management portal.
@Robert Moucha is this the expected output?
r
I see. It's a pity it is not supported; it would simplify the operation. You will need to write your own tooling instead of external-dns. So, the A records are in place.
Ingresses may take a while until they are propagated to the LB.
But basically yes, it is the expected output.
m
Now I try to hit the API to create a user?
r
Now you can proceed with user creation via API and token
m
It seems like I can't access the URL via the browser.
When I try to create a user via the API, it's still processing.
Wait, why is it still connecting to the old IP?
r
It works for me.
Maybe you have a stale DNS cache.
You're using Windows, right?
I think there was a command to flush the DNS cache manually.
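On Windows that command is:
```
ipconfig /flushdns
```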
m
As soon as I flushed the DNS, it works.
👍 1
@Robert Moucha thank you for your help! I couldn't have done it without your guidance 👍
r
I'm glad to help. It was worthwhile for us as well to see what issues our customers could be facing.
m
👍