Has anyone tried to load the <GoodData.CN> k8s ins...
# gooddata-cn
v
Has anyone tried to load the GoodData.CN k8s installation using Docker Desktop Kubernetes? I know that the instructions say it requires at least 3 nodes... and Docker Desktop I think is limited to a single node installation? Should I just use the community edition for local development?
r
Hi Vincil, it's possible to run on single node (for test purposes, of course), but there are some tweaks that need to be done in chart values, both for pulsar and for gooddata-cn
I don't recall what exactly needs to be set, but basically you need to disable antiAffinity
v
whoah nice!
I am new to k8s but learning
r
some components insist on running on different nodes. But it is possible to convince them to run on the same node where another pod of the same type is already running
v
ok, and that would be overridable in the helm chart config
somehow
r
you may also want to set
replicaCount: 1
to save resources on your host.
v
if you come across any hints, let me know here... I will continue to investigate
r
yes, everything you need can be set by extra custom values.yaml.
v
trying to make the dev environment as much like prod as possible... and Docker Desktop is the tool of choice for the moment...
appreciate the help...I will look
r
I'd rather recommend k3d
v
ok... I will look
r
k3d lets you run a multi-node k8s cluster within docker containers, so you don't need to tweak anti-affinity at all - bc you'll actually have 3 k8s worker nodes 😉
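For reference, a minimal sketch of a k3d config file that gives one server and three agent (worker) nodes. It assumes k3d 5.x and its `k3d.io/v1alpha5` config schema (older k3d releases expect a different apiVersion), and the cluster name `gdcn-dev` is made up for illustration:

```yaml
# k3d-config.yaml - hypothetical file name
apiVersion: k3d.io/v1alpha5   # assumption: depends on the installed k3d version
kind: Simple
metadata:
  name: gdcn-dev              # hypothetical cluster name
servers: 1                    # control-plane node
agents: 3                     # worker nodes, so anti-affinity rules can be satisfied
```

It would be used as `k3d cluster create --config k3d-config.yaml`.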
v
it looks like it's geared towards development and that's awesome
I was struggling with minikube... exposing services, etc pulling images in... didn't feel comfortable for an app dev workflow
r
Check out my repo: https://github.com/mouchar/gooddata-cn-tools/tree/master/k3d It's somewhat outdated (I don't have much time to maintain it) but you should get a basic idea of how it works and what needs to be done
v
whoah nice!!!
being new to k8s anything helps
r
It targets k3d 4.x - the new 5.x versions have a different structure of command-line parameters, so it will not work out of the box.
I was running and developing it on Linux. As far as I know, Docker on macOS or Windows has some peculiarities, so maybe the script will need some changes.
v
awesome, I will see what I can do
thanks so much!
This is the error I am facing with pulsar/zookeeper 2.9.2 now... undoubtedly something in pulsar did not start or it can't find something?
2023-03-19T16:02:00,455+0000 [QuorumConnectionThread-[myid=1]-1] WARN  org.apache.zookeeper.server.quorum.QuorumCnxManager - Cannot open channel to 2 at election address pulsar-zookeeper-1.pulsar-zookeeper.pulsar.svc.cluster.local:3888
java.net.UnknownHostException: pulsar-zookeeper-1.pulsar-zookeeper.pulsar.svc.cluster.local
 at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:229) ~[?:?]
 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
 at java.net.Socket.connect(Socket.java:609) ~[?:?]
 at org.apache.zookeeper.server.quorum.QuorumCnxManager.initiateConnection(QuorumCnxManager.java:383) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
 at org.apache.zookeeper.server.quorum.QuorumCnxManager$QuorumConnectionReqThread.run(QuorumCnxManager.java:457) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
 at java.lang.Thread.run(Thread.java:829) [?:?]
These are the custom values I am using, and this is on Docker Desktop k8s, not k3s
affinity:
  anti_affinity: false
in Pulsar is getting me a little farther... not failing... or will fail with different error LOL
r
The error above looks like the zookeeper component pods didn't start. That might be caused by anti-affinity when you have fewer than 3 nodes (e.g. if you have just one worker node, the 1st pod is scheduled on it, but the 2nd and 3rd cannot be placed on the same node - that's basically what anti-affinity is supposed to do). Btw, if you plan to run a k8s cluster locally within Docker Desktop on mac/win, be aware you'll need to allocate plenty of resources to the docker VM - cpu, ram and sufficient disk. Otherwise, the pods will remain unscheduled due to resource starvation - you may tweak pod resource requests/limits, but only to some extent. That's why I was suggesting setting replicas to one.
v
ah nice... I've been playing with the settings, and the failure was caused by some errors that I had introduced... I actually have pulsar almost completely starting, except for the pulsar-broker-0, which is just in a pending state :
ContainersNotInitialized
containers with incomplete status: [wait-bookkeeper-ready]
Here are my docker resources:
r
Hm, that's probably related to an older pulsar helm chart, which needs
--set initialize=true
to be passed during helm install (or add
initialize: true
to your customized-values-pulsar.yaml file). Newer versions do not need this setting, as the chart automatically recognizes whether it is being installed or upgraded. This setting adds two extra k8s Jobs that initialize the bookkeeper and pulsar clusters.
But it should not be an issue for a reasonably recent pulsar. IDK what version you are trying to install, I recommend using
2.9.4
If you're trying to follow the steps in my repo, it's basically ok as general guidance, but some versions need to be updated (it has pulsar 2.7.2 and gooddata-cn 1.5.0 - really old versions these days).
v
Ah yes, am just using the repo as a guide... taking some from the good data install instructions as well...I tried with 2.9.4 and the same result... let me try to pass the initialize true?
I've traced it to broker.configData.managedLedgerDefaultEnsembleSize. If there is only one replica, I think that number should be 1? I can get the wait-bookkeeper-ready init code to run successfully from another container, but the init container itself still doesn't seem to complete... continuing to try
bin/apply-config-from-env.py conf/bookkeeper.conf
until bin/bookkeeper shell whatisinstanceid; do
  echo "bookkeeper cluster is not initialized yet. backoff for 3 seconds ..."
  sleep 3
done
echo "bookkeeper cluster is already initialized"
bookieServiceNumber="$(nslookup -timeout=10 pulsar-bookie | grep Name | wc -l)"
until [ ${bookieServiceNumber} -ge 1 ]; do
  echo "bookkeeper cluster pulsar isn't ready yet ... check in 10 seconds ..."
  sleep 10
  bookieServiceNumber="$(nslookup -timeout=10 pulsar-bookie | grep Name | wc -l)"
done
echo "bookkeeper cluster is ready"
bumping the resources wastefully high, to 1.0 CPU and 1024Mi RAM, and setting
broker.configData.managedLedgerDefaultEnsembleSize: "1"
seems to have gotten everything working!
all greens in the pulsar namespace now, thanks!!!
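Pulling the settings from this thread together, a rough sketch of a single-node values file could look like the one below. The key names assume the layout of the Apache Pulsar helm chart, and the write/ack quorum lines are an assumption on my side (only the ensemble size was confirmed above). They are included because Pulsar requires ensemble size >= write quorum >= ack quorum, so with a single bookie all three have to be 1:

```yaml
# customized-values-pulsar.yaml - single-node test setup, not for production
affinity:
  anti_affinity: false        # let pods of the same type share one node
zookeeper:
  replicaCount: 1
bookkeeper:
  replicaCount: 1
broker:
  replicaCount: 1
  configData:
    # none of these may exceed the number of running bookies
    managedLedgerDefaultEnsembleSize: "1"
    managedLedgerDefaultWriteQuorum: "1"   # assumption, not confirmed in this thread
    managedLedgerDefaultAckQuorum: "1"     # assumption, not confirmed in this thread
```

Resource requests/limits are left out on purpose, since the right numbers depend on what the Docker VM has available.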
r
Ah, right, when you change the number of bookkeeper replicas, you also need to adjust related parameters such as the ensemble size. 1CPU/1Gi is too high - the settings in https://github.com/mouchar/gooddata-cn-tools/blob/master/k3d/k3d.sh#L288-L334 should work for small-scale deployments.
v
sure... I can probably lower it... I tried to start the GoodData-cn helm, and there are all kinds of issues running it locally... had to back off of that for now, and will need to revisit in the future
but really appreciate all your help... I learned a lot, and I know I will get the good data helm working soon
r
Glad to help, just let us know when you get back to it.
v
yessir, so much appreciated!
@Robert Moucha had some more thoughts on all of this... when trying to run the helm charts manually... in a local instance of K8s/Docker Desktop... could it be a processor arch mismatch? I am on a Mac/M1... and I kept working in the cloud with the config... and am getting some failures on the charts when accidentally loading against a Graviton/arm64 arch.
r
Yes, that is definitely the problem you're facing. Both gooddata-cn and pulsar images are amd64 only. Aarch64 is currently supported only in gooddata-cn-ce, thanks to an ugly hack I implemented to make Pulsar work on M1.
v
haha I remember that and it was much appreciated 🙂
r
graviton cpus will have the same problem
v
sure... and it was an accident... I gave up on Graviton unfortunately
now am on the righteous path...and in EKS... I have 3 nodes configured... not trying to deploy on only a single node
each node is t3.medium with 4GB RAM and 2vcpus
r
It's not a problem for us to start building multi-arch images for gooddata-cn. But we're still blocked by the lack of pulsar images for aarch64.
v
sure... that can be a problem for another day
how long should pulsar take to start?
r
t3.medium is too weak. I don't recommend using t3 burstable instances
v
ok
Amazon EC2 T3 instances are the next generation burstable general-purpose instance type
so t3.large?
or another class
r
no t3 or t3a. the problem with burstable instances is that their baseline performance is too low
v
ah sorry... I read your response wrong
compute or memory optimized?
r
what gooddata version do you plan to install? 2.3.0?
v
yessir
r
2.3.0 contains a service for PDF exports, and it is memory and cpu intensive. https://www.gooddata.com/developers/cloud-native/doc/2.3/deploy-and-install/cloud-native/requirements/ try 3x c6a.xlarge
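If the cluster happens to be managed with eksctl (an assumption, nothing above says how the EKS cluster was created), a node group with that sizing could be sketched roughly like this. The cluster name, region and node group name are placeholders:

```yaml
# eksctl ClusterConfig sketch - names and region are hypothetical
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: gdcn-test             # hypothetical cluster name
  region: us-east-1           # hypothetical region
managedNodeGroups:
  - name: workers             # hypothetical node group name
    instanceType: c6a.xlarge  # amd64, non-burstable
    desiredCapacity: 3        # three nodes, so anti-affinity can stay enabled
```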
v
ugh accidentally reading old doc version