# gooddata-cn
r
Hi everybody, maybe @Robert Moucha can help on this one. I have tested all the suggestions provided in this channel regarding Pulsar pods not starting, but no luck. Here you can see the pulsar pod logs:
Copy code
wait-zookeeper-ready Address:    10.9.128.10#53
wait-zookeeper-ready
wait-zookeeper-ready ** server can't find pulsar-zookeeper-2.pulsar-zookeeper.pulsar: NXDOMAIN
wait-zookeeper-ready
stream logs failed container "pulsar-bookie-init" in pod "pulsar-bookie-init-k7872" is waiting to start: PodInitializing for pulsar/pulsar-bookie-init-k7872 (pulsar-bookie-init)
stream logs failed container "pulsar-bookie-init" in pod "pulsar-bookie-init-k7872" is waiting to start: PodInitializing for pulsar/pulsar-bookie-init-k7872 (pulsar-bookie-init)
stream logs failed container "pulsar-bookie-init" in pod "pulsar-bookie-init-k7872" is waiting to start: PodInitializing for pulsar/pulsar-bookie-init-k7872 (pulsar-bookie-init)
wait-zookeeper-ready Server:        10.9.128.10
wait-zookeeper-ready Address:    10.9.128.10#53
wait-zookeeper-ready
wait-zookeeper-ready ** server can't find pulsar-zookeeper-2.pulsar-zookeeper.pulsar: NXDOMAIN
wait-zookeeper-ready
stream logs failed container "pulsar-bookie-init" in pod "pulsar-bookie-init-k7872" is waiting to start: PodInitializing for pulsar/pulsar-bookie-init-k7872 (pulsar-bookie-init)
Pulsar version: 2.9.2. Environment: GKE (without Autopilot). Helm config:
Copy code
components:
    functions: false
    proxy: false
    pulsar_manager: false
    toolset: false
  monitoring:
    alert_manager: false
    grafana: false
    node_exporter: false
    prometheus: false
  images:
    autorecovery:
      repository: apachepulsar/pulsar
    bookie:
      repository: apachepulsar/pulsar
    broker:
      repository: apachepulsar/pulsar
    zookeeper:
      repository: apachepulsar/pulsar
  zookeeper:
    volumes:
      data:
        name: data
        size: 2Gi
        storageClassName: standard-rwo
    replicaCount: 3
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
  bookkeeper:
    configData:
      PULSAR_MEM: >
        -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
    metadata:
      image:
        repository: apachepulsar/pulsar
    replicaCount: 3
    resources:
      requests:
        cpu: 0.2
        memory: 128Mi
    volumes:
      journal:
        name: journal
        size: 5Gi
        storageClassName: standard-rwo
      ledgers:
        name: ledgers
        size: 5Gi
        storageClassName: standard-rwo
  pulsar_metadata:
    image:
      repository: apachepulsar/pulsar
  broker:
    # this setting is recommended to automatically apply changes in the configuration to the broker
    # uncomment the following line to turn it on
    # restartPodsOnConfigMapChange: true
    configData:
      PULSAR_MEM: >
        -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
      subscriptionExpirationTimeMinutes: "5"
      webSocketServiceEnabled: "true"
      systemTopicEnabled: "true"
      topicLevelPoliciesEnabled: "true"
    replicaCount: 3
    resources:
      requests:
        cpu: 0.2
        memory: 256Mi
r
Hello Reda, thank you for reaching out. The error above suggests that the ZooKeeper StatefulSet is not in good shape. This component starts first in order to allow the other parts of the Pulsar cluster to proceed with initialization. Your configuration looks good, assuming the standard-rwo storage class is available in your cluster. Can you please check the status of all 3 ZooKeeper pods (pulsar-zookeeper-0,1,2)?
Pay extra attention to the status of the mounted volume pulsar-zookeeper-data: a failed mount is the most common cause of issues.
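For example, something along these lines should show both the pods and their volume claims (assuming the release is in the pulsar namespace and uses the chart's default component=zookeeper label):
Copy code
kubectl -n pulsar get pods -l component=zookeeper
kubectl -n pulsar get pvc | grep zookeeper
kubectl -n pulsar describe pod pulsar-zookeeper-0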
r
I don't even see pulsar-zookeeper pods
r
hmm, the statefulset exists?
r
when I dig into my pulsar pod, I see these containers
r
are u sure you can see all pods in k9s? try pressing ctrl-z
r
even with ctrl+Z
image.png
running get pods on the pulsar namespace, these are the only pods that are running
r
how about kubectl -n pulsar get sts? you should see the pulsar-zookeeper sts created
r
No resources found in pulsar namespace.
that's strange
r
indeed
🤦‍♂️
Wrong indentation in yaml config
r
which chart should create the sts? the pulsar one, right?
r
yes, zookeeper is a part of apache pulsar
please fix your pulsar yaml
all keys from "monitoring" and below are overindented
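A quick sanity check, assuming the values are saved as customized-values-pulsar.yaml and the chart repo alias is apache-pulsar, is to lint the file and render the chart locally to confirm the ZooKeeper StatefulSet actually comes out:
Copy code
yamllint customized-values-pulsar.yaml
helm template pulsar apache-pulsar/pulsar --version 2.9.2 -f customized-values-pulsar.yaml | grep -A3 'kind: StatefulSet'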
r
do you have an example yaml please?
But please copy-paste carefully
Copy code
components:
  functions: false
  proxy: false
  pulsar_manager: false
  toolset: false
monitoring:
  alert_manager: false
  grafana: false
  node_exporter: false
  prometheus: false
images:
  autorecovery:
    repository: apachepulsar/pulsar
  bookie:
    repository: apachepulsar/pulsar
  broker:
    repository: apachepulsar/pulsar
  zookeeper:
    repository: apachepulsar/pulsar
zookeeper:
  volumes:
    data:
      name: data
      size: 2Gi
      storageClassName: standard-rwo
  replicaCount: 3
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
bookkeeper:
  configData:
    PULSAR_MEM: >
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
  metadata:
    image:
      repository: apachepulsar/pulsar
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 128Mi
  volumes:
    journal:
      name: journal
      size: 5Gi
      storageClassName: standard-rwo
    ledgers:
      name: ledgers
      size: 5Gi
      storageClassName: standard-rwo
pulsar_metadata:
  image:
    repository: apachepulsar/pulsar
broker:
  # this setting is recommended to automatically apply changes in the configuration to the broker
  # uncomment the following line to turn it on
  # restartPodsOnConfigMapChange: true
  configData:
    PULSAR_MEM: >
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
    subscriptionExpirationTimeMinutes: "5"
    webSocketServiceEnabled: "true"
    systemTopicEnabled: "true"
    topicLevelPoliciesEnabled: "true"
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 256Mi
this is your config with indentation fixed
r
running with the new config
No resources found in default namespace.
r
don't forget to set the namespace, Pulsar is usually installed to the pulsar namespace, so:
kubectl -n pulsar get pods
r
I am already there
Copy code
NAME                       READY   STATUS     RESTARTS   AGE
pulsar-bookie-init-k7872   0/1     Init:0/1   0          16h
pulsar-pulsar-init-6vs98   0/1     Init:0/2   0          18h
but zookeeper should be a different pod or a container running in the pulsar pod
r
how did you reinstall the pulsar chart? can you share the exact command?
I recommend uninstalling the old helm release and trying to install from scratch with the fixed yaml values file
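A minimal sketch of that reinstall from the command line (the apache-pulsar repo alias and values file name are just examples, adjust to your setup):
Copy code
helm -n pulsar uninstall pulsar
kubectl delete namespace pulsar
helm install pulsar apache-pulsar/pulsar --version 2.9.2 \
  --namespace pulsar --create-namespace \
  -f customized-values-pulsar.yaml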
r
ok, so you think that the default pulsar parameters are not taken into consideration and only the custom ones are getting applied?
regarding installation: I am running it via an ansible command
Copy code
ansible-playbook --inventory-file="$INVENTORY_FILE" helm_pulsar.yaml
INVENTORY_FILE :
Copy code
[all:vars]
stdout_callback = yaml
bin_ansible_callbacks = True
ansible_user = deployment
ansible_connection = ssh
host_key_checking = false
ansible_ssh_common_args = '-o StrictHostKeyChecking=no'
ansible_python_interpreter = python3

[tooling_cluster]
mytest.gojuno.io ansible_connection=local
helm_pulsar.yaml :
Copy code
pulsar_deployment_version: 2.9.2
pulsar_deployment_values:
  # file name: customized-values-pulsar.yaml
  # file name: customized-values-pulsar.yaml
components:
  functions: false
  proxy: false
  pulsar_manager: false
  toolset: false
monitoring:
  alert_manager: false
  grafana: false
  node_exporter: false
  prometheus: false
images:
  autorecovery:
    repository: apachepulsar/pulsar
  bookie:
    repository: apachepulsar/pulsar
  broker:
    repository: apachepulsar/pulsar
  zookeeper:
    repository: apachepulsar/pulsar
zookeeper:
  volumes:
    data:
      name: data
      size: 2Gi
      storageClassName: standard-rwo
  replicaCount: 3
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
bookkeeper:
  configData:
    PULSAR_MEM: >
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
  metadata:
    image:
      repository: apachepulsar/pulsar
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 128Mi
  volumes:
    journal:
      name: journal
      size: 5Gi
      storageClassName: standard-rwo
    ledgers:
      name: ledgers
      size: 5Gi
      storageClassName: standard-rwo
pulsar_metadata:
  image:
    repository: apachepulsar/pulsar
broker:
  # this setting is recommended to automatically apply changes in the configuration to the broker
  # uncomment the following line to turn it on
  # restartPodsOnConfigMapChange: true
  configData:
    PULSAR_MEM: >
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
    subscriptionExpirationTimeMinutes: "5"
    webSocketServiceEnabled: "true"
    systemTopicEnabled: "true"
    topicLevelPoliciesEnabled: "true"
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 256Mi
Here it is
r
helm_pulsar.yaml doesn't look like a valid ansible playbook. Is it?
r
yes it is, the same command ran successfully for nginx: helm_nginx.yaml
Copy code
- hosts:        tooling_cluster
  gather_facts: false
  name:         "Ensure Kubernetes Registry pull secret is present"
  roles:
  - private.kubernetes.kubeconfig
  vars:
    kubernetes_kubeconfig_content: "{{ kubeconfig_cluster }}"

- hosts: tooling_cluster
  name:  "Nginx Ingress Controller"
  vars:
    kubernetes_namespace: "ingress-nginx"
  tasks:
  - name: Create Namespace
    kubernetes.core.k8s:
      api_version: v1
      name:        '{{ kubernetes_namespace }}'
      kind:        Namespace
      state:       present

  - name: Add nginx chart repo
    kubernetes.core.helm_repository:
      name:     ingress-nginx
repo_url: https://kubernetes.github.io/ingress-nginx

  - name: Deploy nginx ingress helm chart
    kubernetes.core.helm:
      chart_ref:         ingress-nginx/ingress-nginx
      chart_version:     "{{ nginx_deployment_version }}"
      release_name:      ingress-nginx
      release_namespace: "{{ kubernetes_namespace }}"
      create_namespace:  yes
      update_repo_cache: yes
      release_state:     present
      atomic:            yes
      values:           '{{ nginx_deployment_values }}'
I mean
Copy code
ansible-playbook --inventory-file="$INVENTORY_FILE" helm_nginx.yaml
r
helm_pulsar.yaml looks completely different, there are no tasks to be run.
so you can't just run ansible-playbook --inventory-file="$INVENTORY_FILE" helm_pulsar.yaml if helm_pulsar.yaml is not a playbook
it looks more like some set of parameters/variables for a playbook
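For illustration only (not your actual playbook), a vars file like that normally only takes effect when a playbook loads it, e.g. via vars_files, and a task then consumes the variable:
Copy code
- hosts: tooling_cluster
  gather_facts: false
  vars_files:
    - helm_pulsar.yaml          # variables only, no tasks in here
  tasks:
    - name: Deploy Pulsar helm chart
      kubernetes.core.helm:
        chart_ref: apache-pulsar/pulsar
        chart_version: "{{ pulsar_deployment_version }}"
        release_name: pulsar
        release_namespace: pulsar
        create_namespace: yes
        release_state: present
        values: "{{ pulsar_deployment_values }}"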
r
there is a task, no? this one for example
Copy code
- name: Deploy Pulsar helm chart
  kubernetes.core.helm:
    chart_ref: apache-pulsar/pulsar
    chart_version: "{{ pulsar_deployment_version }}"
    release_name: pulsar
    release_namespace: "{{ kubernetes_namespace }}"
    create_namespace: yes
    update_repo_cache: yes
    release_state: absent
    values: "{{ pulsar_deployment_values }}"
    restartPodsOnConfigMapChange: true
r
yes but in a different file - it's in helm_nginx.yaml
but you were running: ansible-playbook --inventory-file="$INVENTORY_FILE" helm_pulsar.yaml and helm_pulsar.yaml is NOT a playbook
anyway, now I can see you have a kubernetes.core.helm task defined in helm_nginx.yaml, referring to values from the variable pulsar_deployment_values
In that case, the indentation in helm_pulsar.yaml is wrong:
Copy code
pulsar_deployment_version: 2.9.2
pulsar_deployment_values:
  # file name: customized-values-pulsar.yaml
  # file name: customized-values-pulsar.yaml
components:
  functions: false
  proxy: false
  pulsar_manager: false
  ...etc...
it should rather look like:
Copy code
pulsar_deployment_version: 2.9.2
pulsar_deployment_values:
  components:
    functions: false
    proxy: false
    pulsar_manager: false
    ...etc...
because you need to pass the whole values object to the values parameter of the kubernetes.core.helm module (using {{ pulsar_deployment_values }})
according to the docs, it's possible to pass customized-values-pulsar.yaml to the helm module directly, without embedding the values in the vars file:
Copy code
- name: Deploy Pulsar helm chart
  kubernetes.core.helm:
    chart_ref: apache-pulsar/pulsar
    chart_version: "{{ pulsar_deployment_version }}"
    release_name: pulsar
    release_namespace: "{{ kubernetes_namespace }}"
    create_namespace: yes
    update_repo_cache: yes
    release_state: absent
    values_files:
      - /path/to/customized-values-pulsar.yaml
Note: the "restartPodsOnConfigMapChange: True" as specified in your playbook will not work. This value must be explicitly set to every Pulsar component that you want to automatically restart. For example:
Copy code
broker:
   restartPodsOnConfigMapChange: true
   ... other broker settings...
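and the same key can be set on the other components if you want them restarted as well, e.g.:
Copy code
zookeeper:
  restartPodsOnConfigMapChange: true
bookkeeper:
  restartPodsOnConfigMapChange: true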
r
Thanks for your help, it's getting better. I am getting this when running kubectl get sts -n pulsar:
Copy code
NAME               READY   AGE
pulsar-bookie      1/4     6m14s
pulsar-broker      0/3     6m14s
pulsar-proxy       0/3     6m14s
pulsar-recovery    1/1     6m14s
pulsar-toolset     1/1     6m14s
still having problems running broker and proxy
r
this looks really strange - the ZK sts is missing (it should be there) and proxy is installed (it should NOT be, because components.proxy=false)
can you dump the actual values from the helm release, using helm -n pulsar get values pulsar? (assuming the {kubernetes_namespace} Ansible var has the value "pulsar", if not, please update the -n pulsar parameter)
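(for completeness: adding --all to the same command also shows the chart defaults merged in, which can help spot values that silently fell back to defaults)
Copy code
helm -n pulsar get values pulsar --all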
r
Yes
Copy code
pulsar_deployment_values:
  bookkeeper:
    configData:
      PULSAR_MEM: |
        -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
    metadata:
      image:
        repository: apachepulsar/pulsar
    replicaCount: 3
    resources:
      requests:
        cpu: 0.2
        memory: 128Mi
    restartPodsOnConfigMapChange: true
    volumes:
      journal:
        name: journal
        size: 5Gi
        storageClassName: standard-rwo
      ledgers:
        name: ledgers
        size: 5Gi
        storageClassName: standard-rwo
  broker:
    configData:
      PULSAR_MEM: |
        -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
      subscriptionExpirationTimeMinutes: "5"
      systemTopicEnabled: "true"
      topicLevelPoliciesEnabled: "true"
      webSocketServiceEnabled: "true"
    replicaCount: 3
    resources:
      requests:
        cpu: 0.2
        memory: 256Mi
    restartPodsOnConfigMapChange: true
  components:
    functions: false
    proxy: false
    pulsar_manager: false
    toolset: false
  images:
    autorecovery:
      repository: apachepulsar/pulsar
    bookie:
      repository: apachepulsar/pulsar
    broker:
      repository: apachepulsar/pulsar
    zookeeper:
      repository: apachepulsar/pulsar
  monitoring:
    alert_manager: false
    grafana: false
    node_exporter: false
    prometheus: false
  pulsar_metadata:
    image:
      repository: apachepulsar/pulsar
  zookeeper:
    replicaCount: 3
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
    restartPodsOnConfigMapChange: true
    volumes:
      data:
        name: data
        size: 2Gi
        storageClassName: standard-rwo
pulsar_deployment_version: 2.9.2
r
^^ this is NOT the output of values from the deployed pulsar helm release.
If it is what you got from the helm get values command I suggested, it means that the values were passed incorrectly by ansible
r
yes, this is what I got from the helm get values
The thing is, I don't understand why some values have been taken while others have not. For example the storageClassName, which is something very specific (storageClassName: standard-rwo), is present
r
you need to reconfigure the ansible play so that it passes the values like this:
Copy code
bookkeeper:
  configData:
    PULSAR_MEM: |
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
  metadata:
    image:
    ...etc...
and not like this:
Copy code
pulsar_deployment_values:
  bookkeeper:
    configData:
      PULSAR_MEM: |
        -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
    metadata:
      image:
      ...etc...
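if in doubt, a small diagnostic task can show exactly what the play is about to hand to helm (a sketch, assuming the variable name from your vars file):
Copy code
- name: Show the values that will be passed to the helm module
  ansible.builtin.debug:
    var: pulsar_deployment_values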
r
I just did it Robert
image.png
the get values output looks ok with what I've provided:
Copy code
USER-SUPPLIED VALUES:
bookkeeper:
  configData:
    PULSAR_MEM: |
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
  metadata:
    image:
      repository: apachepulsar/pulsar
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 128Mi
  restartPodsOnConfigMapChange: true
  volumes:
    journal:
      name: journal
      size: 5Gi
      storageClassName: standard-rwo
    ledgers:
      name: ledgers
      size: 5Gi
      storageClassName: standard-rwo
broker:
  configData:
    PULSAR_MEM: |
      -Xms128m -Xmx256m -XX:MaxDirectMemorySize=128m
    subscriptionExpirationTimeMinutes: "5"
    systemTopicEnabled: "true"
    topicLevelPoliciesEnabled: "true"
    webSocketServiceEnabled: "true"
  replicaCount: 3
  resources:
    requests:
      cpu: 0.2
      memory: 256Mi
  restartPodsOnConfigMapChange: true
components:
  functions: false
  proxy: false
  pulsar_manager: false
  toolset: false
images:
  autorecovery:
    repository: apachepulsar/pulsar
  bookie:
    repository: apachepulsar/pulsar
  broker:
    repository: apachepulsar/pulsar
  zookeeper:
    repository: apachepulsar/pulsar
monitoring:
  alert_manager: false
  grafana: false
  node_exporter: false
  prometheus: false
pulsar_metadata:
  image:
    repository: apachepulsar/pulsar
zookeeper:
  replicaCount: 3
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
  restartPodsOnConfigMapChange: true
  volumes:
    data:
      name: data
      size: 2Gi
      storageClassName: standard-rwo
r
yes, it looks much better
ZK is being deployed and pulsar-proxy was removed
the output of helm get values is correct (no extra top-level key pulsar_deployment_values as before)
r
yes but still broker and bookie are not starting correctly
r
they are waiting for ZK
are all three ZK pods Running already?
r
sorry, what is ZK?
r
ZooKeeper
r
3 out of 4 are running
r
no, there should be just 3 (0, 1, and 2).
r
2 out of 3, sorry
r
but pod 0 is pending
please check what is missing
how many worker nodes does the cluster have?
r
I am getting this if I try to see the logs in k9s:
Copy code
Stream closed EOF for pulsar/pulsar-zookeeper-0 (pulsar-zookeeper)
my cluster has 6 nodes
r
ok, that's enough, at least 3 are necessary to fulfill the anti-affinity rules
r
the machine type is e2-medium in GCP
r
you can't check the logs of a pending pod (the pod has not started yet)
you need to use kubectl describe pod ...
there will be some events suggesting what's missing
6x e2-medium, that's 12 vCPU and 24 GiB RAM. Is there some other workload running?
r
no
it's a dedicated cluster
r
ok
r
trying to do describe pod
r
check the Events section in the describe output
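for example, either of these should surface the scheduling events for the pending pod:
Copy code
kubectl -n pulsar describe pod pulsar-zookeeper-0 | sed -n '/^Events:/,$p'
kubectl -n pulsar get events --field-selector involvedObject.name=pulsar-zookeeper-0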
r
Copy code
Name:             pulsar-zookeeper-0
Namespace:        pulsar
Priority:         0
Service Account:  default
Node:             <none>
Labels:           app=pulsar
                  cluster=pulsar
                  component=zookeeper
                  controller-revision-hash=pulsar-zookeeper-7cfdc7b554
                  release=pulsar
                  statefulset.kubernetes.io/pod-name=pulsar-zookeeper-0
Annotations:      checksum/config: 2f96bb4cb7a177c1b09ec8d95344e799b76b7343e693b05301273d957a62f67f
                  prometheus.io/port: 8000
                  prometheus.io/scrape: true
Status:           Pending
IP:
IPs:              <none>
Controlled By:    StatefulSet/pulsar-zookeeper
Containers:
  pulsar-zookeeper:
    Image:       apachepulsar/pulsar:2.9.2
    Ports:       8000/TCP, 2181/TCP, 2888/TCP, 3888/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      sh
      -c
    Args:
      bin/apply-config-from-env.py conf/zookeeper.conf;
      bin/generate-zookeeper-config.sh conf/zookeeper.conf; OPTS="${OPTS} -Dlog4j2.formatMsgNoLookups=true" exec bin/pulsar zookeeper;

    Limits:
      cpu:     500m
      memory:  512Mi
    Requests:
      cpu:      100m
      memory:   256Mi
    Liveness:   exec [timeout 30 bash -c echo ruok | nc -q 1 localhost 2181 | grep imok] delay=20s timeout=30s period=30s #success=1 #failure=10
    Readiness:  exec [timeout 30 bash -c echo ruok | nc -q 1 localhost 2181 | grep imok] delay=20s timeout=30s period=30s #success=1 #failure=10
    Environment Variables from:
      pulsar-zookeeper  ConfigMap  Optional: false
    Environment:
Copy code
      ZOOKEEPER_SERVERS:  pulsar-zookeeper-0,pulsar-zookeeper-1,pulsar-zookeeper-2
    Mounts:
      /pulsar/data from pulsar-zookeeper-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9cm6l (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  pulsar-zookeeper-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pulsar-zookeeper-data-pulsar-zookeeper-0
    ReadOnly:   false
  kube-api-access-9cm6l:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason             Age                  From                Message
  ----     ------             ----                 ----                -------
  Warning  FailedScheduling   25s (x83 over 95m)   default-scheduler   0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
  Normal   NotTriggerScaleUp  15s (x572 over 95m)  cluster-autoscaler  pod didn't trigger scale-up:
in 2 parts sorry, the message was too long
r
np, only the events below are important
Copy code
0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims
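the quickest way to see which storage class the stuck claim actually got (claim name taken from the describe output above):
Copy code
kubectl -n pulsar get pvc
kubectl -n pulsar get pvc pulsar-zookeeper-data-pulsar-zookeeper-0 -o jsonpath='{.spec.storageClassName}{"\n"}'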
r
should I change the storageClassName and see if it's fixed?
r
please show me the output of kubectl get sc first
r
Copy code
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
premium-rwo          pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   2d1h
standard (default)   kubernetes.io/gce-pd    Delete          Immediate              true                   2d1h
standard-rwo         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   2d1h
r
I see, so these are probably some remnants from previous attempts. There are pending PVCs with the wrong storage class (default)
uninstall the helm release and delete the pulsar namespace
r
going to run an uninstall
helm uninstall pulsar -n pulsar
😄 same time
r
then check if there are any persistent volumes remaining related to pulsar. If yes, delete them as well
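for example:
Copy code
kubectl get pv | grep pulsar
kubectl -n pulsar get pvc
# delete any leftovers, e.g.
# kubectl delete pv <pv-name>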
r
no more sts
➜ ansible git:(main) ✗ kubectl -n pulsar get sts
No resources found in pulsar namespace.
r
let's start from scratch, since you already have a working ansible setup
r
clean done, waiting for pods to start now
r
Pulsar namespace workload should finally look like this:
Copy code
NAME↑               PF READY RESTARTS STATUS   CPU MEM %CPU/R %CPU/L %MEM/R %MEM/L
pulsar-bookie-0     ●  1/1          0 Running    1 350      0    n/a     68    n/a
pulsar-bookie-1     ●  1/1          0 Running    1 351      0    n/a     68    n/a
pulsar-bookie-2     ●  1/1          0 Running    1 351      0    n/a     68    n/a
pulsar-broker-0     ●  1/1          0 Running   10 394      5    n/a     77    n/a
pulsar-broker-1     ●  1/1          0 Running   18 463      9    n/a     90    n/a
pulsar-recovery-0   ●  1/1          0 Running    1 205      2    n/a    320    n/a
pulsar-zookeeper-0  ●  1/1          0 Running    2 210      2    n/a     82    n/a
pulsar-zookeeper-1  ●  1/1          0 Running    5 201      5    n/a     78    n/a
pulsar-zookeeper-2  ●  1/1          0 Running    2 199      2    n/a     78    n/a
3 BookKeeper servers, 2 brokers, 3 ZK servers and one recovery pod.
I'm going to 🚬 , it will take some time. I'll be right back. 😉
I'm back. Any progress since then?
r
image.png
🤩
all running
r
great job 🎉
r
now, GoodData
hope it will be easier than pulsar 😉
r
just be careful with the yaml values files
r
All GoodData pods are running