Solved

Getting error during Init copy-extra-drivers

17 October 2022
6 replies
103 views

kentteo
New Participant
3 replies

Hi all, I am having trouble setting up my GoodData.CN deployment on our GCP infrastructure. A month ago I successfully set up an instance (GD.CN v2.0) in another region. I am following the same deployment guide, but this time with GD.CN v2.1, and I cannot figure out what went wrong.

This is how far I got: I deployed Pulsar, NGINX, and GoodData.CN. The gooddata-cn-sql-executor pod is not running; it is stuck in CrashLoopBackOff, and the failure happens in the copy-extra-driver init container. I am using BigQuery. I built my drivers image from busybox and copied the drivers into it, following exactly the same guide that worked for me last time. I ran kubectl describe on the pod, but the output does not give me much to go on. Here it is:

Name:         gooddata-cn-sql-executor-5dffdbd85b-52vmt
Namespace:    gooddata-cn
Priority:     0
Node:         <MASKED>
Start Time:   Mon, 17 Oct 2022 15:06:17 +0800
Labels:       app.kubernetes.io/component=sqlExecutor
              app.kubernetes.io/instance=gooddata-cn
              app.kubernetes.io/name=gooddata-cn
              pod-template-hash=5dffdbd85b
Annotations:  prometheus.io/path: /actuator/prometheus
              prometheus.io/port: 9101
              prometheus.io/scrape: true
Status:       Pending
IP:           <MASKED>
IPs:
  IP:           <MASKED>
Controlled By:  ReplicaSet/gooddata-cn-sql-executor-5dffdbd85b
Init Containers:
  copy-extra-driver:
    Container ID:  containerd://fcb52b9bdf0170f233410430be82fa8680e28960a93b536b3541f232d7a52960
    Image:         <MASKED>/gooddata-cn-extra-drivers:latest
    Image ID:      <MASKED>/gooddata-cn-extra-drivers@sha256:ff8b4ef16e4242dda027f3c0c4cda937ab58a36bd10abd11498e7b9ca4473bfc
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      -r
      /data/.
      /app/extra-drivers/
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 17 Oct 2022 15:19:39 +0800
      Finished:     Mon, 17 Oct 2022 15:19:39 +0800
    Ready:          False
    Restart Count:  7
    Limits:
      cpu:                1500m
      ephemeral-storage:  300Mi
      memory:             900Mi
    Requests:
      cpu:                150m
      ephemeral-storage:  300Mi
      memory:             600Mi
    Environment:          <none>
    Mounts:
      /app/extra-drivers from drivers (rw)
  check-postgres-db:
    Container ID:  
    Image:         gooddata/tools:2.1.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
    Args:
      until pg_isready; do sleep 2; done; if [ "$(psql -Atq -c "select 1 from pg_database where datname = 'execution'")" != "1" ] ; then
        createdb execution ;
      fi ;

    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:                1500m
      ephemeral-storage:  300Mi
      memory:             900Mi
    Requests:
      cpu:                150m
      ephemeral-storage:  300Mi
      memory:             600Mi
    Environment:
      PGHOST:              <MASKED>
      PGPORT:              5432
      PGUSER:              postgres
      PGDATABASE:          postgres
      PGPASSWORD:          <set to the key 'postgresql-password' in secret 'gooddata-cn-postgres-password'>  Optional: false
      SQLEXEC_PGPASSWORD:  <set to the key 'postgresql-password' in secret 'gooddata-cn-postgres-password'>  Optional: false
    Mounts:                <none>
Containers:
  sql-executor:
    Container ID:   
    Image:          gooddata/sql-executor:2.1.0
    Image ID:       
    Ports:          6570/TCP, 9101/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:                1500m
      ephemeral-storage:  300Mi
      memory:             900Mi
    Requests:
      cpu:                150m
      ephemeral-storage:  300Mi
      memory:             600Mi
    Liveness:             http-get http://:9101/actuator/health/liveness delay=30s timeout=5s period=10s #success=1 #failure=5
    Readiness:            http-get http://:9101/actuator/health/readiness delay=30s timeout=10s period=10s #success=1 #failure=5
    Startup:              http-get http://:9101/actuator/health/liveness delay=30s timeout=5s period=10s #success=1 #failure=12
    Environment:
      JDK_JAVA_OPTIONS:                               -XX:+ExitOnOutOfMemoryError
      NODE_IP:                                         (v1:status.hostIP)
      POD_NAME:                                       gooddata-cn-sql-executor-5dffdbd85b-52vmt (v1:metadata.name)
      NAMESPACE:                                      gooddata-cn (v1:metadata.namespace)
      LOGGING_APPENDER:                               json
      SPRING_MAIN_BANNER_MODE:                        off
      SPRING_CONFIG_ADDITIONAL_LOCATION:              classpath:git.properties
      SPRING_ZIPKIN_ENABLED:                          false
      ZIPKIN_HOST:                                    jaeger-collector.monitoring
      ZIPKIN_PORT:                                    9411
      PULSAR_SERVICEURL:                              pulsar://pulsar-broker.pulsar:6650
      PULSAR_ADMINURL:                                http://pulsar-broker.pulsar:8080
      PULSAR_CONSUMERS_SELECT_TOPIC:                  gooddata-cn/gooddata-cn/sql.select
      PULSAR_CONSUMERS_SELECT_DEAD_LETTER_TOPIC:      gooddata-cn/gooddata-cn/sql.select.DLQ
      PULSAR_CONSUMERS_DATA_SOURCE_CHANGE_TOPIC:      gooddata-cn/gooddata-cn/data-source.change
      PULSAR_CONSUMERS_CACHES_GARBAGE_COLLECT_TOPIC:  gooddata-cn/gooddata-cn/caches.garbage-collect
      GRPC_RAWCACHE_HOST:                             gooddata-cn-result-cache-headless
      GRPC_RAWCACHE_PORT:                             6567
      GRPC_LICENSE_HOST:                              gooddata-cn-auth-service-headless
      GRPC_LICENSE_PORT:                              6573
      GRPC_DATASOURCE_HOST:                           gooddata-cn-metadata-api-headless
      GRPC_DATASOURCE_PORT:                           6572
      SPRING_DATASOURCE_URL:                          jdbc:postgresql://<MASKED>:5432/execution
      BANNED_JDBC_URLS:                               jdbc:postgresql://<MASKED>:5432/dex
                                                      jdbc:postgresql://<MASKED>:5432/md?reWriteBatchedInserts=true
      SPRING_DATASOURCE_USERNAME:                     postgres
      SPRING_DATASOURCE_PASSWORD:                     <set to the key 'postgresql-password' in secret 'gooddata-cn-postgres-password'>  Optional: false
      LOG4J_ASYNC_LOGGER_RING_BUFFER_SIZE:            262144
      GDC_TELEMETRY_ENABLED:                          true
      GDC_TELEMETRY_SITE_ID:                          2
      LIMIT_MAX_RESULT_RAW_BYTES:                     100000000
      GRPC_SERVER_MAX_CONNECTION_AGE:                 300
      GRPC_SERVER_PERMIT_KEEP_ALIVE_TIME:             25
      GRPC_SERVER_PERMIT_KEEP_ALIVE_WITHOUT_CALLS:    true
    Mounts:
      /app/extra-drivers from drivers (rw)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  drivers:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  14m                  default-scheduler  Successfully assigned gooddata-cn/gooddata-cn-sql-executor-5dffdbd85b-52vmt to gke-<MASKED>-default-pool-dc2c3081-6dn5
  Normal   Pulled     13m                  kubelet            Successfully pulled image "<MASKED>/gd-cn/gooddata-cn-extra-drivers:latest" in 42.46204614s
  Normal   Pulled     12m                  kubelet            Successfully pulled image "<MASKED>/gd-cn/gooddata-cn-extra-drivers:latest" in 1m13.368103313s
  Normal   Pulled     12m                  kubelet            Successfully pulled image "<MASKED>/gd-cn/gooddata-cn-extra-drivers:latest" in 1.740437329s
  Normal   Created    11m (x4 over 13m)    kubelet            Created container copy-extra-driver
  Normal   Pulled     11m                  kubelet            Successfully pulled image "<MASKED>/gd-cn/gooddata-cn-extra-drivers:latest" in 1.619845639s
  Normal   Started    11m (x4 over 13m)    kubelet            Started container copy-extra-driver
  Normal   Pulled     10m                  kubelet            Successfully pulled image "<MASKED>/gd-cn/gooddata-cn-extra-drivers:latest" in 1.856558262s
  Normal   Pulling    9m6s (x6 over 14m)   kubelet            Pulling image "<MASKED>/gd-cn/gooddata-cn-extra-drivers:latest"
  Warning  BackOff    4m7s (x38 over 12m)  kubelet            Back-off restarting failed container

Best answer by Robert Moucha 17 October 2022, 15:39

Jan Rehanek
GoodData Employee
41 replies
17 October 2022

Hello,

From the kubectl describe output, it looks like the copy-extra-driver init container is failing: the command cp -r /data/. /app/extra-drivers/ exits with an error.

Because the target volume /app/extra-drivers/ appears to be mounted, my guess is that /data/ does not exist in the gooddata-cn-extra-drivers image.

Could you provide the output of the following command to confirm this?

kubectl logs -n gooddata-cn gooddata-cn-sql-executor-5dffdbd85b-52vmt -c copy-extra-driver -p

kentteo
Author
New Participant
3 replies
17 October 2022

Hello Jan,

Thanks for replying. Here is the output of the command:

exec /bin/cp: exec format error

Regards

Robert Moucha
GoodData Employee
29 replies
17 October 2022
Answer

My wild guess is that the image was built for a different CPU architecture than the one your Kubernetes cluster runs on.

Is there a chance that the image was built on amd64 while the cluster runs on arm64, or vice versa?

Please check using:

docker image inspect <MASKED>/gooddata-cn-extra-drivers@sha256:ff8b4ef16e4242dda027f3c0c4cda937ab58a36bd10abd11498e7b9ca4473bfc

and look for the "Architecture" attribute. Its value must match the nodeInfo.architecture of your Kubernetes nodes.
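
The comparison described above could be scripted roughly as follows. This is only a sketch: the helper names image_arch, node_archs and arch_matches are made up for illustration, and it assumes docker and kubectl are installed and pointed at the right registry and cluster.

```shell
#!/bin/sh
# Hypothetical helpers (not part of GoodData.CN or its Helm chart).

# Print the architecture recorded in a locally available image, e.g. "amd64".
image_arch() {
  docker image inspect --format '{{.Architecture}}' "$1"
}

# Print the distinct CPU architectures reported by the cluster nodes, one per line.
node_archs() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.nodeInfo.architecture}{"\n"}{end}' | sort -u
}

# Succeed only if the list in $2 is non-empty and every entry equals $1.
arch_matches() {
  want=$1
  [ -n "$2" ] || return 1
  for have in $2; do   # intentionally unquoted: split the list on whitespace
    [ "$have" = "$want" ] || return 1
  done
  return 0
}
```

Usage, with your own image reference in place of the placeholder: arch_matches "$(image_arch <your-image>)" "$(node_archs)" && echo "architectures match" || echo "architecture mismatch"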

There are ways to build a multi-arch image, but the build process is more complex.

EDIT: Currently we support only amd64 builds because Apache Pulsar doesn’t offer arm64 images yet. So your Kubernetes cluster must also run on the amd64 CPU architecture, and custom driver images must be built for amd64 as well.
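
If the build machine is not amd64 (an Apple Silicon laptop, for example), one possible way to get an amd64 image is to request the target platform explicitly. The sketch below assumes a Docker version whose build command supports --platform; build_amd64 is a hypothetical helper, and the DOCKER variable exists only so the command can be dry-run by setting it to echo.

```shell
#!/bin/sh
# Hypothetical helper: build the drivers image for linux/amd64 regardless of
# the architecture of the build machine.
DOCKER=${DOCKER:-docker}

# $1 = image name and tag (placeholder, choose your own)
# $2 = build context directory, defaults to the current directory
build_amd64() {
  "$DOCKER" build --platform linux/amd64 -t "$1" "${2:-.}"
}
```

After building, docker image inspect on the new image should report "Architecture": "amd64" before you push it to your registry.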

kentteo
Author
New Participant
3 replies
18 October 2022

Hi Robert,

Good guess; yes, I think that is very likely the issue. This time I built the image on an Apple Silicon CPU, whereas last time I did the setup on an Intel CPU. Good catch, thank you. I will test it out today.

Best

kentteo
Author
New Participant
3 replies
18 October 2022

Following up: I rebuilt my image on an Intel machine, but that didn’t solve the problem. I then compared the Docker image I built last month with the one I built this week and noticed that the Env and Cmd sections in the new image are null. I am not sure what is causing this. I got my deployment working by reusing my old Docker image. That resolves my issue for now, but I hope we can find a fix for this.

Robert Moucha
GoodData Employee
29 replies
20 October 2022

The contents of ContainerConfig make no difference. That section simply shows the configuration of the container used during the image build. The older image was created in a regular Docker environment, while the newer one was created with the BuildKit feature turned on. You can get the same result as with the older image by building with BuildKit disabled: DOCKER_BUILDKIT=0 docker build -t myimage .

What matters is the structure of the drivers within the image itself. Every driver must be stored in its own subdirectory of /data, for example (the contents will differ depending on the drivers you have):

/data
├── BIGQUERY
│   ├── animal-sniffer-annotations-1.20.jar
│   ├── annotations-4.1.1.4.jar
...truncated...
│   ├── third-party-licenses.txt
│   └── threetenbp-1.5.2.jar
├── DREMIO
│   ├── dremio-jdbc-driver-3.0.6-201812082352540436-1f684f9.jar
│   ├── dremio-snowflake-plugin-20.1.0.jar
│   ├── dremio-snowflake-plugin.jar
│   └── dremio-verticaarp-plugin.jar
├── DRILL
│   └── drill-jdbc-all-1.18.0.jar
├── REDSHIFT
│   └── RedshiftJDBC42-no-awssdk-1.2.50.1077.jar
└── VERTICA
    └── vertica-jdbc-10.0.1-2.jar

Please refer to https://www.gooddata.com/developers/cloud-native/doc/2.1/deploy-and-install/cloud-native/helm-chart-installation/#custom-jdbc-drivers for a step-by-step example.
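
To check the /data layout of an image before deploying it, something along these lines might help. It is a sketch under assumptions: the image contains a find binary (busybox does), every driver file is expected exactly one directory below /data as in the example tree, and the helper names list_driver_files and misplaced_files are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the layout of a drivers image.
DOCKER=${DOCKER:-docker}

# List every regular file under /data inside the image ($1 = image reference).
list_driver_files() {
  "$DOCKER" run --rm "$1" find /data -type f
}

# Read paths on stdin and print those that are NOT of the form
# /data/<SUBDIR>/<file>. No output means the layout looks as expected.
misplaced_files() {
  grep -Ev '^/data/[^/]+/[^/]+$' || true
}
```

Usage: list_driver_files <your-image> | misplaced_files. If the image was built for a different architecture than your machine, the docker run step itself may fail with the same exec format error seen in the pod.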
