Hello! My team and I have been experiencing a weir...
# gooddata-cn
a
Hello! My team and I have been experiencing a weird behaviour in GoodData.CN-CE. We use the docker image locally for development and test purposes and sometimes when attempting to start the service we get the following error, which prevents the container from starting:
Copy code
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
Prior to this log the startup seems normal but afterwards all services that had been started begin to shutdown. We’ve been able to solve it by creating a new data volume for the container, but this is cumbersome as it means we loose the entities we created previously and need to setup our environment from scratch. This also seems to happen rather randomly. My guess is that its related to some corruption in the volume due to a forced stop, but I would like to know if this has happened to someone else and, if so, if there is a way to overcome the problem without losing the existing data.
m
Hi Antonio, what GoodData.CN-CE version are you running? Have you already pulled the latest edition:
docker pull gooddata/gooddata-cn-ce:2.5.0
?
a
This happens both in 2.5.0 and 2.4.0, I haven’t tested other versions. The last time the container ran it was using 2.4.0
b
Hi Antonio, in the output log there should be mentioned which service exactly could not be started and possibly even some reason/error message
r
The important information about the service that actually failed to start were logged before the "s6-rc failed to properly bring all the services up" message. The may be various reasons for this failure and without the actual logs it is difficult to guess. S6 supervisor is very persevering to start all services so this error indicates serious problem, possibly corrupted data in volume, not enough disk space or something like this.
a
The logs are very long and it’s hard to isolate messages from the different services, but I was able to find the following message also from
s6-rc
that reports a warning:
s6-rc: warning: unable to start service bootstrap-org: command exited 77
. If you care to read the full log, you can also find it here:
This seems to be the critical part, lines 1398-1496:
Copy code
2023-09-07 15:27:15 s6-rc: info: service bootstrap-org: starting
2023-09-07 15:27:15 s6-rc: info: service sql-executor: starting
2023-09-07 15:27:15 s6-rc: info: service calcique: starting
2023-09-07 15:27:15 + : 80
2023-09-07 15:27:15 + : data-analytics.local.promptly.health
2023-09-07 15:27:15 + H_JSON='Content-type: application/json'
2023-09-07 15:27:15 + H_GDC_JSON='Content-type: application/vnd.gooddata.api+json'
2023-09-07 15:27:15 + FAKE_HOST='X-Forwarded-Host: data-analytics.local.promptly.health:80'
2023-09-07 15:27:15 + DEFINITION_FILE=
2023-09-07 15:27:15 + METADATA_HOST=localhost
2023-09-07 15:27:15 + METADATA_ACTUATOR_PORT=9008
2023-09-07 15:27:15 + METADATA_GRPC_PORT=6572
2023-09-07 15:27:15 + METADATA_HTTP_PORT=9007
2023-09-07 15:27:15 + AUTH_SERVICE_HOST=localhost
2023-09-07 15:27:15 + AUTH_SERVICE_ACTUATOR_PORT=9051
2023-09-07 15:27:15 + AUTH_SERVICE_HTTP_PORT=9050
2023-09-07 15:27:15 + GRPC_PASSWORD=SuperSecretPassword
2023-09-07 15:27:15 + TOKEN_SECRET=admin123
2023-09-07 15:27:15 + getopts f:m:p:g:j:a:q:r:w:U:P:T: o
2023-09-07 15:27:15 + case "${o}" in
2023-09-07 15:27:15 + DEFINITION_FILE=/opt/bootstrap/bootstrap.json
2023-09-07 15:27:15 + getopts f:m:p:g:j:a:q:r:w:U:P:T: o
2023-09-07 15:27:15 + case "${o}" in
2023-09-07 15:27:15 + USERNAME=demo@example.com
2023-09-07 15:27:15 + getopts f:m:p:g:j:a:q:r:w:U:P:T: o
2023-09-07 15:27:15 + case "${o}" in
2023-09-07 15:27:15 + PASSWORD=demo123
2023-09-07 15:27:15 + getopts f:m:p:g:j:a:q:r:w:U:P:T: o
2023-09-07 15:27:15 + case "${o}" in
2023-09-07 15:27:15 + TOKEN_SECRET=admin123
2023-09-07 15:27:15 + getopts f:m:p:g:j:a:q:r:w:U:P:T: o
2023-09-07 15:27:15 + shift 8
2023-09-07 15:27:15 + '[' -z /opt/bootstrap/bootstrap.json ']'
2023-09-07 15:27:15 + curl --output /dev/null -fs <http://localhost:9008/actuator/health/readiness>
2023-09-07 15:27:15 ++ base64
2023-09-07 15:27:15 ++ tr -d '\n'
2023-09-07 15:27:15 ++ printf bootstrap:%s SuperSecretPassword
2023-09-07 15:27:15 + AUTHORIZATION=Ym9vdHN0cmFwOlN1cGVyU2VjcmV0UGFzc3dvcmQ=
2023-09-07 15:27:15 ++ echo -n admin:bootstrap:admin123
2023-09-07 15:27:15 ++ base64
2023-09-07 15:27:15 + BOOTSTRAP_TOKEN=YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz
2023-09-07 15:27:15 ++ python3 -c 'import crypt; print(crypt.crypt('\''admin123'\'', crypt.mksalt(crypt.METHOD_SHA256)))'
2023-09-07 15:27:15 <string>:1: DeprecationWarning: 'crypt' is deprecated and slated for removal in Python 3.13
2023-09-07 15:27:15 + TOKEN_HASH='$5$FsRaPODp7gu692nI$Z318owfMB1Az5CvGAThQTm3TkZC9nNwOx21/eyMqNp5'
2023-09-07 15:27:15 + jq '.organizations[0].hostname="data-analytics.local.promptly.health" |
2023-09-07 15:27:15     .organizations[0].initialToken.cryptedToken="$5$FsRaPODp7gu692nI$Z318owfMB1Az5CvGAThQTm3TkZC9nNwOx21/eyMqNp5"' /opt/bootstrap/bootstrap.json
2023-09-07 15:27:15 + grpcurl -H 'Authorization: Basic Ym9vdHN0cmFwOlN1cGVyU2VjcmV0UGFzc3dvcmQ=' -plaintext -d @ localhost:6572 tiger.MetadataStorePrivilegedService/ensureOrganizationsExists
2023-09-07 15:27:15 ts="2023-09-07 14:27:15.628" level=ERROR msg="gRPC server call" logger=com.gooddata.tiger.metadata.grpc.MetadataStorePrivilegedGrpcService thread=DefaultDispatcher-worker-3 action=grpcServerCall spanId=5b096f8b0e245155 traceId=5b096f8b0e245155 exc="com.gooddata.data.stock.StockEntityNotExistsException: Some of given referenced 'userGroups' entities do not exist.Not existing IDs: 
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.PersistBuilderSupport.checkAllReferencedIdsExist(PersistBuilderSupport.kt:494)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.PersistBuilderSupport.selectRelatedEntityIds(PersistBuilderSupport.kt:474)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.PersistBuilderSupport.insertManyToManyRelationshipInner(PersistBuilderSupport.kt:443)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.PersistBuilderSupport.insertManyToManyRelationship(PersistBuilderSupport.kt:408)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.UpdateQueryBuilder$updateExistingEntity$7$1.invoke(UpdateQueryBuilder.kt:78)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.UpdateQueryBuilder$updateExistingEntity$7$1.invoke(UpdateQueryBuilder.kt:78)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.UpdateQueryBuilder.updateManyToManyTable(UpdateQueryBuilder.kt:181)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.UpdateQueryBuilder.updateExistingEntity(UpdateQueryBuilder.kt:78)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.UpdateQueryBuilder.buildUpdate(UpdateQueryBuilder.kt:57)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.ExposedApiObjectRepository.update(ExposedApiObjectRepository.kt:195)
2023-09-07 15:27:15 at com.gooddata.data.persistence.exposed.ExposedApiObjectRepository$$FastClassBySpringCGLIB$$7acde1ca.invoke(<generated>)
2023-09-07 15:27:15 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
2023-09-07 15:27:15 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
2023-09-07 15:27:15 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
2023-09-07 15:27:15 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
2023-09-07 15:27:15 at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:137)
2023-09-07 15:27:15 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
2023-09-07 15:27:15 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
2023-09-07 15:27:15 at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123)
2023-09-07 15:27:15 at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:388)
2023-09-07 15:27:15 at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
2023-09-07 15:27:15 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
2023-09-07 15:27:15 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
2023-09-07 15:27:15 at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:708)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.repository.exposed.organization.ExposedUserRepository$$EnhancerBySpringCGLIB$$601ef0a.update(<generated>)
2023-09-07 15:27:15 at com.gooddata.data.stock.BasicApiStock$update$2.invoke(BasicApiStock.kt:129)
2023-09-07 15:27:15 at com.gooddata.data.stock.BasicApiStock$update$2.invoke(BasicApiStock.kt:119)
2023-09-07 15:27:15 at com.gooddata.data.stock.BasicApiStock.withDispatch(BasicApiStock.kt:225)
2023-09-07 15:27:15 at com.gooddata.data.stock.BasicApiStock.update$suspendImpl(BasicApiStock.kt:119)
2023-09-07 15:27:15 at com.gooddata.data.stock.BasicApiStock.update(BasicApiStock.kt)
2023-09-07 15:27:15 at com.gooddata.data.stock.ApiStock$DefaultImpls.update$default(ApiStock.kt:144)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.service.OrganizationBootstrapService.bootstrapUser(OrganizationBootstrapService.kt:224)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.service.OrganizationBootstrapService.access$bootstrapUser(OrganizationBootstrapService.kt:41)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.service.OrganizationBootstrapService$bootstrap$2.invokeSuspend(OrganizationBootstrapService.kt:57)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.service.OrganizationBootstrapService$bootstrap$2.invoke(OrganizationBootstrapService.kt)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.service.OrganizationBootstrapService$bootstrap$2.invoke(OrganizationBootstrapService.kt)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.transaction.SuspendedTransactionKt$coTransaction$2$1.invokeSuspend(SuspendedTransaction.kt:31)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.transaction.SuspendedTransactionKt$coTransaction$2$1.invoke(SuspendedTransaction.kt)
2023-09-07 15:27:15 at com.gooddata.tiger.metadata.transaction.SuspendedTransactionKt$coTransaction$2$1.invoke(SuspendedTransaction.kt)
2023-09-07 15:27:15 at org.jetbrains.exposed.sql.transactions.experimental.SuspendedKt$suspendedTransactionAsyncInternal$1.invokeSuspend(Suspended.kt:130)
2023-09-07 15:27:15 at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
2023-09-07 15:27:15 at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
2023-09-07 15:27:15 at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
2023-09-07 15:27:15 at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:738)
2023-09-07 15:27:15 at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
2023-09-07 15:27:15 at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
2023-09-07 15:27:15 "
2023-09-07 15:27:15 ERROR:
2023-09-07 15:27:15   Code: Internal
2023-09-07 15:27:15   Message: com.gooddata.data.stock.StockEntityNotExistsException: Some of given referenced 'userGroups' entities do not exist.Not existing IDs: 
2023-09-07 15:27:15 s6-rc: warning: unable to start service bootstrap-org: command exited 77
b
yep, looks like
Copy code
2023-09-07 15:27:15 ERROR:
2023-09-07 15:27:15   Code: Internal
2023-09-07 15:27:15   Message: com.gooddata.data.stock.StockEntityNotExistsException: Some of given referenced 'userGroups' entities do not exist.Not existing IDs: 
2023-09-07 15:27:15 s6-rc: warning: unable to start service bootstrap-org: command exited 77
it's strange that it doesn't state ID, could it be that you are incorrectly referencing a userGroup with empty ID?
a
I don’t recall setting any empty userGroup, no. This also happens rather randomly once the service restarts. Is there any way I could inspect what user groups are referenced without the service being up?
b
@Robert Moucha any idea?
r
This error indicates corrupted logical structures of metadata database. The
tiger.MetadataStorePrivilegedService/ensureOrganizationsExists
gRPC call is responsible for initial creation and update of the
default
organization,
adminGroup
user group and
admin
user.
I tried to simulate locally with various use cases (like adding a new group into
userGroup
, adding
admin
into multiple groups), but I was unable to replicate your issue. If you still have the docker volume at hand, you may try to investigate and possibly fix the error. In order to get up the database running, you can do the following procedure:
Copy code
# start temporary interactive container with mounted volume
docker run -it -v YOUR-VOLUME:/data -u 102 --entrypoint bash --rm gooddata/gooddata-cn-ce:YOUR-VERSION-HERE

# inside container, fix pg data link to docker volume
rm -rf  /var/lib/postgresql/13/main && ln -nfs /data/postgresql /var/lib/postgresql/13/main

# inside container, start postgresql on background
/usr/lib/postgresql/13/bin/postgres -c config_file=/etc/postgresql/13/main/postgresql.conf &> /tmp/pg.log &

# inside container, connect to metadata database.
psql md

# Now you can query the database
md=# select api_id, organization, entity_id from "user" where api_id = 'admin';
 api_id | organization | entity_id 
--------+--------------+-----------
 admin  | default      |         1
(1 row)
Tables of interest are: •
user_group
(all user groups) •
user_group_user_group
(M:N relations between user groups) •
"user"
(all users, mind the double quotes, as user is postgres keyword) •
user_user_group
(group membership) •
api_token
(contains bootstrap token of
admin
user, but I don't expect a problem here) There are foreign keys that should hint you on the object relations.
1
a
Thank you very much! I will try these suggestions out and report back any outcomes.
@Robert Moucha Hello, apologies for the late feedback, but I was OOO for a few days. I followed your instructions and inspected the tables you suggested. To my eye, nothing abnormal seems to be happening. Here’s a log of the queries:
Copy code
postgres@8664e84deb8c:/opt$ psql md
psql (13.11 (Debian 13.11-0+deb11u1))
Type "help" for help.

md=# select api_id, organization, entity_id from "user" where api_id = 'admin';
 api_id | organization | entity_id 
--------+--------------+-----------
 admin  | default      |         1
(1 row)

md=# select * from "user_group";
                api_id                | organization | entity_id |     name     
--------------------------------------+--------------+-----------+--------------
 9b3d55b5-27bb-44c4-a5ea-1609b4d43017 | default      |         3 | End User
 f2c96228-527a-4252-8342-b666e943460f | default      |         4 | Manager
 adminGroup                           | default      |         1 | OurApp Admin ## We changed the name of the default adminGroup
(3 rows)

md=# select * from "user_group_user_group"; # No hierarchy of user groups was configured
 child_id_organization | child_id_api_id | child_id | parent_id_organization | parent_id_api_id | parent_id 
-----------------------+-----------------+----------+------------------------+------------------+-----------
(0 rows)

md=# select api_id, organization, entity_id from "user";
                          api_id                          | organization | entity_id 
----------------------------------------------------------+--------------+-----------
 admin                                                    | default      |         1
 demo                                                     | default      |         2
 d7afdb42-81a7-4317-92bc-e043961eea37                     | default      |         3
 gdc_demo_864caa58-57d8-4b67-90a5-e52ebf6c5e96_PublicUser | default      |         6
 a4809d66-e037-40d2-abc2-130b22ed0142                     | default      |         7
 a582060e-f2b3-4a7b-b86f-e393ca70ec70                     | default      |         8
(6 rows)

md=# select * from "user_user_group";
 user_id_organization |            user_id_api_id            | user_id | user_group_id_organization |         user_group_id_api_id         | user_group_id 
----------------------+--------------------------------------+---------+----------------------------+--------------------------------------+---------------
 default              | admin                                |       1 | default                    | adminGroup                           |             1
 default              | demo                                 |       2 | default                    | adminGroup                           |             1
 default              | d7afdb42-81a7-4317-92bc-e043961eea37 |       3 | default                    | adminGroup                           |             1
 default              | a4809d66-e037-40d2-abc2-130b22ed0142 |       7 | default                    | f2c96228-527a-4252-8342-b666e943460f |             4
 default              | a582060e-f2b3-4a7b-b86f-e393ca70ec70 |       8 | default                    | 9b3d55b5-27bb-44c4-a5ea-1609b4d43017 |             3
(5 rows)

md=# select api_id, organization, user, valid_to, internal from "api_token";
                             api_id                             | organization |   user   | valid_to | internal 
----------------------------------------------------------------+--------------+----------+----------+----------
 bootstrap                                                      | default      | postgres |          | f
 gdc_demo_864caa58-57d8-4b67-90a5-e52ebf6c5e96_PublicUser_token | default      | postgres |          | f
(2 rows)

md=# select api_id from "api_token";
                             api_id                             
----------------------------------------------------------------
 bootstrap
 gdc_demo_864caa58-57d8-4b67-90a5-e52ebf6c5e96_PublicUser_token
(2 rows)

md=# select api_id, entity_id, bootstrap_user_id_organization, bootstrap_user_id_api_id, bootstrap_user_id, bootstrap_user_group_id_organization, bootstrap_user_group_id_api_id from "organization";
 api_id  | entity_id | bootstrap_user_id_organization | bootstrap_user_id_api_id | bootstrap_user_id | bootstrap_user_group_id_organization | bootstrap_user_group_id_api_id 
---------+-----------+--------------------------------+--------------------------+-------------------+--------------------------------------+--------------------------------
 default |         1 | default                        | admin                    |                 1 | default                              | adminGroup
(1 row)
When running the container, I get the same error:
Copy code
2023-09-22 12:13:03 ERROR:
2023-09-22 12:13:03   Code: Internal
2023-09-22 12:13:03   Message: com.gooddata.data.stock.StockEntityNotExistsException: Some of given referenced 'userGroups' entities do not exist.Not existing IDs: 
2023-09-22 12:13:03 s6-rc: warning: unable to start service bootstrap-org: command exited 77
r
Hello António, I can see the bootstrap token (and the other as well) refer to non-existent user "postgres". If you update
api_token
table so that both rows refer to "admin" user:
Copy code
update api_token set user = "admin";
Then start the gooddata-cn-ce container with fixed volume as usual.
a
@Robert Moucha the user in that query refers to the postgres db user, when using “user”, the results are:
Copy code
md=# select api_id, organization, "user", valid_to, internal from "api_token";
           api_id           | organization | user  | valid_to | internal 
----------------------------+--------------+-------+----------+----------
 bootstrap                  | default      | admin |          | f
 workspace_PublicUser_token | default      | admin |          | f
(2 rows)
And the service still fails to start with the same error.
r
good point, I didn't noticed missing double quotes in your previous message. Good news is I managed to reproduce your issue.
I tracked down the code path based on the stack trace provided and the organization bootstrap service doesn't find your
adminGroup
user group properly. Since the only change you made was updating userGroup name, I tried to do so as well and reproduced the issue:
Copy code
ts="2023-09-28 14:53:33.303" level=ERROR msg="gRPC server call" logger=com.gooddata.tiger.metadata.grpc.MetadataStorePrivilegedGrpcService thread=DefaultDispatcher-worker-3 action=grpcServerCall spanId=b360b9ab566583b0 traceId=b360b9ab566583b0 exc="com.gooddata.data.stock.StockEntityNotExistsException: Some of given referenced 'userGroups' entities do not exist.Not existing IDs:
I will file internal bug report so the
name
attribute should be ignored in comparison. For now, please abstain from changing attributes of user "admin" and group "adminGroup". To fix the container, connect to DB as before and run:
update user_group set name = NULL where api_id = 'adminGroup';