Release Notes - GoodData.CN

GoodData.CN 2.1.0

Related products: GoodData.CN

Released September 8th, 2022.

New Features

Customizable Appearance

You can now customize the color schemes of your GoodData.CN deployment using the newly added Appearance settings. You can create custom themes for your dashboards and Analytical Designer, as well as color palettes for your visualizations.

The color schemes are applied to the organization as a whole by default, but you can use different themes for different workspaces via API.
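For illustration, a custom theme could be created at the organization level and then referenced by individual workspaces through the API. The endpoint path, content type, and payload below are assumptions rather than the documented contract, so follow Customize Appearance for the exact calls.

    # Illustrative sketch only - the endpoint path, content type, and payload shape are assumptions.
    # $HOST is your organization hostname and $TOKEN an API token with administrator privileges.
    curl -X POST "$HOST/api/v1/entities/themes" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/vnd.gooddata.api+json" \
      -d '{"data": {"type": "theme", "id": "dark-theme", "attributes": {"name": "Dark theme", "content": {"palette": {"primary": {"base": "#14b2e2"}}}}}}'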

Please note that this feature is currently not available in GoodData.CN Community Edition.

RN_Ptheming.png

Learn more:
Customize Appearance

Support for Filtering of Empty Attribute Values

In GoodData.CN, attribute filters now support empty (NULL) values.

You can now include or exclude these empty values from your filters in MAQL definitions or directly in Analytical Designer or Dashboards.

RN_nullvalues2.png

Learn more:
Filter Visualizations by Attributes and Dates
Relational Operators

Support for Time Zones

You can now configure time zones for your whole organization, or individual workspaces or users via API.

If not specifically configured, users inherit time zone settings from their workspace; workspaces inherit the time zone settings from their parent workspace or organization.

With correct time zones, your users can always see relevant data when they filter to, for example, Today or Last hour.
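As a rough sketch, the time zone for a single workspace could be set through a workspace settings call like the one below; the endpoint path and payload shape are assumptions, so follow Manage Time Zones for the exact API.

    # Illustrative sketch only - the settings endpoint and payload shape are assumptions; "demo" is a placeholder workspace ID.
    curl -X POST "$HOST/api/v1/entities/workspaces/demo/workspaceSettings" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/vnd.gooddata.api+json" \
      -d '{"data": {"type": "workspaceSetting", "id": "timezone", "attributes": {"content": {"value": "Europe/Prague"}}}}'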

Notes:

  • If you use the TIMESTAMPTZ data type and you are a current GoodData.CN user, scan your data sources again to detect the columns with the time zone information.
  • We recommend configuring the time zone for the organization or workspace so that you always see relevant data (the default is UTC).

Learn more:
Manage Time Zones

Order of Items in Stacked Charts

To improve the readability of charts with stacked items, we have reversed the order of the items displayed.

In bar charts, column charts, and stacked area charts, the order of the items now corresponds with the left-to-right order in the legend.

RN_reversedstacking2.png

New Entitlements API

We have expanded the previously introduced API endpoint /actions/resolveEntitlements.

Organization administrators can now get information about the current number of used workspaces and users from their license.

We have also introduced a new API endpoint /actions/collectUsage that displays how many workspaces and users currently exist in your organization. 
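For example, both endpoints can be called with a plain authenticated request; the /api/v1 prefix and HTTP method used below are assumptions, so check View Entitlements and View Usage for the exact contract.

    # Illustrative sketch only - the /api/v1 prefix, HTTP methods, and response shapes are assumptions.
    # $HOST is your organization hostname, $TOKEN an API token of an organization administrator.
    curl -H "Authorization: Bearer $TOKEN" "$HOST/api/v1/actions/resolveEntitlements"
    curl -H "Authorization: Bearer $TOKEN" "$HOST/api/v1/actions/collectUsage"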

Learn more:
View Entitlements
View Usage

New API Filter Entity

You can now filter workspace entities using the origin=ALL|PARENTS|NATIVE filter to get only the entities created in the workspace itself (NATIVE), only the entities inherited from its parent workspaces (PARENTS), or both (ALL).
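For illustration, the filter could be passed as a query parameter when listing objects of a workspace; treat the exact parameter placement below as an assumption and see Origin for details.

    # Illustrative sketch only - assumes origin is passed as a query parameter on an entities endpoint;
    # "demo" is a placeholder workspace ID.
    curl -H "Authorization: Bearer $TOKEN" \
      "$HOST/api/v1/entities/workspaces/demo/analyticalDashboards?origin=NATIVE"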

Learn more:
Origin

Configurable CSP For Organizations

Control hostname restrictions for your GoodData deployment with Content Security Policy (CSP). Your organization's CSP directives can be configured using the new API endpoint /entities/cspDirectives.
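For example, a directive could be created with a request like the following sketch; the payload shape is an assumption, only the endpoint name comes from this release note.

    # Illustrative sketch only - the payload shape is an assumption; see Enable CSP for an Organization for the exact schema.
    curl -X POST "$HOST/api/v1/entities/cspDirectives" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/vnd.gooddata.api+json" \
      -d '{"data": {"type": "cspDirective", "id": "script-src", "attributes": {"sources": ["https://cdn.example.com"]}}}'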

Learn more:
Enable CSP for an Organization

End of Beta For Permissions 

After months of testing, permissions are now a fully tested feature and no longer in beta. Our thanks go to all the early adopters.

Learn more:
Manage Permissions

Fixes and Updates

  • You can now set a custom Timeout duration when connecting to Google BigQuery, allowing you to increase the duration past the 10-second default.
  • We fixed an issue where queries that return large amounts of data, on the order of millions of rows, would fail due to cache memory issues.
  • GoodData.CN now uses Redshift driver version 2.1.0.9 and Snowflake driver version 3.13.21.

  • GoodData.CN now uses Apache Pulsar 2.9.3. Please note that this requires you to update customized-values-pulsar.yaml when upgrading to GoodData.CN 2.1. For more information, see Upgrade GoodData.CN to 2.1 below.

Get the Community Edition

Pull the GoodData.CN Community Edition to get started with the latest release:

docker pull gooddata/gooddata-cn-ce:2.1.0

Upgrade Guides

Upgrade GoodData.CN Community Edition to 2.1

If you are using a docker volume to store metadata from your GoodData.CN CE container, download the new version of the GoodData.CN CE docker image and start it with your volume. All your metadata is migrated automatically.
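For example, assuming your metadata lives in a docker volume named gd-volume mounted at /data (adjust both to match your existing setup):

    # Pull the new image and start it with your existing volume.
    # The volume name (gd-volume) and mount path (/data) are examples - use whatever you used for the previous version.
    docker pull gooddata/gooddata-cn-ce:2.1.0
    docker run -i -t -p 3000:3000 -p 5432:5432 -v gd-volume:/data gooddata/gooddata-cn-ce:2.1.0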

Upgrade GoodData.CN to 2.1

Preload the updated Custom Resource Definition (CRD) for Organization resources. Due to a limitation of the helm command, Helm cannot update the CRD automatically, so you need to update it manually by following this procedure:

  1. Download and extract gooddata-cn helm chart to an empty directory on local disk.

    helm pull gooddata/gooddata-cn --version 2.1.0 --untar
  2. Update modified CRD in the cluster where the older GoodData.CN release is deployed:

    kubectl apply -f gooddata-cn/crds/organization-crd.yaml
  3. You can remove the locally extracted helm chart:

    rm -r gooddata-cn
  4. The new version of Apache Pulsar in GoodData.CN 2.1 requires that you update customized-values-pulsar.yaml. Add the following two lines to the Pulsar broker configuration:

    systemTopicEnabled: "true" 
    topicLevelPoliciesEnabled: "true"

    Note that you may need to restart the broker pod after making this change, or just uncomment the restartPodsOnConfigMapChange: true line.

    See Use Customized values.yaml for Pulsar, which contains an updated version of customized-values-pulsar.yaml with these changes already integrated.

  5. If you are using the embedded PostgreSQL database, refer to the Upgrade Postgresql-ha Helm Chart section below for instructions on how to upgrade it.
  6. Rescan your data sources in replace mode to create an up-to-date version of the physical data model and to avoid unwanted errors when regenerating the logical data model.

Upgrade Postgresql-ha Helm Chart

If you are using an external PostgreSQL database (you deployed the gooddata-cn helm chart with deployPostgresHA: false), you can skip this section; it is not relevant for your deployment.

If you are using the embedded PostgreSQL database (deployPostgresHA: true set in values.yaml), perform the following process to upgrade the postgresql-ha helm chart from version 8.6.13 to 9.1.5. The upgrade includes a migration of the PostgreSQL database from version 11 to 14. Due to the nature of the upgrade, this action causes a service disruption, so please schedule a maintenance window for this operation.

Important Notes:

  • All commands must end successfully. Any error must be addressed properly to avoid data loss. Familiarize yourself with the procedure on a test environment before applying it to production.
  • Always start the upgrade in an empty folder. The procedure creates helper files for rollback purposes; if two upgrade procedures are executed from the same folder, the helper files from the two environments mix up and rollback is not possible without manual intervention.
  • The procedure expects kubectl version 1.23 or later because of the --retries parameter of the cp subcommand. It is possible to execute the procedure with an older version: remove --retries and double-check that all data was transferred completely.

Steps:

  1. If you have not done so already, create a /tmp/organization-layout.json JSON dump from your GoodData.CN 2.0.x organization layout. See Back Up the Organization.
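    A minimal sketch, assuming the declarative layout API is exposed under /api/v1/layout and that $HOST and $TOKEN hold your organization hostname and an administrator's API token (see Back Up the Organization for the exact call):

    # assumption: layout API path; adjust to what the Back Up the Organization guide specifies
    curl -H "Authorization: Bearer $TOKEN" "$HOST/api/v1/layout/organization" > /tmp/organization-layout.json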

  2. Set up your shell environment. Make sure you adjust the following values:

    # namespace GoodData.CN is deployed to
    export NAMESPACE=gooddata-cn
    # name of helm release used to deploy GoodData.CN
    export HELM_RELEASE=release-name
    # path to helm values file used to deploy GoodData.CN
    export HELM_VALUES_FILE="values-gooddata-cn.yaml"
    # postgres name as specified in values.yaml postgresql-ha.nameOverride
    export PG_NAME=db
    # PG-HA admin user as defined in values.yaml postgresql-ha.postgresql.username
    export PGUSER=postgres
    # PG-HA admin user password as defined in values.yaml postgresql-ha.postgresql.password
    export PGPASSWORD=$(cat pg_password.txt)
    export PGHOST=${HELM_RELEASE}-${PG_NAME}-postgresql-0
    # helm release name of temporary PG in destination version started to execute pg_dump
    export TMP_PGDUMP_RELEASE=tmp-pg-dump
    export TMP_PGDUMP_POD=${TMP_PGDUMP_RELEASE}-postgresql-0
    # location of dumps in temporary container
    export DUMP_LOCATION=/bitnami/postgresql/dumps
  3. Disable access to your GoodData.CN application:

    helm upgrade --namespace $NAMESPACE --version 2.0.1 \
    --wait --timeout 7m -f $HELM_VALUES_FILE \
    --set metadataApi.replicaCount=0 \
    --set sqlExecutor.replicaCount=0 \
    --set dex.replicaCount=0 \
    $HELM_RELEASE gooddata/gooddata-cn

    Note that once the command finishes, users will see an Internal Server Error message when trying to access any organization in the deployment.

  4. Deploy a temporary container to dump your data into:

    cat << EOT > /tmp/values-${TMP_PGDUMP_RELEASE}.yaml
    auth:
      postgresPassword: dumpdata
    primary:
      persistence:
        enabled: false
    readReplicas:
      replicaCount: 0
    EOT

    helm upgrade --install --namespace $NAMESPACE --version 11.6.6 \
    --wait --timeout 2m --values /tmp/values-${TMP_PGDUMP_RELEASE}.yaml \
    ${TMP_PGDUMP_RELEASE} bitnami/postgresql
  5. Enable network access between gooddata-cn-db-pgpool and tmp-pg-dump-postgresql:
    cat << EOT > /tmp/network-access-tmp-pg-2-prod-pg.yaml
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      namespace: $NAMESPACE
      name: $TMP_PGDUMP_RELEASE-ingress
    spec:
      ingress:
        - from:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: $NAMESPACE
              podSelector:
                matchLabels:
                  app.kubernetes.io/instance: $TMP_PGDUMP_RELEASE
                  app.kubernetes.io/name: postgresql
          ports:
            - port: 5432
              protocol: TCP
      podSelector:
        matchLabels:
          app.kubernetes.io/component: postgresql
          app.kubernetes.io/instance: $NAMESPACE
          app.kubernetes.io/name: $PG_NAME
      policyTypes:
        - Ingress
    EOT

    kubectl apply -f /tmp/network-access-tmp-pg-2-prod-pg.yaml
  6. List available databases in the postgres-ha deployment:

    kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\l"

    Pick the databases to be preserved. Use their names as input for the USER_DBS_TO_TRANSFER variable in the next step. Note that:

    • the databases md, execution and dex are always preserved
    • the databases template0 and template1 must always be skipped
  7. Dump the databases that you want to preserve. Edit the USER_DBS_TO_TRANSFER and USER_ROLES_BY_PG_HA values:

    # space-separated list of user DBs to be dumped; system DBs md, execution and dex are included automatically
    export USER_DBS_TO_TRANSFER="tigerdb"
    # space-separated list of user-defined PG roles delivered by the postgresql-ha helm chart;
    # the roles will be excluded from the roles dump as they are created automatically during PG-HA ecosystem provisioning;
    # roles repmgr, postgres and executor are excluded automatically
    export USER_ROLES_BY_PG_HA=""

    cat << "EOT" > ./dump-pg-dbs.sh
    #!/bin/bash
    set -x
    set -e

    PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}')

    kubectl -n $NAMESPACE exec $TMP_PGDUMP_POD -- mkdir -p $DUMP_LOCATION
    # exclude all the roles created automatically by:
    # - PG-HA chart
    # - postgres installation
    # - GD chart
    time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "env PGPASSWORD=$PGPASSWORD \
    pg_dumpall -h $PGHOST_IP -U $PGUSER --roles-only > /tmp/0_dump_pg_roles_all.sql"
    ROLES_TO_EXCLUDE="repmgr postgres executor"
    if [[ "$USER_ROLES_BY_PG_HA" != "" ]]; then
    ROLES_TO_EXCLUDE="${ROLES_TO_EXCLUDE} ${USER_ROLES_BY_PG_HA}"
    fi
    ROLES_TO_EXCLUDE=$(echo $ROLES_TO_EXCLUDE | tr ' ' '|')
    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "grep -i -v \
    -E \"^(CREATE|ALTER) ROLE (${ROLES_TO_EXCLUDE})\" \
    /tmp/0_dump_pg_roles_all.sql | gzip > $DUMP_LOCATION/0_dump_pg_roles.sql.gz"

    # dump selected databases
    ALL_DBS_TO_TRANSFER="md execution dex ${USER_DBS_TO_TRANSFER}"
    ITER_ID=1
    for db in $ALL_DBS_TO_TRANSFER; do
    DUMP_DEST_FILE="$DUMP_LOCATION/${ITER_ID}_dump_pg_db_${db}.sql.gz"
    echo "Creating dump of DB ${db} to file ${DUMP_DEST_FILE}"
    time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "env PGPASSWORD=$PGPASSWORD \
    pg_dump -h $PGHOST_IP -U $PGUSER --quote-all-identifiers --create ${db} | gzip > $DUMP_DEST_FILE"
    ITER_ID=$(($ITER_ID+1))
    done
    EOT

    chmod 754 dump-pg-dbs.sh
    ./dump-pg-dbs.sh

    The script first dumps only the roles, which are PostgreSQL instance-level definitions. After that, the databases are dumped one by one.

    Before moving forward, verify that the temporary pod contains all the requested dumps, that is:

    • 3 dumps (files) for the md, dex and execution databases
    • 1 dump for the roles
    • 1 dump for each database specified in the USER_DBS_TO_TRANSFER variable

    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "ls -la $DUMP_LOCATION"
  8. (Optional) Download the dumps to your local machine for backup:

    Note: This step requires kubectl version 1.23 or later, which adds the --retries parameter.

    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "cd $DUMP_LOCATION/../; tar cf dumps.tar dumps/*"
    kubectl cp -n $NAMESPACE --retries=-1 $TMP_PGDUMP_POD:/bitnami/postgresql/dumps.tar ./dumps.tar

    # verify that the data was transferred completely - compare the MD5 hashes
    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- md5sum $DUMP_LOCATION/../dumps.tar
    md5sum ./dumps.tar
  9. Remove postgresql-ha:

    helm upgrade --namespace $NAMESPACE --version 2.0.1 \
    --wait --timeout 2m -f $HELM_VALUES_FILE \
    --set metadataApi.replicaCount=0 \
    --set sqlExecutor.replicaCount=0 \
    --set dex.replicaCount=0 \
    --set deployPostgresHA=false \
    $HELM_RELEASE gooddata/gooddata-cn
  10. Remove the persistent volume claims and back up the related persistent volumes:

    cat << "EOT" > ./backup-pvs.sh
    #!/bin/bash
    set -x
    set -e

    PV_NAME_BACKUP_PREFIX=pv_name_backup
    PVC_BACKUP_PREFIX=pvc_bck
    PVC_PV_LINES=$(kubectl get pvc -n $NAMESPACE --sort-by=.metadata.name \
    --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}" \
    -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.volumeName}{"\n"}{end}')

    echo "$PVC_PV_LINES" | while read -r line; do
    PVC_INSTANCE=$(echo $line | cut -f 1 -d ' ')
    PV_INSTANCE=$(echo $line | cut -f 2 -d ' ')

    # make sure PV is not deleted by k8s after PVC is removed
    echo "Setting Retain policy for PV=${PV_INSTANCE}"
    kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

    # remember PV name for rollback purposes
    PV_NAME_BACKUP_FILE="${PV_NAME_BACKUP_PREFIX}_${PV_INSTANCE}"
    echo "Creating PV name backup as $PV_NAME_BACKUP_FILE"
    touch $PV_NAME_BACKUP_FILE

    # Backup PVC definition for rollback purposes
    PVC_BCK_FILE="${PVC_BACKUP_PREFIX}_$PVC_INSTANCE.yaml"
    echo "Creating PVC backup file $PVC_BCK_FILE"
    kubectl get pvc -n $NAMESPACE $PVC_INSTANCE -o yaml > $PVC_BCK_FILE

    # delete PVC
    echo "Deleting PVC $PVC_INSTANCE"
    kubectl delete pvc -n $NAMESPACE $PVC_INSTANCE
    done
    EOT

    chmod 754 backup-pvs.sh
    ./backup-pvs.sh
  11. Upgrade the chart to 2.1.0; this includes an upgrade of the postgresql-ha chart to 9.x.x:

    helm upgrade --namespace $NAMESPACE --version 2.1.0 \
    --wait --timeout 7m -f $HELM_VALUES_FILE \
    --set metadataApi.replicaCount=0 \
    --set sqlExecutor.replicaCount=0 \
    --set dex.replicaCount=0 \
    --set deployPostgresHA=true \
    $HELM_RELEASE gooddata/gooddata-cn
  12. Verify the postgres DB version:

    kubectl exec -n $NAMESPACE $PGHOST -c postgresql -- psql --version
  13. Restore the databases from the dumps into the new PostgreSQL instance:
    cat << "EOT" > ./restore-data.sh
    #!/bin/bash
    set -x
    set -e

    PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}')
    DUMP_FILES=$(kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- ls $DUMP_LOCATION)

    for dump_file in $DUMP_FILES; do
    echo "Restoring dump ${dump_file}"
    time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -c postgresql -- bash -c "\
    gzip -cd $DUMP_LOCATION/$dump_file | \
    env PGPASSWORD=$PGPASSWORD psql -h $PGHOST_IP -U $PGUSER"
    done
    EOT

    chmod 754 restore-data.sh
    ./restore-data.sh
  14. Verify that all databases exist in the PostgreSQL instance:

    kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\l"
  15. Enable the application:

    helm upgrade --namespace $NAMESPACE --version 2.1.0 \
    --wait --timeout 7m -f $HELM_VALUES_FILE \
    $HELM_RELEASE gooddata/gooddata-cn
  16. Test the GoodData.CN deployment:

    • Is it possible to log in?
    • Do dashboard reports compute?
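    A scripted smoke test might look like the following sketch; the /api/v1/profile endpoint and the placeholder values are assumptions, and any authenticated request your users normally perform will do:

    # placeholder hostname and API token - replace with your organization's values
    export HOST="https://analytics.example.com"
    export TOKEN="<api-token>"
    # a 200 response suggests the API and the restored metadata database are reachable again
    curl -sS -o /dev/null -w "%{http_code}\n" -H "Authorization: Bearer $TOKEN" "$HOST/api/v1/profile"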
  17. Remove the network policies:

    kubectl delete -f /tmp/network-access-tmp-pg-2-prod-pg.yaml
  18. Remove the temporary PG deployed to create the DB dumps:

    Warning: This operation drops all pg_dump backups. If you skipped step 8, the backups will be lost.

    helm uninstall --namespace $NAMESPACE $TMP_PGDUMP_RELEASE

Rollback

In case something goes wrong, you can use the persistent volumes backed up in step 10 of the upgrade procedure to roll back to GoodData.CN 2.0.1.

Steps:

  1. Disable access to the application and remove postgresql-ha:

    helm upgrade --namespace $NAMESPACE --version 2.1.0 \
    --wait --timeout 2m -f $HELM_VALUES_FILE \
    --set metadataApi.replicaCount=0 \
    --set sqlExecutor.replicaCount=0 \
    --set dex.replicaCount=0 \
    --set deployPostgresHA=false \
    $HELM_RELEASE gooddata/gooddata-cn
  2. Remove the PVCs and back up the new PVs:

    cat << "EOT" > ./remove-new-pvc.sh
    #!/bin/bash
    set -x
    set -e

    PV_NAME_BACKUP_PREFIX=pv14_name_backup
    PVC_BACKUP_PREFIX=pvc14_bck
    PVC_PV_LINES=$(kubectl get pvc -n $NAMESPACE --sort-by=.metadata.name \
    --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}" \
    -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.volumeName}{"\n"}{end}')

    echo "$PVC_PV_LINES" | while read -r line; do
    PVC_INSTANCE=$(echo $line | cut -f 1 -d ' ')
    PV_INSTANCE=$(echo $line | cut -f 2 -d ' ')

    # make sure PV is not deleted by k8s after PVC is removed
    echo "Setting Retain policy for PV=${PV_INSTANCE}"
    kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    PV_NAME_BACKUP_FILE="${PV_NAME_BACKUP_PREFIX}_${PV_INSTANCE}"

    # remember PV name for possible investigation purposes
    echo "Creating PV name backup as $PV_NAME_BACKUP_FILE"
    touch $PV_NAME_BACKUP_FILE

    # Backup PVC definition for rollback purposes
    PVC_BCK_FILE="${PVC_BACKUP_PREFIX}_$PVC_INSTANCE.yaml"
    echo "Creating PVC backup file $PVC_BCK_FILE"
    kubectl get pvc -n $NAMESPACE $PVC_INSTANCE -o yaml > $PVC_BCK_FILE

    # delete PVC
    echo "Deleting PVC $PVC_INSTANCE"
    kubectl delete pvc -n $NAMESPACE $PVC_INSTANCE
    done
    EOT

    chmod 754 remove-new-pvc.sh
    ./remove-new-pvc.sh
  3. Restore the original PVCs and bind them to the PV backups:

     cat << "EOT" > ./restore-orig-pvc.sh
    #!/bin/bash
    set -x
    set -e

    PV_NAME_BACKUP_PREFIX=pv_name_backup
    PVC_BACKUP_PREFIX=pvc_bck

    # prepare original PVs to be joined by PVCs
    for bck_pv in $(ls ${PV_NAME_BACKUP_PREFIX}*); do
    PV_INSTANCE=${bck_pv#"${PV_NAME_BACKUP_PREFIX}_"}
    echo "Making PV ${PV_INSTANCE} available"
    kubectl patch pv "$PV_INSTANCE" --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
    echo "Setting reclaim policy for PV=${PV_INSTANCE} to Delete"
    kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
    done

    # restore original PVCs
    for bck_pvc in $(ls ${PVC_BACKUP_PREFIX}*); do
     PVC_INSTANCE=${bck_pvc#"${PVC_BACKUP_PREFIX}_"}
    echo "Creating PVC ${PVC_INSTANCE}"
    kubectl create -f $bck_pvc
    done
    EOT

    chmod 754 restore-orig-pvc.sh
    ./restore-orig-pvc.sh
  4. Install the application in the original version:

    helm upgrade --namespace $NAMESPACE --version 2.0.1 \
    --wait --timeout 2m -f $HELM_VALUES_FILE \
    --set metadataApi.replicaCount=0 \
    --set sqlExecutor.replicaCount=0 \
    --set dex.replicaCount=0 \
    --set deployPostgresHA=true \
    $HELM_RELEASE gooddata/gooddata-cn
  5. Verify the postgres DB version:

    kubectl exec -n $NAMESPACE $PGHOST -c postgresql -- psql --version
  6. Enable the application:

    helm upgrade --namespace $NAMESPACE --version 2.0.1 \
    --wait --timeout 7m -f $HELM_VALUES_FILE \
    $HELM_RELEASE gooddata/gooddata-cn
  7. Test the GoodData.CN deployment after the rollback:

    • Is it possible to log in?
    • Do visualizations on dashboards compute?