# gooddata-platform
t
Yet another question: is it actually possible to trigger a force full load for our incremental loads via API? Is there a help article about that? I’d like to automate that portion if possible
m
Hi Thomas, based on your previous questions, I assume you are using ADDv2 from a database. It is indeed possible to execute an existing schedule with different parameters via API, and it is quite powerful. It is documented here. Just note that this API call executes any of the different process types (not just ADDv2), so its documentation might be a bit overwhelming. You can override most settings via the parameters you send with the API call. A simple call like this:
```
POST /gdc/projects/SERVICE_WORKSPACE_ID/schedules/SCHEDULE_ID/executions
```
with payload:
```json
{
  "execution": {
    "params": {
      "GDC_DATALOAD_SINGLE_RUN_LOAD_MODE": "FULL"
    }
  }
}
```
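If you want to script it, here is a minimal Python sketch of that call using requests. The host, workspace/schedule IDs, and Bearer-token auth are placeholder assumptions; adjust them to your domain and to however your domain authenticates:
```python
import requests

# Hypothetical values -- replace with your own domain, workspace, and schedule.
HOST = "https://secure.gooddata.com"
SERVICE_WORKSPACE_ID = "SERVICE_WORKSPACE_ID"
SCHEDULE_ID = "SCHEDULE_ID"
API_TOKEN = "your-api-token"  # assumes your domain supports Bearer token auth

resp = requests.post(
    f"{HOST}/gdc/projects/{SERVICE_WORKSPACE_ID}/schedules/{SCHEDULE_ID}/executions",
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Accept": "application/json",
    },
    # Force a full load for this single run only.
    json={
        "execution": {
            "params": {
                "GDC_DATALOAD_SINGLE_RUN_LOAD_MODE": "FULL"
            }
        }
    },
)
resp.raise_for_status()
print(resp.json())  # the response should point at the created execution
```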
This should cause all the datasets within the segment to be loaded with a full load. You can be even more granular and use the CUSTOM mode to define which datasets should be loaded and in what mode:
```json
{
  "execution": {
    "params": {
      "GDC_DATALOAD_SINGLE_RUN_LOAD_MODE": "CUSTOM",
      "GDC_DATALOAD_DATASETS": "[{\"dataset\":\"dataset.product\",\"uploadMode\":\"FULL\"},{\"dataset\":\"dataset.order\",\"uploadMode\":\"DEFAULT\"}]"
    }
  }
}
```
Note that the value of GDC_DATALOAD_DATASETS is a string containing JSON, so the double quotes inside it need to be escaped. If you do not want to execute the load for the whole segment, you can use GDC_TARGET_PROJECTS to specify which workspaces it should be executed for:
```json
{
  "execution": {
    "params": {
      "GDC_DATALOAD_SINGLE_RUN_LOAD_MODE": "CUSTOM",
      "GDC_TARGET_PROJECTS": "mfrk2ybq0wjbqtokngb8rw39qkctnrnr,m4oufjgxx1jrm1z1vrddf454aybrsdsj",
      "GDC_DATALOAD_DATASETS": "[{\"dataset\":\"dataset.product\",\"uploadMode\":\"FULL\"},{\"dataset\":\"dataset.order\",\"uploadMode\":\"DEFAULT\"}]"
    }
  }
}
```
Note that in GDC_TARGET_PROJECTS you need to use the workspace IDs (not LCM client_ids). An alternative option is to simply purge the data in your datasets as mentioned here; the next regular load should then perform a full load. The obvious disadvantage is that there is no data until that load finishes.
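If you assemble the payload in code rather than by hand, serializing the dataset list with json.dumps takes care of the escaping for you. A sketch, reusing the workspace IDs and dataset names from the examples above:
```python
import json

# Hypothetical dataset list; json.dumps produces exactly the escaped
# JSON-in-a-string that GDC_DATALOAD_DATASETS expects.
datasets = [
    {"dataset": "dataset.product", "uploadMode": "FULL"},
    {"dataset": "dataset.order", "uploadMode": "DEFAULT"},
]

payload = {
    "execution": {
        "params": {
            "GDC_DATALOAD_SINGLE_RUN_LOAD_MODE": "CUSTOM",
            # Comma-separated workspace IDs (not LCM client_ids).
            "GDC_TARGET_PROJECTS": ",".join([
                "mfrk2ybq0wjbqtokngb8rw39qkctnrnr",
                "m4oufjgxx1jrm1z1vrddf454aybrsdsj",
            ]),
            "GDC_DATALOAD_DATASETS": json.dumps(datasets),
        }
    }
}

print(json.dumps(payload, indent=2))  # ready to POST to the executions endpoint
```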
🙌 1
t
Sounds exactly like what I was looking for. We're running our pipelines via Airflow, and this could just be the final step. No need to trigger on a schedule when it can simply trigger right after the data becomes available.
👌 2
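For that Airflow setup, a minimal sketch of such a final step using the HTTP provider's SimpleHttpOperator. The gooddata_api connection and task names are hypothetical; it assumes the Airflow connection carries the GoodData host and auth header:
```python
import json

from airflow.providers.http.operators.http import SimpleHttpOperator

# Hypothetical task; assumes an Airflow connection "gooddata_api" whose host
# points at your GoodData domain and whose extras supply the auth header.
trigger_full_load = SimpleHttpOperator(
    task_id="trigger_gooddata_full_load",
    http_conn_id="gooddata_api",
    endpoint="/gdc/projects/SERVICE_WORKSPACE_ID/schedules/SCHEDULE_ID/executions",
    method="POST",
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    data=json.dumps(
        {"execution": {"params": {"GDC_DATALOAD_SINGLE_RUN_LOAD_MODE": "FULL"}}}
    ),
)

# Wire it in after whichever task lands the source data, e.g.:
# load_source_data >> trigger_full_load
```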