Hi @Shyam Ramani, let me try to help here. I believe several different concepts might be mixed up, so let me try to untangle them; I understand it can be a bit confusing.
First, the GoodData edition: there are two major editions which differ (among other things) in how they handle data loads:
• GoodData Platform - you physically load the data into its internal database via API (or other tools)
• GoodData.CN (Cloud Native) - the data is not loaded anywhere; it works directly on top of the connected supported data source
Based on the screenshots and the API you mentioned, you seem to be using the GoodData Platform. So the “Data Source Notification” article for Cloud Native does not apply to you; in fact, no article in the Cloud Native section applies to you.
So let’s talk about data loads to the GoodData Platform now.
The API you mentioned you are using (https://help.gooddata.com/doc/enterprise/en/data-integration/data-preparation-and-dis[…]ion/additional-data-load-reference/loading-data-via-rest-api) is a low-level API that loads data into a specific workspace. It can load either fully or incrementally, but it does not determine the increment for you, and it does not use the x__timestamp column at all. If you want to incrementally load only the data that has changed, you must send only the new/modified records in the CSV file.
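To make the "full vs. incremental" distinction concrete: with this low-level API the load mode is something you declare in the upload manifest you send alongside the CSV, not something derived from any column in the data. The sketch below is purely illustrative; `dataset.orders` and the column names are made-up placeholders, and real manifests carry additional per-column fields, so verify the exact shape against the linked reference for your own dataset:

```python
# Illustrative sketch only: builds the kind of upload manifest the
# low-level load API expects. Identifiers here ("dataset.orders",
# "order_id", "amount") are hypothetical placeholders.

def build_manifest(dataset_id, csv_name, columns, incremental=True):
    """Return an SLI-style upload manifest dict.

    The load mode ("FULL" vs "INCREMENTAL") is declared here, per part;
    nothing in the CSV itself (such as x__timestamp) controls it.
    """
    mode = "INCREMENTAL" if incremental else "FULL"
    return {
        "dataSetSLIManifest": {
            "dataSet": dataset_id,
            "file": csv_name,
            "parts": [
                {"columnName": col, "mode": mode}
                for col in columns
            ],
        }
    }

manifest = build_manifest("dataset.orders", "orders.csv", ["order_id", "amount"])
```

The point of the sketch is simply that incrementality is an instruction you give, while the content of the increment is entirely up to what you put in the CSV.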
From the screenshot you shared, it seems you are now indeed loading incrementally (i.e. upsert), but the uploaded data size remained the same: 371.7 kB. So I assume you told the system to load incrementally but provided ALL the data, not just the increment. As I said, this API does NOT use the x__timestamp column.
The x__timestamp column (or, for flat files, the naming convention with a timestamp in the file name) is a feature of a different upload tool: Automated Data Distribution. It is higher-level and internally uses the same API, but on top of it provides services such as automatically handling the increments, loading data into multiple workspaces, etc.
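For context, here is a sketch of how that column is typically produced on the data-preparation side, under the assumption that ADD picks the increment by comparing x__timestamp against the time of the last successful load (check the ADD documentation for the exact semantics). The function and column names other than x__timestamp are made-up examples:

```python
import csv
import io
from datetime import datetime, timezone

def write_rows_with_timestamp(rows, fieldnames, out):
    """Write rows to CSV, appending an x__timestamp column.

    Assumption (hedged): ADD uses this column to determine which rows
    belong to the increment -- the low-level REST API ignores it entirely.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    writer = csv.DictWriter(out, fieldnames=fieldnames + ["x__timestamp"])
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "x__timestamp": stamp})

buf = io.StringIO()
write_rows_with_timestamp(
    [{"order_id": "1", "amount": "10"}], ["order_id", "amount"], buf
)
```

In other words, the timestamp is stamped at export time so the loading tool can later tell fresh rows from already-loaded ones.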
So, depending on what you want to achieve and what works best for you, I would recommend one of the following:
• keep using the API, but for incremental loads make sure you provide only the new and modified records in your CSV file (and feel free to remove the x__timestamp column, since this API does not use it); also make sure your dataset has a key defined, to avoid duplicates during incremental loads
• OR explore the possibilities of Automated Data Distribution and use the x__timestamp column (or a timestamp in the file name) to handle the increments
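If you go with the first option, computing the increment yourself can be as simple as diffing the current extract against the previous one on the dataset's key. A minimal sketch (the `order_id` key column is a made-up example; your dataset's key may differ):

```python
import csv
import io

def compute_increment(previous_csv_text, current_csv_text, key):
    """Return rows of the current extract that are new or changed.

    Rows are matched on `key`; unchanged rows are dropped, so an
    incremental (upsert) load carries only the actual delta.
    """
    old = {r[key]: r for r in csv.DictReader(io.StringIO(previous_csv_text))}
    delta = []
    for row in csv.DictReader(io.StringIO(current_csv_text)):
        if old.get(row[key]) != row:  # new or modified record
            delta.append(row)
    return delta

prev = "order_id,amount\n1,10\n2,20\n"
curr = "order_id,amount\n1,10\n2,25\n3,30\n"
delta = compute_increment(prev, curr, "order_id")
# delta contains order 2 (modified) and order 3 (new); order 1 is unchanged
```

Note that a diff like this only produces inserts and updates; if records can be deleted at the source, you would need a separate mechanism to propagate deletions.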