# gooddata-cn
k
Hi, I have a parent-child (one-to-many) dataset. I am using the Python SDK to query parent and child fields; the sample code is pasted below. The for_items function returns the result as flattened rows, which means I have to rebuild the parent-child relationship in my code. Is there a way to get the result in an unflattened parent-child format?
from gooddata_sdk import GoodDataSdk, Attribute

# host, token and workspace_id are defined elsewhere
sdk = GoodDataSdk.create(host, token)
items = [
    Attribute(local_id="parent_field_P", label="parent.field_P"),
    Attribute(local_id="child_field_C", label="child.field_C"),
]
table = sdk.tables.for_items(workspace_id, items=items)
for result in table.read_all():
    print(result)
j
Hi, I think that gooddata-pandas is more suitable for your use case.
from gooddata_pandas import *
from gooddata_sdk import Attribute
good_pandas = GoodPandas(host, token)
df_factory = good_pandas.data_frames(workspace_id)
df_factory.for_items({
    "category": Attribute(local_id="abc", label="campaign_channels.category"),
    "name": Attribute(local_id="xyz", label="campaign_name"),
})

# Alternatively, skip the Attribute objects and pass label strings directly

df_factory.for_items({"category": "label/campaign_channels.category", "name": "label/campaign_name"})
Both approaches return a pandas.DataFrame.
        category                   name
0    Advertising     2015 Bamity 1AZ713
1    Advertising    2015 Bitchip 5FY971
2    Advertising    2015 Fintone 2PG648
3    Advertising  2015 Gembucket 6NG829
4    Advertising     2015 Kanlam 5YM478
..           ...                    ...
138          Web         2017 It 9XF277
139          Web  2017 Lotstring 4RF728
140          Web   2017 Tempsoft 4AZ168
141          Web    2018 Quo Lux 2CJ961
142          Web  2018 Ronstring 9LY887
k
I am getting a circular import error for gooddata_pandas. ImportError: cannot import name 'GoodPandas' from partially initialized module 'gooddata_pandas' (most likely due to a circular import)
j
What version do you use?
k
gooddata-pandas 1.3.0 gooddata-sdk 1.3.0
I just ran
pip install gooddata-pandas
j
Unfortunately, I am not able to reproduce the error. I created a new virtualenv and ran pip install gooddata-pandas. The following code did not raise any error.
>>> from gooddata_pandas import *
>>> pandas = GoodPandas("http://localhost:3000","YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz")
>>> df_factory = pandas.data_frames("demo")
Could you please send me the code that raises the error?
k
from gooddata_pandas import GoodPandas
good_pandas = GoodPandas("host", "token")
Aren't you getting "NameError: name 'GoodPandas' is not defined" for the code you shared above?
j
No.
.venv in ~/Downloads python
>>> from gooddata_pandas import GoodPandas
>>> good_pandas = GoodPandas("host", "token")
>>>
k
I also tried creating a new virtual environment. No luck. What are your gooddata-pandas and gooddata-sdk versions?
j
gooddata-afm-client      1.3.0
gooddata-api-client      1.3.0
gooddata-metadata-client 1.3.0
gooddata-pandas          1.3.0
gooddata-scan-client     1.3.0
gooddata-sdk             1.3.0
All gooddata packages
k
@Jan Kadlec the issue was related to the Python version. It's working now. Coming back to the original question, I still see the data flattened in the pandas DataFrame too. I also tried .to_dict('records') on the resulting DataFrame. I am expecting data in this format:
[{"category": "Advertising", "name": ["value1", "value2"]}, {"category": "Web", "name": ["value3", "value4"]}]
j
I am happy that it works for you now. I see what your goal is. Unfortunately, you cannot get your expected data out of the box using GoodPandas; you need to do some post-processing. Try the following (x is the DataFrame returned by for_items):
x.set_index("name").groupby("category").groups
The result is:
{'Advertising': ['2015 Bamity 1AZ713', '2015 Bitchip 5FY971', ...], 'Content': ['2015 Bitchip 5FY971', '2015 Cookley 7NR903', ...], ...}
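If you need exactly the list-of-records shape you described, a minimal sketch could look like the one below. It assumes the DataFrame has the two columns category and name as in the earlier example. The sample rows and the helper name nest_children are made up for illustration and are not part of gooddata-pandas.

```python
import pandas as pd

# Stand-in for the DataFrame returned by df_factory.for_items(...).
# These rows are illustrative sample values, not real workspace data.
df = pd.DataFrame(
    {
        "category": ["Advertising", "Advertising", "Web", "Web"],
        "name": ["value1", "value2", "value3", "value4"],
    }
)


def nest_children(frame, parent_col, child_col):
    """Collapse flattened parent/child rows into one record per parent.

    Hypothetical helper: groups by the parent column and collects the
    child values of each group into a plain Python list.
    """
    # sort=False keeps parents in the order they first appear in the frame
    grouped = frame.groupby(parent_col, sort=False)[child_col].agg(list)
    return grouped.reset_index().to_dict("records")


records = nest_children(df, "category", "name")
print(records)
```

This is plain pandas post-processing, so it should work the same on any DataFrame with a parent column and a child column, regardless of where the data came from.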