Hi Team Hope all is well I was hoping to get some clarificat GoodData #gooddata-cn

Igor Strupinskiy

05/12/2023, 1:14 AM

Hi Team, Hope all is well. I was hoping to get some clarification around a behavior of the GoodData SDK for python. Specifically:

gd_result_table = sdk.tables.for_items(
    workspace_id=gd_workspace_id,
    items=attributes_to_query,
    filters=filters,
)

When my result table has more than 10,000 entries, the above raises an exception. I was curious if that 10,000 was a hard limit, or something I could configure? If it is a hard limit — is there a way I can configure the above query to just return the top N rows instead, where N <= 10,000?

Jan Soubusta

05/12/2023, 7:28 AM

Do you use the Community Edition (single container deployment) or the Kubernetes deployment? The limit can be configured, but the procedure differs between the two deployments: https://www.gooddata.com/developers/cloud-native/doc/cloud/deploy-and-install/cloud-native/execution-limits/

Jan Soubusta

05/12/2023, 7:29 AM

Additionally, it is possible to add a TOP(X) filter to the request you send with Python SDK. @Jan Kadlec can you share more details about how exactly this can be achieved?

Jan Kadlec

05/12/2023, 8:17 AM

Hi Igor, I recommend trying our other Python package, gooddata-pandas, which lets you work with the data as pandas data frames. With gooddata-pandas you can use the following approach. It is not exactly what you asked for, but I think it is a nice workaround: it gives you the TOP(n) or BOTTOM(n) values of a metric. If your report has no metrics and you only want to list attributes, you can create a virtual metric that counts an attribute.

from gooddata_pandas import GoodPandas
from gooddata_sdk import ObjId, RankingFilter

# Host and API token are placeholders; replace them with your own.
good_pandas = GoodPandas("http://localhost:3000", "YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz")

# Data frame factory bound to the "demo" workspace.
df_factory = good_pandas.data_frames("demo")

# Keep only the 10 rows with the highest order_amount.
df = df_factory.for_items(
    items=dict(
        reg="label/region",
        category="label/products.category",
        price="fact/price",
        order_amount=ObjId(type="metric", id="order_amount"),
    ),
    filter_by=RankingFilter(
        metrics=[ObjId(type="metric", id="order_amount")],
        operator="TOP",
        value=10,
        dimensionality=[],
    ),
)

Igor Strupinskiy

05/12/2023, 3:13 PM

Hi @Jan Soubusta, thank you for your response. I’m using the K8S deployment. @Jan Kadlec, thank you for the suggestion about gooddata-pandas. I initially chose gooddata-sdk because I figured it would have the most first-party support. I will take a look at gooddata-pandas and make sure it has the rest of the functionality I need. In the meantime, is there no “top” or “bottom” functionality in the gooddata-sdk package that I’m using?

Jan Kadlec

05/12/2023, 3:16 PM

You should be able to use RankingFilter in sdk.tables.for_items as well 🙂

❤️ 1

Igor Strupinskiy

05/12/2023, 3:16 PM

@Jan Kadlec just to confirm: if I use the pandas package, would I be able to apply a “TOP” operator to a query whose result set (without the “TOP”) would have contained more than 10,000 entries?

Igor Strupinskiy

05/12/2023, 3:17 PM

Or would I still get an error?

Igor Strupinskiy

05/12/2023, 3:17 PM

Also thanks for letting me know about RankingFilter, I will take a look

Jan Kadlec

05/12/2023, 3:17 PM

If you apply the filter, the error should not occur.

Igor Strupinskiy

05/12/2023, 3:18 PM

I will give it a try then! Thank you.

Jan Soubusta

05/12/2023, 3:28 PM

TOP(x) filters are even pushed down to your database, so the error definitely does not occur in this case.

Igor Strupinskiy

05/12/2023, 3:30 PM

Fantastic. And does that apply to RankingFilter for sdk.tables.for_items also?

Jan Kadlec

05/12/2023, 3:32 PM

Yes, it applies.

Jan Soubusta

05/12/2023, 3:36 PM

This should be transparent across the Python SDK libraries (gooddata-sdk, gooddata-pandas). For instance, if you see RankingFilter in one interface, it should work identically in every other interface where it is exposed. If you find any inconsistency, please report it here and we will promptly fix it 😉
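
For readers who want to try the RankingFilter route with sdk.tables.for_items, here is a minimal sketch of how the pieces from this thread might fit together. It is not taken from the thread: top_n_table and MAX_ROWS are hypothetical names, the sketch assumes that RankingFilter accepts a metric object in its metrics list and that the filters argument takes a list, and dimensionality=[] is simply copied from the gooddata-pandas example. The 10,000 value is the limit mentioned in the question and may differ on a deployment where it has been reconfigured.

```python
from typing import Any, Sequence

# Result-size limit mentioned in this thread; it is configurable per
# deployment, so treat this constant as an assumption.
MAX_ROWS = 10_000


def top_n_table(
    sdk: Any,
    workspace_id: str,
    attributes: Sequence[Any],
    metric: Any,
    n: int,
    operator: str = "TOP",
) -> Any:
    """Hypothetical helper: request only the TOP/BOTTOM n rows ranked by `metric`."""
    if operator not in ("TOP", "BOTTOM"):
        raise ValueError(f"operator must be 'TOP' or 'BOTTOM', got {operator!r}")
    if not 1 <= n <= MAX_ROWS:
        raise ValueError(f"n must be between 1 and {MAX_ROWS}, got {n}")

    # Imported lazily so the argument checks above run without the SDK installed.
    from gooddata_sdk import RankingFilter

    ranking = RankingFilter(
        metrics=[metric],
        operator=operator,
        value=n,
        dimensionality=[],  # same value as in the gooddata-pandas example
    )
    # The ranked metric is requested alongside the attributes, and the
    # ranking filter limits how many rows come back.
    return sdk.tables.for_items(
        workspace_id=workspace_id,
        items=[*attributes, metric],
        filters=[ranking],
    )
```

The helper checks n on the client side only to fail early with a clear message. The actual row limiting is done by the ranking filter, which according to the replies above is pushed down to the database.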
