datasource_id from workbook connections does not match ids from datasources themselves
Describe the bug
After populating workbook connections, the datasource_ids in workbooks.connections.datasource_id don't appear to match with any of the ids from the datasources.id fields in our instance of Tableau Server.
It's totally possible I am being a dunce somehow or that there is some "duh" solution to this, but I can't find it. A few of us at work have tried to figure this out, but we think there might be something buggy going on.
Versions
- Tableau Server version = 3.10
- Python version = 3.9.13
- TSC library version = 0.23.4
To Reproduce
After running these two functions (inside a class, hence all the selfs):
import pandas as pd
import tableauserverclient as TSC

def populate_all_workbooks(self):
    self.all_workbooks = list(TSC.Pager(self.server.workbooks))
    # Populate all workbooks with connections
    for wb in self.all_workbooks:
        self.server.workbooks.populate_connections(wb)

# Create data frame with all values from workbook.connections
def get_workbook_connection_df(self):
    list_of_dicts = []
    for wb in self.all_workbooks:
        for connection in wb.connections:
            list_of_dicts.append({
                "id": connection.id,
                "Type": connection.connection_type,
                "ServerAddress": connection.server_address,
                "UserName": connection.username,
                "WorkbookId": wb.id,
                "DataSourceId": connection.datasource_id
            })
    return pd.DataFrame(list_of_dicts)
Results
...Then we take the datasources.id from our most used datasource on our server instance (whose id I got directly in the XML with Postman, but also shows up querying it various ways through TSC). However, the following query yields nothing, zilch, nada:
# Our class is called TM
TM.populate_all_workbooks()  # fills TM.all_workbooks in place; the method returns None
workbook_connection_df = TM.get_workbook_connection_df()
# This yields no results. We did this many different ways, but this is the most concise version of the query
workbook_connection_df.query("DataSourceId == 'f78d82c3-c11f-49ce-ad8c-41b6e7d0990f'")
To be sure, we have looked for tons and tons of datasources.id's (in many different ways), and NONE of them are showing up in the list of workbooks.connections.datasource_id's that we have.
In fact, we created a dataframe of datasources similar to how we created the workbook_connections_df above, to wit:
def get_datasource_df(self):
    list_of_dicts = []
    for ds in self.all_datasources:
        list_of_dicts.append({
            "id": ds.id,
            "Name": ds.name,
            "Type": ds.datasource_type
        })
    return pd.DataFrame(list_of_dicts)
...and none of the thousands of id's generated above are present when compared to the datasource_id values from workbook.connections.
We have tried doing this a ton of different ways, editing our classes, scripts, putting the values into SQL and querying/joining tables that way, and nothing seems to work.
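For anyone wanting to reproduce the comparison without pandas or SQL, here is a minimal sketch of the overlap check. It normalizes case and whitespace first, so a formatting difference cannot hide a real match. The helper names and the ids in the example are hypothetical placeholders, not values from our server.

```python
def normalize_id(value):
    """Lower-case and strip an id so formatting differences cannot hide a match."""
    return str(value).strip().lower()


def id_overlap(connection_datasource_ids, datasource_ids):
    """Return the ids present in both collections after normalization."""
    left = {normalize_id(v) for v in connection_datasource_ids if v is not None}
    right = {normalize_id(v) for v in datasource_ids if v is not None}
    return left & right


# Hypothetical placeholder ids, not real server output.
connection_ids = ["AAAA-1111 ", "bbbb-2222", None]
published_ids = ["cccc-3333", "dddd-4444"]

# An empty set here means no connection datasource_id matches any datasource id.
print(id_overlap(connection_ids, published_ids))
```

With the data frames above, the call would be something like id_overlap(workbook_connection_df["DataSourceId"].dropna(), datasource_df["id"]), where datasource_df is whatever get_datasource_df() returned.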
Final TL;DR Version
The workbooks.connections.datasource_id property doesn't seem to match datasources.id.
(Or I'm doing something wrong.)
Thanks in advance for all your help!
It's not you - that relationship is very strange and the ids do not represent the same thing. There are some (ugly) workarounds described in this open issue: https://github.com/tableau/server-client-python/issues/825
Thanks for the update. Working on getting access to the Metadata API.
Having said that, are there any plans to rectify this in future releases of TSC?
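For anyone else going the Metadata API route, a rough sketch of what the lookup might look like. This assumes the Metadata API is enabled on the server, that the GraphQL field names below (workbooks, upstreamDatasources, luid) match the schema on your version, and that TSC's server.metadata.query() returns the parsed JSON response as a dict. Please verify all three before relying on it. The sample response is made up purely to show the shape.

```python
# GraphQL query: for each workbook, list the published datasources it uses.
# Field names are an assumption; check them against your server's schema.
WORKBOOK_DATASOURCES_QUERY = """
{
  workbooks {
    luid
    name
    upstreamDatasources {
      luid
      name
    }
  }
}
"""


def flatten_workbook_datasources(response):
    """Turn the nested response into one row per (workbook, datasource) pair."""
    rows = []
    for wb in response.get("data", {}).get("workbooks", []):
        for ds in wb.get("upstreamDatasources") or []:
            rows.append({
                "WorkbookId": wb["luid"],
                "DataSourceId": ds["luid"],
                "DataSourceName": ds["name"],
            })
    return rows


# Hypothetical response with placeholder ids, only to illustrate the shape.
sample_response = {
    "data": {
        "workbooks": [
            {"luid": "wb-1", "name": "W", "upstreamDatasources": [
                {"luid": "ds-1", "name": "DS1"},
            ]},
            {"luid": "wb-2", "name": "No published sources", "upstreamDatasources": []},
        ]
    }
}

print(flatten_workbook_datasources(sample_response))
```

Against a live server the response would come from something like server.metadata.query(WORKBOOK_DATASOURCES_QUERY) after signing in, and the rows can go straight into pd.DataFrame(rows).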
Yes, I observed the same thing. The datasource id reported for a datasource attached to a workbook differs from the id of the same datasource when it is looked up by name rather than by id.
For example, say DS1 (which has 2 database connections in it) is connected to a workbook W. When we populate connections for W, we see DS1 with datasource id d1 and connection id c1. When we instead look up DS1 by name:
all_datasource_items, pagination_item = server.datasources.get()
datasource_id = [ds.id for ds in all_datasource_items if ds.name == Tableau_datasource_name]
we see that DS1 has a different datasource id, and the 2 database connections within DS1 have connection ids c2 and c3, which are different from c1. So in short, reference a datasource by its name and not by its datasource id, because the id is different.
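A minimal sketch of that name-based lookup, in case it helps. It assumes each workbook connection exposes a datasource_name attribute alongside datasource_id, and it uses SimpleNamespace stand-ins instead of real TSC objects so it runs without a server. The function names and ids are hypothetical. Since datasource names are not guaranteed to be unique (for example across projects), each name maps to a list of ids.

```python
from collections import defaultdict
from types import SimpleNamespace


def datasource_ids_by_name(datasources):
    """Build a name -> [id, ...] index from objects that have .name and .id."""
    index = defaultdict(list)
    for ds in datasources:
        index[ds.name].append(ds.id)
    return dict(index)


def resolve_connection(connection, index):
    """Return the published datasource ids whose name matches the connection."""
    return index.get(connection.datasource_name, [])


# Stand-ins for published datasources and a workbook connection (placeholder ids).
published = [
    SimpleNamespace(id="published-ds1", name="DS1"),
    SimpleNamespace(id="published-ds2", name="DS2"),
]
connection = SimpleNamespace(id="c1", datasource_id="d1", datasource_name="DS1")

index = datasource_ids_by_name(published)
# The connection's own datasource_id ("d1") is ignored; only the name is used.
print(resolve_connection(connection, index))
```

With TSC, the first argument would be something like TSC.Pager(server.datasources), so that every page of datasources is indexed, not just the first one.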