
Issue with Downloading Large Views from Tableau Server to CSV

Open abhishek7575-spec opened this issue 1 year ago • 4 comments

Describe the bug
I'm encountering an issue when downloading a view that is around 200 MB in size. My script works perfectly for smaller files (5 MB or 10 MB), but for the larger view it runs indefinitely and never writes data into the CSV file.

The script successfully logs "CSV populated," indicating that server.views.populate_csv completes, but the file is never written for larger views.

Here's the code I'm using:

def download_file(client, server, content_url, output_file, region):
    # NOTE: Monkey-patched the TSC.RequestOptions.Field class,
    # since the contentUrl filter parameter was not supported.
    setattr(TSC.RequestOptions.Field, 'ContentURL', 'contentUrl')

    req_option = TSC.RequestOptions()
    req_option.filter.add(TSC.Filter(TSC.RequestOptions.Field.ContentURL,
                                     TSC.RequestOptions.Operator.Equals,
                                     content_url))

    logger.info('Fetching views...')
    views, pagination_item = server.views.get(req_option)
    logger.info(f'Views fetched: {len(views)}')

    if len(views) == 1:
        view = views[0]
        logger.info('Populating CSV...')
        server.views.populate_csv(view, TSC.CSVRequestOptions(maxage=5))
        logger.info('CSV populated.')
        with open(output_file, 'wb') as f:
            f.write(b''.join(view.csv))
        logger.info(f'CSV data written to {output_file}')

 

OUTPUT:

INFO:main:Fetching views...
INFO:main:Views fetched: 1
INFO:main:Populating CSV...
INFO:main:CSV populated.

Could anyone suggest what might be going wrong with handling larger views? Are there any additional steps or considerations when dealing with larger files in Tableau Server that I might be missing?

Any insights or suggestions would be greatly appreciated!

Thank you!

Versions Details of your environment, including:

  • Python version: 3.9
  • TSC library version: 0.30

abhishek7575-spec avatar May 20 '24 10:05 abhishek7575-spec

Hm, not sure I've tried this. Most likely the problem is in the library here, and the request is timing out - there's no logic here to extend the session. I'll try and reproduce it to see exactly what happens.

jacalata avatar May 22 '24 22:05 jacalata

@jacalata Thank you for looking into this and for trying to reproduce the issue. Looking forward to hearing back from you.

abhishek7575-spec avatar May 23 '24 04:05 abhishek7575-spec

@abhishek7575-spec in v0.30, ContentUrl is absolutely available.

I would also consider changing the line f.write(b''.join(view.csv)). It's a nifty one-liner when the data is small enough to fit easily into RAM, but with bigger files you can run into problems, and it's harder to troubleshoot. You could do this, for example:

with open(output_file, 'wb') as f:
    for i, chunk in enumerate(view.csv):
        logger.debug("Writing chunk %s", i)
        f.write(chunk)

That gives you log output so you can see it's still working, and it avoids loading the entire file into RAM at once.
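To make that pattern concrete, here is a minimal, runnable sketch of the chunked write. `fake_csv_chunks` is a stand-in invented for this sketch in place of `view.csv`, which, after `populate_csv`, yields the export as an iterator of byte chunks; any iterable of bytes behaves the same way here:

```python
import logging
import os
import tempfile

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)


def fake_csv_chunks():
    """Simulate view.csv: an iterator of raw CSV byte chunks."""
    yield b"id,name\n"
    yield b"1,alpha\n"
    yield b"2,beta\n"


def write_csv_chunks(chunks, output_file):
    """Stream chunks straight to disk so the whole file never sits in RAM."""
    with open(output_file, "wb") as f:
        for i, chunk in enumerate(chunks):
            logger.debug("Writing chunk %s (%s bytes)", i, len(chunk))
            f.write(chunk)


path = os.path.join(tempfile.gettempdir(), "view_export.csv")
write_csv_chunks(fake_csv_chunks(), path)
print(os.path.getsize(path))  # → 23
```

With the real library you would pass `view.csv` instead of `fake_csv_chunks()`; the per-chunk debug line then tells you whether the download is still progressing or has genuinely stalled.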

jorwoods avatar May 30 '24 18:05 jorwoods

@abhishek7575-spec: I'm also facing the same issue. Were you able to get this resolved?

SenthamizhCSS avatar Sep 15 '24 08:09 SenthamizhCSS