
SNOW-649339: Allow usage of a pre-configured NAMED File Format when working with data in staged files

Open brian-kalinowski-sonos opened this issue 3 years ago • 1 comments

What is the current behavior?

Currently, all the DataFrameReader file methods (.csv, .parquet, .json, etc.) create temporary file formats from options passed as Python dictionaries.

Even if the external stage has a default (named) file format attached, the reader methods can fail if file format options are not passed to the DataFrameReader.

This also ties into the copy_into_table method.

What is the desired behavior?

Be able to pass the name of a pre-configured file format, and/or have the DataFrameReader methods use the file format attached to the external stage.

How would this improve snowflake-snowpark-python?

  • Users would not have to re-configure file format options in Snowpark; they could just use the file format/stage that's already set up in Snowflake.

  • Mix and match different file formats between stages without having to reset all the file/copy options again.

References, Other Background

https://github.com/snowflakedb/snowpark-python/blob/main/src/snowflake/snowpark/_internal/analyzer/snowflake_plan.py#L685 Snowpark could bypass this query if the user passes a named file format, or opts to use the file format configured for the selected stage.

brian-kalinowski-sonos, Aug 19 '22 22:08

We will support the file format name in the copy_into_table method soon. cc @sfc-gh-mkeller

sfc-gh-jdu, Aug 22 '22 19:08

https://github.com/snowflakedb/snowpark-python/pull/455 addressed this issue

sfc-gh-jdu, Sep 14 '22 00:09