
Enable Writer to ingest array values

Open takuti opened this issue 6 years ago • 3 comments

Currently, every array value in a DataFrame is converted into a string on write, which is not ideal.
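To illustrate why stringification loses information, here is a minimal sketch in plain Python. The helper name `stringify_cell` is hypothetical and only approximates the idea of casting a cell to a string; it is not pytd's actual conversion code.

```python
def stringify_cell(value):
    """Hypothetical stand-in for a writer that casts every cell to str."""
    return str(value)


cell = ["a", "b,c"]
stored = stringify_cell(cell)

# The stored value is a single string, not a list of two elements,
# so the array structure is gone on the table side.
print(type(stored).__name__)  # str
print(stored)                 # ['a', 'b,c']
```

Once the value is stored this way, queries see one string column and cannot use array functions on it without parsing the text back.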

takuti avatar Jul 22 '19 20:07 takuti

The most challenging part of this topic is InsertIntoWriter, since it requires carefully escaping the quotes in array elements.
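As a rough sketch of the escaping involved, the snippet below builds an `ARRAY[...]` literal by doubling single quotes inside string elements, which is the standard SQL way to escape them. The helper names `to_sql_literal` and `to_array_literal` are hypothetical and are not part of pytd; special floats such as NaN and nested arrays are assumed to be out of scope.

```python
from typing import Sequence, Union

Scalar = Union[str, int, float, bool, None]


def to_sql_literal(value: Scalar) -> str:
    """Render one scalar as a SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    # bool is a subclass of int, so it must be checked first.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    # Escape single quotes by doubling them, then wrap in quotes.
    return "'" + str(value).replace("'", "''") + "'"


def to_array_literal(values: Sequence[Scalar]) -> str:
    """Render a flat Python sequence as an ARRAY[...] literal."""
    return "ARRAY[" + ", ".join(to_sql_literal(v) for v in values) + "]"


print(to_array_literal(["it's", "a,b", None, 3]))
```

Every element has to go through this kind of per-type handling, and a real implementation would need to cover more types than this sketch does.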

To keep things simple, it might be okay to support array columns only in BulkImportWriter and SparkWriter at first, because these writers load the DataFrame / CSV file directly and the data format takes care of type handling automatically. This does make the behavior of BulkImportWriter / SparkWriter inconsistent with InsertIntoWriter, though.

cc: @chezou

takuti avatar Nov 08 '19 03:11 takuti

Or, we could have a way to update the schema after uploading the DataFrame.

chezou avatar Nov 08 '19 05:11 chezou

Start implementation for BulkImportWriter.

Concerns for other writers are:

  • InsertIntoWriter requires a lot of SQL escaping
  • Introducing list handling in SparkWriter requires handling ArrayType, which can be confusing for users unfamiliar with Spark.

chezou avatar Jan 29 '20 05:01 chezou