gtwuser

Results 9 comments of gtwuser

> @gtwuser I am not very familiar with Glue APIs. But, can you share the full stacktrace? Also, don't you need to give `connection_type` while creating dataframe? I am referring...

@nsivabalan @n3nash @umehrot2 Kindly suggest what should be done in this use case; we have been stuck on this issue for a month now. Existing column schema in the hudi...

Small update: I tried to drop the column with nulls during upsert. Scenario: 1. Drop the column with all-empty arrays during upsert and update the table with the same column name...
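The "drop the column whose values are all empty arrays" step can be illustrated with a minimal pure-Python sketch. This is only an illustration of the selection logic: the original job used Spark DataFrames, and `empty_array_columns` is a hypothetical helper name, not from the comment.

```python
def empty_array_columns(rows):
    """Return column names whose value is None or an empty list in every row.

    Illustrative stand-in for the Spark logic: in the real job one would
    drop these columns from the DataFrame before the Hudi upsert.
    """
    if not rows:
        return []
    columns = rows[0].keys()
    return [
        col for col in columns
        if all(row.get(col) in (None, []) for row in rows)
    ]

rows = [
    {'id': 1, 'tags': [], 'name': 'a'},
    {'id': 2, 'tags': None, 'name': 'b'},
]
print(empty_array_columns(rows))  # -> ['tags']
```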

Thanks for getting back, @xushiyan. AWS Glue supports Spark 3.1, but I suppose with the `Hudi 0.11.0` bundle we get Parquet upgraded to `1.12`; unfortunately we are not able to...

Even after adding `hoodie.datasource.write.reconcile.schema=true` we are still getting issues in different tables. This time the error is `Caused by: org.apache.avro.SchemaParseException: Can't redefine: list`.

Stack trace:
```bash
2022-09-30 20:08:29,172 ERROR [main]...
```
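For reference, setting that flag on a Hudi write from a Glue/PySpark job might look like the following sketch. The table name and S3 path are placeholders, not values from the original comment.

```python
# Hedged sketch: enabling schema reconciliation on a Hudi upsert.
# 'my_table' and the S3 path below are hypothetical placeholders.
hudi_options = {
    'hoodie.table.name': 'my_table',
    'hoodie.datasource.write.operation': 'upsert',
    # Ask Hudi to reconcile the incoming batch schema with the
    # existing table schema instead of failing on a mismatch.
    'hoodie.datasource.write.reconcile.schema': 'true',
}

# Applied to an existing DataFrame `df` inside the Glue job:
# df.write.format('hudi').options(**hudi_options).mode('append') \
#     .save('s3://my-bucket/hudi/my_table')
```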

Configs used:
```python
incrementalConfig = {
    'hoodie.upsert.shuffle.parallelism': 68,
    'hoodie.datasource.write.operation': 'upsert',
    'hoodie.cleaner.policy': 'KEEP_LATEST_COMMITS',
    'hoodie.cleaner.commits.retained': 10
}
partitionDataConfig = {
    'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.MultiPartKeysValueExtractor',
    'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.CustomKeyGenerator',
    'hoodie.datasource.write.partitionpath.field': 'year:SIMPLE, month:SIMPLE, day:SIMPLE, hour:SIMPLE, device_id:SIMPLE',
    'hoodie.datasource.hive_sync.partition_fields': 'year,...
```

Now we have also updated the environment.

**Environment Description**
- Hudi version: 0.11.1
- Spark version: 3.1
- Storage (HDFS/S3/GCS..): S3
- Running on Docker? (yes/no): no

@nsivabalan @yihua and @danny0405 We are facing the same issue as mentioned above with AWS Glue, using the below locking configs and just 2 writers. Actually, we observed it even with 1 writer...
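A multi-writer locking setup for Hudi on AWS is typically configured through the optimistic concurrency control options. The sketch below is an assumption about what such a config might look like (using Hudi's DynamoDB-based lock provider); the table name, partition key, and region are placeholders, not the configs from the original comment.

```python
# Hedged sketch: Hudi optimistic concurrency control with a
# DynamoDB-based lock provider. All values below are placeholders.
lockConfig = {
    # Allow multiple concurrent writers to the same table.
    'hoodie.write.concurrency.mode': 'optimistic_concurrency_control',
    # Use DynamoDB for distributed locking (requires the hudi-aws bundle).
    'hoodie.write.lock.provider':
        'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
    'hoodie.write.lock.dynamodb.table': 'hudi-lock-table',      # placeholder
    'hoodie.write.lock.dynamodb.partition_key': 'my_table',     # placeholder
    'hoodie.write.lock.dynamodb.region': 'us-east-1',           # placeholder
    # Clean up failed writes lazily so concurrent writers don't
    # roll back each other's in-flight commits.
    'hoodie.cleaner.policy.failed.writes': 'LAZY',
}
```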