
[SUPPORT] Error when selecting RT table on AWS Athena (0.14.1) - with custom Payload class

Open • Hfal91 opened this issue 1 year ago • 1 comment

Env: AWS EMR on EKS 7.1
Hudi version: 0.14.1
Athena engine: v3
Table mode: MOR

This happens ONLY if I use a custom Payload class. When the RT and RO tables aren't synced, selecting from the RT table in Athena returns the error below:

GENERIC_INTERNAL_ERROR: Exception when constructing record reader This query ran against the "xxx" database, unless qualified by the query. Please post the error message on our forum

Initially, I thought I was doing something wrong in my custom code, but then I created an empty class that just extends the default OverwriteWithLatestAvroPayload, and I still get the error. So at this point it seems that the mere fact of using a custom payload class is causing the error (?)

I'm attaching the main code that I'm using, as well as the "empty" custom payload class: main_code_attach.txt custom_java.txt

To reproduce, run the job once with BATCH 1 (commented in the main file), then run it a second time with BATCH 2. After that, querying the RT table in Athena should return the error GENERIC_INTERNAL_ERROR: Exception when constructing record reader

Hfal91 • Aug 02 '24 16:08

Additionally, my current workaround is to run compaction at each commit: 'hoodie.compact.inline.max.delta.commits':'1'

Although this makes the RT table available for SELECT, it defeats the purpose of the RT table, since with compaction on every commit the RT table is always in sync with the RO table.
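For reference, a minimal sketch of how the writer options for this workaround might look in PySpark-style Python. This is not the attached main code: the table name and the payload class name `com.example.EmptyCustomPayload` are hypothetical placeholders, and setting `hoodie.compact.inline` to `true` alongside the max-delta-commits setting is an assumption about the rest of the configuration.

```python
def build_hudi_options(table_name, payload_class, compact_every_commit=True):
    """Build Hudi writer options for a MOR table that uses a custom payload class.

    The table name and payload class passed in are placeholders, not the
    values from the attached files.
    """
    options = {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.payload.class": payload_class,
    }
    if compact_every_commit:
        # Workaround from this thread: compact after every delta commit, so
        # no un-compacted log files are left for the RT reader to merge.
        # Enabling inline compaction explicitly is an assumption here.
        options["hoodie.compact.inline"] = "true"
        options["hoodie.compact.inline.max.delta.commits"] = "1"
    return options


# Hypothetical names, for illustration only.
hudi_options = build_hudi_options("my_table", "com.example.EmptyCustomPayload")

# With a SparkSession, a DataFrame `df`, and a `base_path` available,
# the options would typically be passed like this:
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
```

Calling `build_hudi_options(..., compact_every_commit=False)` gives the configuration without the workaround, which is the setup where the RT query fails after the second batch.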

Hfal91 • Aug 02 '24 16:08