Jon Vexler issues

Results: 43 issues of Jon Vexler

[HUDI-4734] Deltastreamer table config change validation

### Change Logs Table configuration properties were not being validated in deltastreamer, so changes to properties that should not change were allowed. To fix this, we now call the validator...

priority:critical

deltastreamer
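The validation idea described in HUDI-4734 above can be pictured with a minimal sketch. This is not Hudi's validator: the function name, the exception class, and the property keys below are all hypothetical, chosen only to show the general pattern of rejecting changes to properties that must stay fixed once a table exists.

```python
# Hypothetical illustration only: these keys and names are NOT Hudi's API.
IMMUTABLE_KEYS = {"table.name", "table.type", "record.key.fields"}


class ConfigChangeError(ValueError):
    """Raised when an incoming config changes an immutable table property."""


def validate_table_config(existing, incoming, immutable_keys=IMMUTABLE_KEYS):
    """Compare incoming properties with the stored table config.

    Only keys present in both mappings are compared; a differing value for
    any immutable key is collected and reported in a single error.
    """
    conflicts = {
        key: (existing[key], incoming[key])
        for key in sorted(immutable_keys)
        if key in existing and key in incoming and existing[key] != incoming[key]
    }
    if conflicts:
        details = ", ".join(
            f"{key}: {old!r} -> {new!r}" for key, (old, new) in conflicts.items()
        )
        raise ConfigChangeError(f"Immutable table properties changed: {details}")
```

Calling `validate_table_config` before each write would surface an unintended change as an error rather than silently accepting it, which is the behavior the PR summary describes as previously missing.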

[MINOR] Fixes to make unit tests work on m1

### Change Logs An older version of snappy that is incompatible with M1 was used by a few dependencies. To fix this, we set the Spark version to 2.4.8 and...

[WIP] [HUDI-6787] Implement the HoodieFileGroupReader API for Hive

### Change Logs Replace the existing Hive read logic with the filegroup reader. HoodieFileGroupReader is the generic implementation of a filegroup reader that is intended to be used by all engines. I...

release-1.0.0

size:XL

release-1.0.0-beta2

[HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark

### Change Logs - Create Spark Parquet reader inside of the reader context - Eliminate parquet reader map - Eliminate parallel implementation of setting up reader - Schema.on.read for filegroup...

reader-core

release-1.0.0

[HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark

### Change Logs `iter.map` and `iterator.toScala` were not calling close, so we no longer use them. ### Impact Don't leak memory (at least here) ### Risk level (write none,...
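The kind of leak described in HUDI-7244 above, where wrapping a closable iterator in a plain map drops the `close()` call, can be shown with a small sketch. It is written in Python rather than the Scala/Java of the actual change, and `ClosableIterator` and `map_and_close` are hypothetical names used only for illustration.

```python
class ClosableIterator:
    """Hypothetical stand-in for a reader that holds a resource until closed."""

    def __init__(self, items):
        self._it = iter(items)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def close(self):
        self.closed = True


def map_and_close(source, fn):
    """Yield fn(item) for each item, closing the source when iteration ends.

    The finally block runs when the source is exhausted, when the consumer
    calls close() on this generator, or when an error propagates.
    """
    try:
        for item in source:
            yield fn(item)
    finally:
        source.close()


# A plain map() never calls close(); the wrapper does.
leaky = ClosableIterator([1, 2, 3])
leaky_result = list(map(lambda x: x * 2, leaky))

safe = ClosableIterator([1, 2, 3])
safe_result = list(map_and_close(safe, lambda x: x * 2))
```

After both loops finish, `leaky.closed` is still `False` while `safe.closed` is `True`, even though the mapped values are identical.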

[HUDI-7236] Allow MIT to change partition path when using global index

### Change Logs ExpressionPayload is necessary for MIT, but it interferes when the partition path of a record is changed. This PR fixes this feature by reading existing records with the...

priority:critical

release-0.14.1

[HUDI-7413] make schema errors better

### Change Logs Provide an explanation of what is wrong with the incoming data schema. Add tests to ensure schema validation remains stable. ### Impact Easier for users to understand what is...

[HUDI-1407] Basic python reader for Hudi

### Change Logs Works on COW tables. Tables can have clustering. ### Impact Can read Hudi tables with Arrow, which can be used with pandas, NumPy, etc. ### Risk level...

release-1.0.0

[HUDI-7386] add builder to filegroup reader

### Change Logs Add builder so we can have more params ### Impact Mistakes will be made when there are so many params for the fg reader. ### Risk level...

[HUDI-7385] Add config for custom write support for parquet row writer

### Change Logs Allow custom write support for spark parquet row writer that extends HoodieRowParquetWriteSupport. Use `org.apache.hudi.io.storage.row.HoodieRowParquetWriteSupport` to set the custom write support. ### Impact Allows for users to customize...