Jon Vexler
### Change Logs Table configuration properties were not being validated in deltastreamer, so changes to properties that should not change were being allowed. To fix this, we now call the validator...
### Change Logs An older version of snappy that is incompatible with M1 was used by a few dependencies. To fix this, we set the spark version to 2.4.8 and...
### Change Logs Replace existing hive read logic with the filegroup reader. HoodieFileGroupReader is the generic implementation of a filegroup reader that is intended to be used by all engines. I...
### Change Logs - Create Spark Parquet reader inside of the reader context - Eliminate parquet reader map - Eliminate parallel implementation of setting up reader - Schema.on.read for filegroup...
### Change Logs `iter.map` and `iterator.toScala` were not calling close, so we no longer use them. ### Impact Stops leaking memory (at least here) ### Risk level (write none,...
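To illustrate the kind of leak described in this entry, here is a minimal Java sketch of a mapping wrapper that forwards `close()` to the iterator it wraps. It is not the code from the PR: `ClosableIterator`, `ListBackedIterator`, and `map` are hypothetical names made up for this example, and the real change may look quite different.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class ClosableMapSketch {
    /** Hypothetical stand-in for an iterator that owns a resource. */
    interface ClosableIterator<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close(); // narrowed so callers need not handle a checked exception
    }

    /** Toy source that records whether close() was called. */
    static final class ListBackedIterator<T> implements ClosableIterator<T> {
        private final Iterator<T> inner;
        boolean closed = false;

        ListBackedIterator(List<T> items) {
            this.inner = items.iterator();
        }

        @Override
        public boolean hasNext() {
            return inner.hasNext();
        }

        @Override
        public T next() {
            return inner.next();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Maps each element and forwards close() to the wrapped iterator. */
    static <I, O> ClosableIterator<O> map(ClosableIterator<I> source, Function<I, O> fn) {
        return new ClosableIterator<O>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public O next() {
                return fn.apply(source.next());
            }

            @Override
            public void close() {
                source.close(); // the step a plain map over the iterator would skip
            }
        };
    }

    public static void main(String[] args) {
        ListBackedIterator<Integer> source = new ListBackedIterator<>(List.of(1, 2, 3));
        int sum = 0;
        // try-with-resources closes the mapped view, which closes the source
        try (ClosableIterator<Integer> doubled = map(source, x -> x * 2)) {
            while (doubled.hasNext()) {
                sum += doubled.next();
            }
        }
        System.out.println("sum=" + sum + " closed=" + source.closed);
    }
}
```

A plain map over the underlying iterator returns an ordinary iterator with no `close()` method at all, so the resource can only be released if the caller keeps a handle to the original.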
### Change Logs ExpressionPayload is necessary for MIT, but it interferes when the partition path is changed for a record. This PR fixes that by reading existing records with the...
### Change Logs Provide an explanation of what is wrong with the incoming data schema. Add tests to ensure schema validation remains stable. ### Impact Easier for users to understand what is...
### Change Logs Works on COW tables. Tables can have clustering. ### Impact Can read hudi tables with arrow which can be used with pandas, numpy etc. ### Risk level...
### Change Logs Add a builder so that we can support more params ### Impact Reduces the mistakes that are easy to make when the fg reader takes so many params. ### Risk level...
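As a rough illustration of why a builder helps when a reader takes many parameters, here is a minimal Java sketch. It does not reproduce the builder added in the PR: `ReaderParams`, its fields, and the `with...` methods are all hypothetical names chosen for this example.

```java
public class ReaderParamsSketch {
    /** Immutable parameter bundle; every field name here is hypothetical. */
    static final class ReaderParams {
        final String tablePath;
        final long start;
        final long length;
        final boolean useRecordPositions;

        private ReaderParams(Builder b) {
            this.tablePath = b.tablePath;
            this.start = b.start;
            this.length = b.length;
            this.useRecordPositions = b.useRecordPositions;
        }

        static Builder builder() {
            return new Builder();
        }
    }

    /** Named setters replace a long positional argument list. */
    static final class Builder {
        private String tablePath;                 // required
        private long start = 0L;                  // optional, defaults to the beginning
        private long length = Long.MAX_VALUE;     // optional, defaults to "read everything"
        private boolean useRecordPositions = false;

        Builder withTablePath(String tablePath) {
            this.tablePath = tablePath;
            return this;
        }

        Builder withStart(long start) {
            this.start = start;
            return this;
        }

        Builder withLength(long length) {
            this.length = length;
            return this;
        }

        Builder withUseRecordPositions(boolean useRecordPositions) {
            this.useRecordPositions = useRecordPositions;
            return this;
        }

        ReaderParams build() {
            // Validate once, in one place, instead of at every call site
            if (tablePath == null || tablePath.isEmpty()) {
                throw new IllegalStateException("tablePath is required");
            }
            if (start < 0 || length < 0) {
                throw new IllegalArgumentException("start and length must be non-negative");
            }
            return new ReaderParams(this);
        }
    }

    public static void main(String[] args) {
        ReaderParams params = ReaderParams.builder()
            .withTablePath("/tmp/example_table") // hypothetical path
            .withStart(128L)
            .withUseRecordPositions(true)
            .build();
        System.out.println(params.tablePath + " start=" + params.start);
    }
}
```

With named setters, two adjacent `long` or `boolean` arguments cannot be swapped silently, and adding a new optional parameter does not force changes at every existing call site.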
### Change Logs Allow custom write support for spark parquet row writer that extends HoodieRowParquetWriteSupport. Use `org.apache.hudi.io.storage.row.HoodieRowParquetWriteSupport` to set the custom write support. ### Impact Allows for users to customize...