Yijie Shen

Results 16 issues of Yijie Shen

While implementing sort-merge join and spillable repartition, I found that my routine for producing output is: 1. create a logical mutable record batch composed of mutable arrays with a fixed batch size, 2. buffering input...

What do you think of implementing these `extend*` methods for MutableArrays from Arrays? I found them useful while implementing SortMergeJoin, where I ran into the following case: - extend_take_range: for the buffering...

**Goal: a complete row implementation, fully used in pipeline-breaker operators when possible.** **Summary** TL;DR: The key focus of this work is to speed up fundamentally row-oriented operations like...

enhancement
datafusion
development-process
performance

Upstream issues:
- [TODO] Rust side: https://github.com/apache/arrow-rs/issues/1709
- [Partly Finished?] Java side: https://issues.apache.org/jira/browse/ARROW-8672

performance related

bug
help wanted
development-process

Depends on the upstream issue: https://github.com/apache/arrow-datafusion/issues/2059

enhancement

![4011579595142_ pic_hd](https://user-images.githubusercontent.com/1387718/72792975-cfe53780-3c74-11ea-8e3f-81a9efe26999.jpg)

type/bug
workflow::todo
triage/week-16

_Moved from https://github.com/apache/arrow-datafusion/issues/1136_

Another good set might be timestamps (datetime.date, etc.), but perhaps we can add those as a separate PR.

_Originally posted by @alamb in https://github.com/apache/arrow-datafusion/pull/1130#pullrequestreview-781435561_