wiedld
For any questions on the big-picture design, refer to the [diagrams here](https://github.com/wiedld/arrow-datafusion/pull/1). Note that ^^ is a drafted follow-up PR. I've tried to incorporate some of the naming/wording changes suggested,...
Run with this slicing code applied (a single composable merge node); see the [branch here](https://github.com/wiedld/arrow-datafusion/pull/1). Machine: gcp c3d-standard-8-lssd, debian-11. There are further confirmation steps, as well as hypotheses, as to why...
I think we should close this, @alamb, since it's not a priority at the moment. Whenever we circle back, there will be a very large diff (due to...
Hypothetically, this could be a bad payload from the UI side. We can at least rule out that option with some payload validation (see PR linked above).
The errors reoccurred. They are not due to the payload; this is a runtime borrow bug, which is rather difficult to chase down without more data. Leaving open for now. The errors...
> But I think @wiedld said she didn't have good luck with it so your mileage may vary

While using the Xcode Allocations tool, I was getting In general I...
Ah, I forgot to mention a key point. When extracting data via heaptrack_print, I was looking at memory peaks and hence used `--flamegraph-cost-type peak`. You may want to check [other...
> I wonder if this could be related to DataFusion overriding the data_page_row_limit setting in https://github.com/apache/datafusion/issues/11367 (that @wiedld is working on)

@alamb mentions the `data_page_row_limit` because in our own...
> I also found https://github.com/apache/arrow-rs/issues/5828 which might be related and/or relevant.

@hveiga is correct that this is one suspected place with extra memory usage (specifically in the `dict_encoder`) when processing...
I have an alternative solution, developed while fixing https://github.com/apache/datafusion/issues/12119. PR up shortly.