reuster986

Results 28 comments of reuster986

@ronawho Thanks for looking into this. In our applications, we care much more about reading than writing, unfortunately. Most of our datasets come in as hundreds or even thousands of...

@ronawho excellent sleuthing, and thank you for the update! I have run into a lot of gotchas with HDF5, but I never considered that the setup/metadata operations might be bottlenecking.

Here's another discussion of HDF5 performance issues that focuses on file open/close: https://www.hdfgroup.org/wp-content/uploads/2018/04/avoid_truncate_white_paper_180219.pdf Section 2 ("Slow File Open") looks promising, and they have a fix for it, but frustratingly they...

Thanks @ronawho. What raises the importance of this edge case is that `-(2**63)` is the value of `pandas.NaT` (Not a Time), which is like `nan` for Datetime and Timedelta...
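To make the sentinel concrete, here is a minimal sketch using numpy's `datetime64`, whose NaT has the same int64 bit pattern that `pandas.NaT` reports. The variable names are illustrative only and do not come from the arkouda code base.

```python
import numpy as np

# The smallest int64 value, which doubles as the NaT sentinel.
INT64_MIN = -(2**63)

# Viewing NaT as int64 exposes the underlying sentinel value.
nat_as_int = int(np.datetime64("NaT", "ns").astype(np.int64))

# Going the other way: reinterpreting raw int64 data as datetimes
# turns the sentinel into NaT, while ordinary values stay valid.
raw = np.array([INT64_MIN, 0], dtype=np.int64)
as_datetime = raw.view("datetime64[ns]")
nat_mask = np.isnat(as_datetime)
```

Any int64 operation that can produce `-(2**63)` therefore yields a value that time-related classes would read as missing.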

Thanks for the input, guys. I'm waiting to make a decision on this until we resolve https://github.com/Bears-R-Us/arkouda/issues/1013#issuecomment-1017591744 , since that will determine what the time-related classes look like.

I think I agree with @21771 that we should deviate from numpy in this specific case, in order to preserve the precision of uint64 values like hashes.
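The comment does not say which numpy behavior is meant. As an assumption, one well-known case where numpy loses uint64 precision is mixed `uint64`/`int64` promotion, which resolves to `float64`. A short sketch of why that hurts values such as hashes:

```python
import numpy as np

# numpy has no integer type that holds both uint64 and int64 ranges,
# so it promotes the pair to float64.
promoted = np.result_type(np.uint64, np.int64)

# A hash-like value using all 64 bits.
h = np.uint64(2**64 - 1)

# float64 has a 53-bit significand, so the value is rounded.
round_trip = int(np.float64(h))
lost_precision = round_trip != int(h)
```

Here `round_trip` comes back as `2**64` rather than `2**64 - 1`, so the original hash cannot be recovered after the promotion.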

I propose we restructure the test directories as
```
test/
  server/
  modules/
  diagnostic/
```
The `server/` and `modules/` dirs would be meant for CI-compatible tests that return a success/fail code...

I forgot to explicitly state that the `test/server` dir would be for Python tests that exercise the server, while the `test/modules` dir would be for Chapel unit tests that do...

@hokiegeek2 I think ideally we should support overwriting an individual dataset, but if it is too hard to implement, we don't need to make it a high priority.

This reminds me, I should make a benchmark for concatenate, which would definitely probe the effect of this option.