Samuel Oranyeli
Hi @st-pasha, I think it would also be helpful if documentation could be created for code contributors, so they know what the building blocks are, so to speak....
@vopani currently working on them ... can't specify an ETA though ... contributions are also welcome. You can test the `cumsum` function by downloading the latest dev version
@vopani Kindly create a minimal example of `cumsum` for both datatable and pandas; that makes it easier to grok.
Thanks also to @oleksiyskononenko and @st-pasha for their guidance through my C++ journey
Test comparison with `np.cumsum`: numpy is twice as fast, and seems to run on a [single core](https://stackoverflow.com/questions/49367278/any-way-to-speed-up-numpy-cumsum#comment85735538_49367278): ```py import numpy as np from datatable import dt, f In...
Thanks @oleksiyskononenko, maybe you can explain more about what you mean by parallelisation in terms of the actual data. How is profiling done in C++? By the way, is there...
@oleksiyskononenko I was reading up on cumulative sum, and found a possible performance option with a Fenwick tree. What are your thoughts on it? Is it worth the effort? As an...
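For context on the Fenwick tree idea, here is a minimal pure-Python sketch (the class name `FenwickTree` and its methods are made up for illustration, not anything in datatable). It shows that the tree yields the same prefix sums as a plain running total. Its advantage is O(log n) point updates followed by O(log n) prefix queries; for a one-off cumulative sum over a column, a single linear pass is already O(n), so the tree may not buy much there.

```python
from itertools import accumulate


class FenwickTree:
    """Hypothetical helper: binary indexed tree over a list of numbers."""

    def __init__(self, values):
        self.n = len(values)
        self.tree = [0] * (self.n + 1)  # 1-based internal storage
        for i, v in enumerate(values):
            self.update(i, v)

    def update(self, index, delta):
        """Add delta to the element at 0-based position index."""
        i = index + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node covering this position

    def prefix_sum(self, index):
        """Return the sum of elements 0..index (inclusive)."""
        i = index + 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


values = [3, 1, 4, 1, 5]
tree = FenwickTree(values)

# Cumulative sum via the tree vs. a plain linear scan
fenwick_cumsum = [tree.prefix_sum(i) for i in range(len(values))]
plain_cumsum = list(accumulate(values))
```

Both lists come out identical; the tree only starts to pay off if individual values change and prefix sums have to be re-queried afterwards.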
@oleksiyskononenko @Peter-Pasta @vopani should we stick to the name `cumcount` or use `row_number`? I bring up `row_number` because of SQL. Checking to see which one is more intuitive
@oleksiyskononenko it is essentially row numbers, but it is more helpful when you want the row number per group ... #2892 is the reason behind the cumulative functions. In pandas...
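To make "row number per group" concrete, here is a small pure-Python sketch. The function name `cumcount` below is just an illustration of the semantics, not the datatable implementation; it assumes 0-based counting as in pandas' `groupby().cumcount()`, whereas SQL's `ROW_NUMBER()` starts at 1.

```python
from collections import defaultdict


def cumcount(keys):
    """Illustrative only: 0-based running count of each key's occurrences."""
    seen = defaultdict(int)
    out = []
    for key in keys:
        out.append(seen[key])  # how many times this key appeared before
        seen[key] += 1
    return out


groups = ["a", "b", "a", "a", "b"]

counts = cumcount(groups)              # per-group position, 0-based
row_numbers = [c + 1 for c in counts]  # SQL-style numbering, 1-based
```

Without groups, the result is simply 0, 1, 2, ... over the whole frame, which is why it looks like plain row numbers.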
Thanks for the feedback @vopani, we are on the right track with naming then. I will go ahead and create the function with `cumcount` as the name