arm64 lanes
Hi there.
constexpr hn::ScalableTag<uint8_t> df;
constexpr hn::CappedTag<uint8_t, 1> d1;
constexpr size_t N = hn::Lanes(df);
On arm64 with neon, the N should be 32, but it's 16 when I debug, why?
BTW, here is my origin code use neon directly, what does its equivalent hwy code look like?
for (; dst_ptr <= dst_end - 8 * 4; dst_ptr += 8 * 4, src_ptr += 8 * 4) {
// load 8 pixels
uint8x8x4_t rgba = vld4_u8(src_ptr);
// multiply the color by alpha, expand to 16-bit
uint16x8_t r = vmull_u8(rgba.val[0], rgba.val[3]);
uint16x8_t g = vmull_u8(rgba.val[1], rgba.val[3]);
uint16x8_t b = vmull_u8(rgba.val[2], rgba.val[3]);
// (x + 127) / 255 == (x + ((x + 128) >> 8) + 128) >> 8
// This form is well suited to NEON:
// vrshrq_n_u16(...,8) gives the inner (x+128)>>8,
// vraddhn_u16() both the outer add-shift and our conversion back to 8-bit.
rgba.val[0] = vraddhn_u16(r, vrshrq_n_u16(r, 8));
rgba.val[1] = vraddhn_u16(g, vrshrq_n_u16(g, 8));
rgba.val[2] = vraddhn_u16(b, vrshrq_n_u16(b, 8));
}
// process the remaining pixels
for (; dst_ptr < dst_end; dst_ptr += 4, src_ptr += 4) {
dst_ptr[0] = (src_ptr[0] * src_ptr[3] + 127) / 255;
dst_ptr[1] = (src_ptr[1] * src_ptr[3] + 127) / 255;
dst_ptr[2] = (src_ptr[2] * src_ptr[3] + 127) / 255;
}
https://questdb.io/docs/reference/function/window/#avg
@nwoolmer I went through this document, and I guess VWMA will be implemented with the PARTITION BY clause in the same way, that's how the window will be defined for quantity and price.
Correct me if I am wrong here.
@siddharth0815 Yes! :)
@puzpuzpuz @nwoolmer Action items till now :
- Create a
VolumeWeightedMovingAverageFunctionFactoryclass which implementsFunctionFactoryand returnsisWindow()= truewhich accepts 2 sigArgs -vwma(DD) - Added it in
module-info.javaso that it can be loaded in factory list and get identified by token vwma - Change paramCount condition for window functions in
SqlCodeGenerator.generateSelectWindow()toast.paramCount > 2
Now I'm analysing what happens in :
- Factory instance creation
- How windowCursor fetches average
Please tell if this is the correct way to approach the problem or is anything I'm missing here, also if you want me to explore something in addition to the ones mentioned above, please mention.
You may use AvgDoubleWindowFunctionFactory as the starting point. It's worth running WindowFunctionTest with debugger and going through the calls to understand better when it's initialized and how it works.
@nwoolmer if you can add more hints here, please go ahead.
I'm hopeful that by looking at the pass methods in AvgOverPartitionFunction and derivatives, you can see where the 'meat' of the averaging is done. Those parts would be modified to handle the weighted average logic.
They are pretty complex functions so this is likely to take some time to get right!
Hello, I'm interested on working on this! Added a pull request recently with some of the above comments in mind, it is a bit dense (trying to add support for both partition-by and order-by clauses) but order-by seems to work. Unit tests and partition-by support is still WIP but wanted to see if the work is in the right direction :) The PR is here: https://github.com/questdb/questdb/pull/4749
Hi @bzhu23,
Sorry for any confusion here, @siddharth0815 was looking at this issue too.
Would you both consider collaborating on this together as PR co-authors? I am sure that with both of you developing and testing it, we will get an even better outcome.
We are always grateful for support from the community!
If not, no worries, we have other things to work on - or I can open issues for missing window functions if you have specific interest in them.
Thanks again!
@nwoolmer I can contribute in the same PR. :)
Hi @nwoolmer I went through the logic of the (avg) window function
Please validate my understanding here, post that I'll proceed with coding
Now basically we have 2 types of queries here :
- Computation on values (of some columns) contained in rows between given offsets
ROW OVER PARTITION - Computation on values (of some columns) in rows containing values between given range
RANGE OVER PARTITION
Most complicated case
The window is bounded from the lower end by a given offset/timestamp-diff and the higher end by some offset/timestamp-diff/current row.
Case Handling In such cases, We use 2 data structures :
-
Map: keeps track of values which will change in each iteration basis partition key (sum/avg/ring buffer offset) etc
-
Ring buffer: To keep track of old values which may/may not be considered in current computation
-
A
Ring buffertype data structure is used where memory locations are allocated by default for each partition key -
Starting offsets of those memory locations are mapped to the partition key in
Mapfor further reference -
Computable values of the records are put circularly, it is discarded (if out of range) and replaced as new index is calculated by modulo with
bufferCapacity
Questions
- Wanted to know in brief about how memory allocation works for the ring buffer (There's a case where we add more memory if buffer reaches capacity)?
- How is the calculated average/sum at each record iteration returned to the result (as
computeNext()is avoidfunction)?
Hey @siddharth0815 ,
Thank you for this and your efforts so far. It looks like you've got a good understanding of the issue.
Having reviewed this part of the codebase, we'd like to refactor and document it ourselves before taking contributions. Sorry for the confusion on this, this is partially my fault for putting it on the same list as easier tasks!
Closing this down for now, we have some alternate plans.