[FEA] ListSlice() should accept padding on the left

Open gabrielspmoreira opened this issue 3 years ago • 4 comments

Describe the bug

The ListSlice() op can currently pad sequences on the right only, but some use cases need padding on the left. For example, to keep only the last 10 elements of a user interaction sequence, you would use a negative start (i.e. ListSlice(-10)). Depending on how your model expects its data, you may want the padded values placed on the left of each sequence, so that the last position of every sequence always holds a non-padded value.

Steps/Code to reproduce bug

  • Apply ListSlice(-10, pad=True, pad_value=0) to a sequence column: the padded values are added to the right of each sequence

Expected behavior

Turn the boolean pad argument into a string argument (e.g. pad="right") that also accepts pad="left".
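To make the requested behavior concrete, here is a minimal pure-Python sketch of the two padding modes. It does not use NVTabular; `keep_last_n` and its `pad` string argument are hypothetical names chosen only to illustrate the semantics proposed above, not the library's actual API.

```python
from typing import List, Sequence


def keep_last_n(
    sequences: Sequence[Sequence[int]],
    n: int,
    pad_value: int = 0,
    pad: str = "right",
) -> List[List[int]]:
    """Hypothetical helper (not part of NVTabular).

    Keeps the last `n` elements of each sequence and pads shorter
    sequences to length `n` on the side named by `pad`.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if pad not in ("left", "right"):
        raise ValueError('pad must be "left" or "right"')

    result = []
    for seq in sequences:
        tail = list(seq[-n:])  # same idea as a negative start, e.g. ListSlice(-n)
        padding = [pad_value] * (n - len(tail))
        # Left padding keeps the most recent element in the last position.
        result.append(padding + tail if pad == "left" else tail + padding)
    return result


sessions = [[1, 2, 3, 4, 5], [6, 7], [8]]

right_padded = keep_last_n(sessions, 3, pad="right")
# [[3, 4, 5], [6, 7, 0], [8, 0, 0]]

left_padded = keep_last_n(sessions, 3, pad="left")
# [[3, 4, 5], [0, 6, 7], [0, 0, 8]]
```

With pad="left", the last position of every row holds a real interaction, which is the property the request is after; with pad="right", short sequences end in padding instead.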

gabrielspmoreira avatar May 17 '22 18:05 gabrielspmoreira

Hi team! Are there any updates on this ticket? With Merlin-models getting ready for sequential/session-based models, it's pretty crucial to pair it with NVTabular to preprocess sequential data and feed it into the models. I think pre-padding is far more common in sequential data preprocessing than post-padding, so it would be great if this issue could be prioritized to keep pace with Merlin-models' development. Thanks!

zhiruiwang avatar Nov 02 '22 17:11 zhiruiwang

@gabrielspmoreira can you take a crack at this with help from @sararb and @rjzamora?

EvenOldridge avatar Nov 21 '22 16:11 EvenOldridge

Has this issue been resolved?

swapnilpanda avatar May 22 '23 07:05 swapnilpanda

@swapnilpanda we do not have left-padding support in ListSlice yet.

rnyak avatar May 25 '23 16:05 rnyak