lance

bug: spawn multiple encoding tasks per batch of rows within a single column

Open broccoliSpicy opened this issue 1 year ago • 0 comments

Currently, we spawn only one encoding task per batch of rows in a single column. This can be a problem for datasets like MS MARCO, where the third column is significantly larger than the other two (more than 100x).

A larger column needs more encoding tasks.
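One possible way to size the work, sketched in Python for illustration only: pick the number of tasks for a column's batch from its byte size, then split the rows into contiguous ranges. The function name `plan_encoding_tasks` and the `target_task_bytes` parameter are hypothetical and not part of Lance; the sketch also assumes rows in the column are roughly uniform in size.

```python
import math


def plan_encoding_tasks(
    num_rows: int, column_bytes: int, target_task_bytes: int
) -> list[tuple[int, int]]:
    """Split one column's batch into [start, end) row ranges, one per task.

    Hypothetical helper: it assumes roughly uniform row sizes, so splitting
    by row count approximates splitting by bytes.
    """
    if num_rows <= 0:
        return []
    if target_task_bytes <= 0:
        raise ValueError("target_task_bytes must be positive")

    # At least one task, and never more tasks than rows.
    num_tasks = max(1, math.ceil(column_bytes / target_task_bytes))
    num_tasks = min(num_tasks, num_rows)

    # Spread rows as evenly as possible; the first `extra` tasks get one more row.
    base, extra = divmod(num_rows, num_tasks)
    ranges = []
    start = 0
    for i in range(num_tasks):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges
```

With this scheme a small column still gets a single task for the batch, while a column 100x larger gets proportionally more, so the encoding work per task stays comparable across columns.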

broccoliSpicy · Jul 03 '24 20:07