
Signedness of attributes like `DLTensor::ndim`, `DLTensor::shape`

Open wjakob opened this issue 3 years ago • 2 comments

Dear DLpack authors,

I was curious why several definitions in dlpack.h, specifically various DLTensor attributes, are signed, when negative values would seem to indicate obviously nonsensical tensor configurations (such as a negative number of dimensions or a negative extent along a dimension).

Would a PR to change these to their unsigned counterparts be accepted? ABI-wise, there should be no impact, as they occupy the same amount of memory (and values using the sign bit would, in any case, not correspond to valid configurations).

Thanks, Wenzel

wjakob, Apr 02 '22 09:04

Thanks @wjakob! I agree it might be worthwhile to document the choice.

There are different arguments for using signed versus unsigned integers (a web search turns up quite a few discussions on the topic).

Each side has some valid arguments. In short:

  • Unsigned types enforce the non-negativity invariant, and they are used in the STL.
  • Explicitly choosing signed values avoids possible mistakes from underflow or undefined behavior when subtracting shapes, and from conversions in index-difference calculations. Of course, one could argue that these are the programmer's burden, but making things signed nevertheless simplifies matters from that perspective.
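
To make the second point concrete, here is a minimal sketch of what happens when two shape extents are subtracted. The helper names `extent_diff_signed` and `extent_diff_unsigned` are hypothetical and are not part of dlpack.h; they only contrast a signed 64-bit extent type with an unsigned one of the same width.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; neither is part of dlpack.h.

// Signed extents: when a < b the difference is simply negative,
// so a later check such as `diff < 0` behaves as expected.
inline std::int64_t extent_diff_signed(std::int64_t a, std::int64_t b) {
    return a - b;
}

// Unsigned extents: when a < b the result wraps around modulo 2^64,
// producing a huge positive number instead of a negative one.
inline std::uint64_t extent_diff_unsigned(std::uint64_t a, std::uint64_t b) {
    return a - b;
}
```

With these definitions, `extent_diff_signed(3, 5)` yields `-2`, whereas `extent_diff_unsigned(3, 5)` yields `18446744073709551614` (that is, 2^64 - 2), which would pass a `>= 0` check without complaint.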

When making the data structure choice initially, we felt that the benefit of making things signed outweighs the value of the additional 1 bit. I believe signed values are also used by a few frameworks (e.g. PyTorch, TVM).

tqchen, Apr 02 '22 13:04

Okay, that makes sense. Thanks for clarifying!

wjakob, Apr 04 '22 14:04