
add scalars_dtype to ExtDType

Open a10y opened this issue 1 year ago • 0 comments

Adds scalars_dtype to ExtDType.

This PR adds scalars_dtype (alternative name options: canonical_dtype, storage_dtype) to ExtDType.

This is desirable for a few reasons:

  • Makes it possible to canonicalize an empty chunked ExtensionArray
  • Makes it possible to determine the storage DType for a ConstantArray without examining its value
  • Makes it possible for Vortex to reason about externally authored extension types. This is still not fully complete, as an ideal experience would allow extension authors to override IntoCanonical, IntoArrow, Display, etc.
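
To illustrate the first two points, here is a minimal sketch with simplified stand-in types. The `DType` variants, the `ExtDType` fields, `ChunkedExtArray`, and `storage_dtype_of` are all hypothetical and are not the actual Vortex definitions; the only thing taken from this PR is the idea that the extension type itself carries its `scalars_dtype`.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-ins for the Vortex types (not the real definitions).
#[derive(Debug, Clone, PartialEq)]
enum DType {
    Primitive { bits: u8, nullable: bool },
    Utf8 { nullable: bool },
    Extension(ExtDType),
}

// The extension type records the dtype of its underlying scalars.
#[derive(Debug, Clone, PartialEq)]
struct ExtDType {
    id: Arc<str>,
    scalars_dtype: Arc<DType>,
}

impl ExtDType {
    fn new(id: &str, scalars_dtype: DType) -> Self {
        Self {
            id: Arc::from(id),
            scalars_dtype: Arc::new(scalars_dtype),
        }
    }

    fn scalars_dtype(&self) -> &DType {
        &self.scalars_dtype
    }
}

// Hypothetical chunked extension array; it may contain zero chunks.
struct ChunkedExtArray {
    ext: ExtDType,
    chunks: Vec<Vec<i64>>,
}

// The storage dtype comes from the type alone, so no chunk or value is inspected.
fn storage_dtype_of(array: &ChunkedExtArray) -> &DType {
    array.ext.scalars_dtype()
}
```

Because the lookup goes through the type rather than the data, it works the same way for an array with no chunks, and by the same reasoning for a constant array whose value has not been examined.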

To avoid duplicating the nullability, we remove top-level nullability from the DType::Extension variant; instead, nullability is accessed through the inner ExtDType.
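
A rough sketch of what this could look like, using hypothetical simplified types rather than the real Vortex ones. It assumes the extension's nullability is derived from its `scalars_dtype`; the PR text only says it is reached through the inner `ExtDType`, so the exact mechanism here is a guess.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Nullability {
    NonNullable,
    Nullable,
}

// Hypothetical, simplified DType: the Extension variant carries no Nullability of its own.
#[derive(Debug, Clone, PartialEq)]
enum DType {
    Bool(Nullability),
    Primitive(u8, Nullability), // bit width, nullability
    Extension(ExtDType),
}

#[derive(Debug, Clone, PartialEq)]
struct ExtDType {
    id: Arc<str>,
    scalars_dtype: Arc<DType>,
}

impl DType {
    fn is_nullable(&self) -> bool {
        match self {
            DType::Bool(n) | DType::Primitive(_, n) => *n == Nullability::Nullable,
            // Assumption: delegate to the inner type instead of storing a second flag.
            DType::Extension(ext) => ext.scalars_dtype.is_nullable(),
        }
    }
}
```

With a single source of truth, the outer variant and the inner type cannot disagree about nullability.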

This has the unfortunate effect of increasing size_of::<DType>() from 40 to 48 bytes, which makes every array's metadata 8 bytes larger.
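
For readers who want to see how this kind of size change can be measured, here is a small sketch using `std::mem::size_of`. The `ExtPayload`, `Boxed`, and `Inline` types are hypothetical and do not reproduce the 40 and 48 byte figures above; they only show that an enum is at least as large as its largest variant, so storing a payload inline rather than behind a pointer tends to grow the whole type.

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical extension payload: an identifier plus a shared pointer.
#[allow(dead_code)]
struct ExtPayload {
    id: Arc<str>,     // fat pointer: data pointer plus length
    storage: Arc<u8>, // stand-in for a shared storage dtype
}

// Variant holds the payload behind a single pointer.
#[allow(dead_code)]
enum Boxed {
    Bool(bool),
    Extension(Arc<ExtPayload>),
}

// Variant holds the payload inline.
#[allow(dead_code)]
enum Inline {
    Bool(bool),
    Extension(ExtPayload),
}

// Returns (size of Boxed, size of Inline) in bytes for the current target.
fn sizes() -> (usize, usize) {
    (size_of::<Boxed>(), size_of::<Inline>())
}
```

The exact numbers depend on the target platform and on the compiler's layout choices, so a check like this is best kept as a test that runs against the real type.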

a10y, Oct 09 '24 19:10