
Handling materialization of lazy arrays

Open hameerabbasi opened this issue 1 year ago • 10 comments

Background

Some colleagues and I were doing some work on sparse when we stumbled onto a limitation of the current Array API Standard, and @kgryte was kind enough to point out that it might have wider implications than just sparse, so it would be prudent to discuss it with other relevant parties in the community before settling on an API design, to avoid fragmentation.

Problem Statement

There are two notable things missing from the Array API standard today that sparse needs, and that Dask, JAX, and other relevant libraries might also need.

  • Support for storage formats.
    • In Dask, this might be the array metadata, such as the type of the inner array.
    • In sparse, this would be the format of the sparse array (CSR, CSC, COO, ...).
  • Support for lazy arrays/materialization.
    • sparse/JAX might use this to build up kernels before running a computation.
    • Dask might use this for un-computed arrays stored as a task graph.

Potential solutions

Overload the Array.device attribute and the Array.to_device method.

One option is to overload the objects returned/accepted by these so that they carry both a device and a storage format. Something like the following:

class Storage:
    @property
    def device(self) -> Device:
        ...

    @property
    def format(self) -> Format:
        ...

    def __eq__(self, other: "Storage") -> bool:
        """ Compatible if combined? """

    def __ne__(self, other: "Storage") -> bool:
        """ Incompatible if combined? """

class Array:
    @property
    def device(self) -> Storage:
        ...

    def to_device(self, device: Storage, ...) -> "Array":
        ...

To materialize an array, one could use to_device(default_device()) (which becomes possible once #689 is merged).
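For illustration, this option might be exercised as follows. ``Storage``, ``default_device``, and ``LazyArray`` are toy stand-ins mirroring the sketch above, not part of any shipped API:

```python
# Toy sketch of the overloaded-``device`` proposal: the object returned by
# ``.device`` carries both an execution context and a storage format, and
# ``to_device(default_device())`` is what would trigger materialization.
from dataclasses import dataclass


@dataclass(frozen=True)
class Storage:
    device: str  # e.g. "cpu"
    format: str  # e.g. "coo", "dense"


def default_device() -> Storage:
    # The default: eager, dense storage on the default device.
    return Storage(device="cpu", format="dense")


class LazyArray:
    def __init__(self, values, storage: Storage):
        self._values = values
        self._storage = storage

    @property
    def device(self) -> Storage:
        return self._storage

    def to_device(self, storage: Storage) -> "LazyArray":
        # A real library would convert/materialize the data here.
        return LazyArray(list(self._values), storage)


lazy = LazyArray(range(3), Storage(device="cpu", format="coo"))
dense = lazy.to_device(default_device())
print(dense.device.format)  # -> dense
```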

Advantages

As far as I can see, it's compatible with how the Array API standard works today.

Disadvantages

We're mixing the concepts of an execution context and storage format, and in particular overloading operators in a rather weird way.

Introduce an Array.format attribute and Array.to_format method.
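The shape of that API might look something like the following toy sketch; all names here, including ``can_mix_formats``, are speculative and not part of any standard:

```python
# Hypothetical sketch of the second option: a ``format`` attribute and a
# ``to_format`` method kept separate from ``device``.
class FormattedArray:
    def __init__(self, values, format: str = "dense"):
        self._values = values
        self._format = format

    @property
    def format(self) -> str:
        return self._format

    def to_format(self, format: str) -> "FormattedArray":
        # A real library would convert the underlying storage here.
        return FormattedArray(self._values, format)


def can_mix_formats(*arrays: FormattedArray) -> bool:
    # Toy rule: arrays can be combined only if their formats match.
    return len({a.format for a in arrays}) == 1


a = FormattedArray([1, 2, 3], format="coo")
b = a.to_format("csr")
print(can_mix_formats(a, b))  # -> False
```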

Advantages

We can get the API right, maybe even introduce xp.can_mix_formats(...).

Disadvantages

We would need to wait until at least the 2024 revision of the standard.

Tagging potentially interested parties:

  • @jakirkham @tomwhite for Dask
  • @jakevdp for JAX
  • Please add anyone I missed

hameerabbasi avatar Feb 13 '24 11:02 hameerabbasi

I think this topic will have to be addressed in v2024, as it's too big to be squeezed in v2023 which we're trying very hard to wrap up 😅

leofang avatar Feb 13 '24 12:02 leofang

A few quick comments:

  • Storage formats are going to be specific to individual libraries, so I don't see any reasonable way to standardize anything. That shouldn't be a problem: it's not forbidden to have them, to have different array types for them, or to add new constructors or extra keywords to existing APIs (please do make them keyword-only to prevent future compat issues).
  • Lazy arrays are fine by themselves; they're supported. There are previous discussions on this, see for example gh-642, the topic: lazy/graph label, and https://data-apis.org/array-api/draft/design_topics/lazy_eager.html
  • Materialization via some function/method in the API that triggers compute would be the one thing that is possibly actionable. However, that is quite tricky. The page I linked above has a few things to say about it.

rgommers avatar Feb 13 '24 13:02 rgommers

I think this topic will have to be addressed in v2024, as it's too big to be squeezed in v2023 which we're trying very hard to wrap up 😅

No pressure. 😉

Materialization via some function/method in the API that triggers compute would be the one thing that is possibly actionable. However, that is quite tricky. The page I linked above has a few things to say about it.

Thanks Ralf -- that'd be a big help indeed. Materializing an entire array as opposed to one element is something that should be a common API across libraries, IMHO, so I changed the title to reflect that.

hameerabbasi avatar Feb 15 '24 09:02 hameerabbasi

Cross linking https://github.com/data-apis/array-api/issues/728 as it may be relevant to this discussion.

kgryte avatar Mar 21 '24 07:03 kgryte

Materializing an entire array as opposed to one element is something that should be a common API across libraries, IMHO,

Just wanted to point out that it may be common but not universal. For instance, ndonnx arrays may not have any data that can be materialized. Such arrays do have data types and shapes and enable instant ONNX export of Array API compatible code. ONNX models are serializable computation graphs that you can load later, and so these "data-less" arrays denote model inputs that can be supplied at an entirely different point in time (in a completely different environment).

There are some inherently eager functions like __bool__ where we just raise an exception if there is no materializable data, in line with the standard. Any proposals around "lazy" arrays collecting values should have some kind of escape hatch like this.
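As a concrete sketch of that escape hatch (illustrative only, not ndonnx's actual implementation): a data-less array can expose a dtype and shape but raise from inherently eager operations:

```python
# A "data-less" array with a dtype and shape but no materializable values,
# raising on inherently eager operations such as ``__bool__``.
class SymbolicArray:
    """Denotes e.g. a graph input that will be supplied later."""

    def __init__(self, shape, dtype="float64"):
        self.shape = shape
        self.dtype = dtype

    def __bool__(self) -> bool:
        # The escape hatch: there is nothing to materialize, so refuse.
        raise ValueError(
            "cannot convert a data-less symbolic array to a Python bool"
        )


x = SymbolicArray(shape=(3,))
try:
    bool(x)
except ValueError as exc:
    print(exc)
```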

adityagoel4512 avatar Jun 25 '24 18:06 adityagoel4512

I think that xarray ends up surfacing closely-related issues to this - see https://github.com/pydata/xarray/issues/8733#issuecomment-2249011104 for a summary of the problems.

TomNicholas avatar Jul 24 '24 22:07 TomNicholas

One thing I've been using is np.copyto.

  1. I create lazy arrays that allow slicing in a lazy fashion.
  2. I copy the results to pre-allocated arrays.

Pre-allocated arrays make a big difference, in my view, in big-data applications.
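A minimal NumPy sketch of this pattern; for illustration the "lazy" source here is just another ndarray:

```python
# Slice the source, then write the result into a pre-allocated buffer with
# ``np.copyto`` instead of allocating a fresh array per step.
import numpy as np

source = np.arange(100.0).reshape(10, 10)
out = np.empty((5, 10))  # pre-allocated once, reused across steps

# ``np.copyto`` writes into ``out`` in place; ``casting="no"`` guards
# against silent dtype conversions.
np.copyto(out, source[:5], casting="no")
print(out[0, :3])  # -> [0. 1. 2.]
```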

hmaarrfk avatar Jul 24 '24 22:07 hmaarrfk

Thanks for sharing @TomNicholas. I stared at your linked xarray comment for a while, but am missing too much context to fully understand it, I'm afraid.

You're touching on to_device and __array__ there, so it looks like there's a "crossing the boundary between libraries" element - in the standard that'd also involve from_dlpack and asarray. For the problem of interchange between two arbitrary libraries, like xp2.asarray(an_xp1_array):

  • when xp1 and xp2 are both eager: always works (assuming device/dtype etc. support is compatible),
  • xp2 is lazy and xp1 is eager: should work too
  • xp1 is lazy:
    • (A) works if xp1 triggers execution on a call that crosses a library boundary
    • (B) works if xp2 understands xp1 array instances and knows how to keep things lazy
    • (C) doesn't work if xp2 requires xp1 to materialize values in memory and xp1 refuses to do this automatically

Dask does (A), at least when one calls np.asarray on a Dask array with not-yet-materialized values, and I think that in general that's the right thing to do when execution is actually possible. Dask is fairly inconsistent in when it allows triggering execution, though; it sometimes does so and sometimes raises.
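Pattern (A) can be sketched with a toy lazy array that triggers its own execution when ``np.asarray`` crosses the library boundary, via the ``__array__`` protocol; ``LazyExpr`` is illustrative, not Dask's code:

```python
# A lazy array that materializes itself when coerced across a library
# boundary with ``np.asarray`` (pattern (A) above).
import numpy as np


class LazyExpr:
    def __init__(self, fn):
        self._fn = fn  # deferred computation (a task graph, in Dask)
        self.computed = False

    def __array__(self, dtype=None, copy=None):
        # Crossing the boundary triggers execution.
        self.computed = True
        result = np.asarray(self._fn())
        return result if dtype is None else result.astype(dtype)


lazy = LazyExpr(lambda: [1, 2, 3])
print(lazy.computed)       # -> False
values = np.asarray(lazy)  # boundary crossing triggers compute
print(lazy.computed)       # -> True
```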

@TomNicholas are you hitting case (C) here with Xarray? And if so, is that for interchange between libraries only, or something else?

rgommers avatar Jul 25 '24 09:07 rgommers

One thing I've been using is np.copyto.

@hmaarrfk we should indeed be adding a copy function, I think; xref gh-495 for that. Note that it isn't inherently eager and doesn't require triggering compute, though.

rgommers avatar Jul 25 '24 09:07 rgommers

Thanks for sharing @TomNicholas. I stared at your linked xarray comment for a while, but am missing too much context to fully understand it, I'm afraid.

Sorry @rgommers ! I'll try to explain the context here:

In xarray we try to wrap all sorts of arrays, including multiple types of lazy arrays. Originally xarray wrapped numpy arrays, then it gained an intermediate layer of its own internal lazy indexing classes which wrap numpy arrays, then it also gained the ability to wrap dask arrays (but special-cased them).

More recently I tried to generalize this so that xarray could wrap other lazily-evaluated chunked arrays (in particular cubed arrays, which act like a drop-in replacement for dask.array).

A common problem is that different lazy array types have different compute semantics. Coercing to numpy via __array__ isn't really sufficient, because often there are important parameters one might need to pass to the .compute method (e.g. which dask scheduler to use). Xarray currently special-cases several libraries that have different semantics, and also has a framework for wrapping dask vs cubed compute calls.

Dask and Cubed are also special in that they have .chunks (and .rechunk). Computation methods also often need to be specially applied using functions which understand how to map over these chunks, e.g. dask.array.blockwise.
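A toy illustration of why __array__ alone doesn't suffice here: backend compute calls take backend-specific keywords, which a small wrapper layer can forward. The names below (``ToyChunkedArray``, ``ChunkManager``) are hypothetical, not xarray's actual API:

```python
# A chunked array whose ``compute`` takes backend-specific parameters, and
# a minimal wrapper that forwards them on behalf of the caller.
class ToyChunkedArray:
    def __init__(self, data, chunks):
        self.data = data
        self.chunks = chunks  # dask/cubed-style chunk structure

    def rechunk(self, chunks):
        return ToyChunkedArray(self.data, chunks)

    def compute(self, *, scheduler="synchronous"):
        # Backend-specific keyword that plain ``__array__`` cannot forward.
        return list(self.data)


class ChunkManager:
    """Wraps one backend's compute call, carrying its extra parameters."""

    def __init__(self, **compute_kwargs):
        self.compute_kwargs = compute_kwargs

    def compute(self, arr):
        return arr.compute(**self.compute_kwargs)


arr = ToyChunkedArray([1, 2, 3], chunks=((3,),))
manager = ChunkManager(scheduler="threads")
print(manager.compute(arr))  # -> [1, 2, 3]
```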

More recently again we've realised there's another type of array we want to wrap: chunked arrays that are not necessarily computable. This is what that issue I linked was originally about.

The comment I linked to is trying to suggest how we might separate out and distinguish between all these cases from within xarray, with the maximum amount of things "just working".

You're touching on to_device and __array__ there, so it looks like there's a "crossing the boundary between libraries" element

Not really - I'm mostly just talking about lazy/duck array -> numpy so far.

TomNicholas avatar Jul 25 '24 23:07 TomNicholas