
finfo should not require float type for the result fields

Open asmeurer opened this issue 3 years ago • 8 comments

Right now finfo requires that the output fields be float (see https://data-apis.org/array-api/latest/API_specification/generated/signatures.data_type_functions.finfo.html#signatures.data_type_functions.finfo). However, making the results 0-D arrays would be better.

For the spec itself, float is fine, but it's problematic for any library that implements higher precision data types like float128:

>>> import numpy as np
>>> np.finfo(np.float128)
finfo(resolution=1.0000000000000000715e-18, min=-1.189731495357231765e+4932, max=1.189731495357231765e+4932, dtype=float128)
>>> float('-1.189731495357231765e+4932')
-inf

A Python float is essentially a float64, so the various float128 values cannot be represented as floats. NumPy uses scalars for these values, which for the spec would be 0-D arrays:

>>> type(np.finfo(np.float128).min)
<class 'numpy.float128'>
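The float64 ceiling can also be seen directly from the standard library, without NumPy. A small illustrative sketch: Python's float is IEEE 754 binary64, so its maximum value is far below the float128 maximum quoted above, and parsing anything larger overflows to infinity:

```python
import sys

# Python's float is IEEE 754 binary64 ("float64"):
print(sys.float_info.max)  # largest representable float, about 1.8e+308

# Anything beyond that overflows to infinity, including float128's max:
print(float('1.189731495357231765e+4932'))  # inf
```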

Even if there are no plans to add float128 to the spec, it's useful for libraries that do have it to be able to return a consistent type from finfo, since float and a 0-D array aren't 100% duck-type compatible.
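As a quick illustration of that duck-typing gap (using NumPy here purely as an example implementation):

```python
import numpy as np

zero_d = np.asarray(1.0)  # 0-D array
scalar = 1.0              # Python float

# Arithmetic works on both, but array-specific attributes
# do not exist on a Python float:
print(hasattr(zero_d, 'dtype'), hasattr(scalar, 'dtype'))  # True False
print(hasattr(zero_d, 'shape'), hasattr(scalar, 'shape'))  # True False

# Conversely, float methods are missing on the array:
print(hasattr(scalar, 'hex'), hasattr(zero_d, 'hex'))      # True False
```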

For iinfo this problem obviously isn't present, since int can represent any integer, but it may be good to change it to a 0-D array as well just for consistency (although I should note that numpy.iinfo just uses int for its fields).
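For reference, NumPy's current iinfo behavior: the fields are plain Python ints, which are arbitrary precision, so nothing is lost in representation:

```python
import numpy as np

info = np.iinfo(np.int64)
print(type(info.min))         # <class 'int'>
print(info.min == -2**63)     # True
print(info.max == 2**63 - 1)  # True
```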

asmeurer avatar Mar 18 '22 22:03 asmeurer

Thanks for opening this issue @asmeurer. Agreed that 0-D arrays probably make the most sense here. As we've already departed from NumPy behavior for reductions (e.g., sum et al returning 0-D arrays whereas the main namespace returns scalars), returning 0-D arrays for finfo and iinfo would be consistent.

Given that NumPy, CuPy, etc have the array API implementation under an experimental status, probably not too late to make this change for these libraries. However, Torch is already shipping finfo and iinfo according to the current spec: https://pytorch.org/docs/1.11/type_info.html?highlight=finfo#torch.torch.finfo, so moving to returning a 0-D array would be a breaking change.

@rgommers perhaps you have thoughts?

kgryte avatar Mar 21 '22 08:03 kgryte

This is not a proposal that I immediately think is a good idea, because:

  • There are no current issues, only a hypothetical future one. NumPy is the only library with a >64-bit dtype, and it has array scalars too, so there's only a problem if it really becomes possible to completely remove array scalars.
    • Side note: I hope we can get rid of float96/float128 in NumPy, because they're a massive pain and a not-so-useful relic of the past.
  • It'd be a breaking change for every library that has finfo and wants to implement array API standard support in its main namespace (which long-term is everyone I hope), not just PyTorch.

If we designed this from scratch with no concern for backwards compatibility, then I'd agree 0-D arrays would be preferred.

rgommers avatar Mar 21 '22 10:03 rgommers

I don't have a strong opinion here, but whatever we do for finfo must be done consistently for the other constants as well: https://data-apis.org/array-api/latest/API_specification/constants.html (We had a long discussion on this very same issue for constants; I'm on my cell and couldn't find it immediately 😅)

leofang avatar Mar 21 '22 12:03 leofang

gh-154 is the discussion on constants. The reason for them to be included was different though: https://github.com/data-apis/array-api/issues/154#issuecomment-833746576. I don't think we ever discussed making constants 0-D arrays, only whether to include them or not.

rgommers avatar Mar 21 '22 12:03 rgommers

No, we didn't discuss making them 0-D arrays, but pi, e, etc. and the finfo fields are all constants that require uniform treatment in the standard. If there is a desire to keep pi on the GPU, it's equally possible there's a desire to have eps stay on the GPU 🙂

leofang avatar Mar 21 '22 14:03 leofang

(So, in short, I'm just adding another reason for finfo to keep floating type)

leofang avatar Mar 21 '22 14:03 leofang

One other concern similar to float128 that was brought up today is custom dtypes, for example a decimal type. NumPy does allow such things, but only as object dtype:

>>> import numpy as np
>>> import decimal
>>> d = decimal.Decimal('0.142857')
>>> x = np.array([d, d])
>>> x
array([Decimal('0.142857'), Decimal('0.142857')], dtype=object)
>>> np.finfo(x)
Traceback (most recent call last):
  Input In [12] in <cell line: 1>
    np.finfo(x)
  File ~/anaconda3/envs/dev/lib/python3.9/site-packages/numpy/core/getlimits.py:473 in __new__
    raise ValueError("data type %r not inexact" % (dtype))
ValueError: data type <class 'numpy.object_'> not inexact

@jakirkham mentioned that cuDF has a native decimal dtype, so that could in principle be made to work with finfo. That said, the solution is the same as for float128 (a custom object being returned, see below).
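For comparison, the standard library's decimal module already exposes finfo-like limits through its context object. This is shown only as an analogy for what a decimal "finfo" might report; it is not part of the array API:

```python
import decimal

ctx = decimal.getcontext()
print(ctx.prec)  # precision in significant decimal digits (default 28)
print(ctx.Emax)  # largest adjusted exponent (default 999999)
print(ctx.Emin)  # smallest adjusted exponent (default -999999)
```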

A >64-bit type like float128 or another custom dtype like decimal can always be supported as a custom object which duck types with a Python float. We had this same discussion for constants (see my comment above), and I think we've had it as well in places where the return value of a function is a tuple of arrays. In all those cases it is fine to return custom objects that duck type with the Python builtin types in the standard. One common reason that libraries may do this is to be able to keep the results on a GPU.
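A minimal sketch of what such a duck-typed field value could look like. The class name and attributes here are hypothetical, not from any library; a real implementation might keep the underlying value in higher precision or on a device and only convert on demand:

```python
class FloatLike:
    """Hypothetical finfo field value that duck types with Python float."""

    def __init__(self, value):
        self._value = value

    def __float__(self):
        # Possibly lossy conversion for contexts that require a builtin float.
        return float(self._value)

    def __eq__(self, other):
        return float(self) == float(other)

    def __repr__(self):
        return repr(self._value)


eps = FloatLike(2.220446049250313e-16)
print(float(eps) == 2.220446049250313e-16)  # True: usable as a float
```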

So the proposed solution to close this issue is: add a new section to the standard which explicitly mentions that it's fine to implement duck type objects, the reasons why libraries may do that, and give examples for constants (pi, e, inf, nan), finfo and a function that returns a tuple.

rgommers avatar Mar 24 '22 20:03 rgommers