gtensor issues

sycl updates

adds benchmarks and improvements for sycl backend

half precision axpy

15

As of now, usage of half precision is not straightforward. This holds not only for extension libraries, as mentioned in issue #266, but also for the generation of standard kernels. E.g.,...

cmpfeil

CUDA host/device warnings (shape ctor not device)

2

Building codes that use `gt::adapt_device` can result in a lot of spurious warnings:

```
include/gtensor/gtensor_span.h(263): warning #20011-D: calling a __host__ function("gt::sarray ::sarray(const int *, unsigned long)") from a __host__ __device__...
```

bd4

Pr/sycl host copy

bd4

update sycl_ext_complex

1

Note that for some reason the gtensor-specified AssignN kernel names no longer work with the additional template parameter on the sycl ext complex type; the name is missing the void...

bd4

half precision support

While much of gtensor is type independent, the extension libraries like gt-blas, gt-fft, gt-solver, and the complex helpers have some type-specific details that may need to be modified to...

bd4

causing bugs: gtensor_span may not be contiguous

2

I've seen this in two places now:

```cxx
template inline void gtensor_span::fill(const value_type v)
{
  if (v == T(0)) {
    auto data = gt::backend::raw_pointer_cast(this->data());
    backend::standard::memset(data, 0, sizeof(T) * this->size());
  }...
```

germasch

bug
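
Going by the issue title, the concern with the snippet above appears to be that a `memset` over `sizeof(T) * size()` bytes is only valid when the span's elements are contiguous. The following is a minimal sketch of that failure mode in plain C++. It does not use gtensor: `strided_view` and the `fill_zero*` functions are hypothetical stand-ins, and the stride-based guard is only one possible remedy, not necessarily the fix adopted in gtensor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a 1-d strided view; NOT gtensor's gtensor_span.
struct strided_view
{
  double* data;       // pointer to the first element of the view
  std::size_t size;   // number of elements in the view
  std::size_t stride; // distance between consecutive elements, in doubles
};

// Only valid when stride == 1: zeroes size * sizeof(double) contiguous bytes,
// regardless of where the view's elements actually live.
inline void fill_zero_memset(strided_view v)
{
  std::memset(v.data, 0, sizeof(double) * v.size);
}

// Valid for any stride: visits each element through the stride.
inline void fill_zero_strided(strided_view v)
{
  for (std::size_t i = 0; i < v.size; ++i) {
    v.data[i * v.stride] = 0.0;
  }
}

// Take the memset fast path only when the view is known to be contiguous.
inline void fill_zero(strided_view v)
{
  if (v.stride == 1) {
    fill_zero_memset(v);
  } else {
    fill_zero_strided(v);
  }
}
```

With a view of every second element of a six-element buffer, `fill_zero_memset` zeroes the first three buffer entries, which overwrites one element outside the view and leaves one inside it untouched, while `fill_zero` zeroes exactly the three viewed elements.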

const stream objects

2

If a `stream` or `stream_view` object is const, it becomes useless, as no methods will be allowed. This came up in GENE where there is a helper class which has...

bd4
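
To illustrate the constness problem described above, here is a small sketch in plain C++. All names (`fake_backend_stream`, `stream_view_sketch`, `helper`) are hypothetical and do not reflect gtensor's actual stream classes. It assumes the view only holds a handle to the underlying stream, so its methods can be `const`-qualified without promising that the stream itself is unchanged.

```cpp
#include <cassert>

// Hypothetical stand-in for a backend stream handle (e.g. a queue or stream).
struct fake_backend_stream
{
  int sync_count = 0; // counts how often synchronize was requested
};

// Hypothetical non-owning view; NOT gtensor's stream_view.
class stream_view_sketch
{
public:
  explicit stream_view_sketch(fake_backend_stream* s) : s_(s) {}

  // const-qualified: the view object is not modified, only the stream it
  // refers to, so this stays callable through a const view.
  void synchronize() const { ++s_->sync_count; }

  fake_backend_stream* backend_handle() const { return s_; }

private:
  fake_backend_stream* s_;
};

// A helper class holding a view, with a const member function that still
// needs to use the stream.
struct helper
{
  stream_view_sketch sv;
  void run() const { sv.synchronize(); }
};
```

If `synchronize()` were not `const`, `helper::run() const` would fail to compile, which matches the situation the issue describes for a helper class holding a stream.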

fortran: add gpuAllocatorClearCache

bd4

improve caching allocator

The current caching allocator has a separate cache per instance of the class, which is templated on ``. Separate per space is necessary, but per-type is not. It would be...

bd4

enhancement
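
As a rough sketch of the idea that the cache needs to be separate per memory space but not per element type, the following plain C++ example keys cached blocks by size in bytes and shares one cache per space tag. Everything here is hypothetical (`host_space`, `device_space`, `byte_cache`, `caching_allocator_sketch`), and `std::malloc`/`std::free` stand in for the backend allocation calls. It is not gtensor's caching allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Hypothetical memory space tags.
struct host_space {};
struct device_space {};

// One cache per space, shared by all element types, keyed by size in bytes.
template <typename Space>
class byte_cache
{
public:
  static void* allocate(std::size_t nbytes)
  {
    auto& blocks = free_blocks();
    auto it = blocks.find(nbytes);
    if (it != blocks.end()) {
      void* p = it->second; // reuse a cached block of exactly this size
      blocks.erase(it);
      return p;
    }
    return std::malloc(nbytes); // stand-in for a backend allocation
  }

  // Return a block to the cache instead of freeing it.
  static void deallocate(void* p, std::size_t nbytes)
  {
    free_blocks().emplace(nbytes, p);
  }

  // Release all cached blocks back to the system.
  static void clear()
  {
    for (auto& kv : free_blocks()) {
      std::free(kv.second);
    }
    free_blocks().clear();
  }

private:
  static std::multimap<std::size_t, void*>& free_blocks()
  {
    static std::multimap<std::size_t, void*> blocks;
    return blocks;
  }
};

// Typed front end: converts element counts to bytes and defers to the
// per-space cache, so different T share cached blocks.
template <typename T, typename Space>
struct caching_allocator_sketch
{
  using value_type = T;

  T* allocate(std::size_t n)
  {
    return static_cast<T*>(byte_cache<Space>::allocate(n * sizeof(T)));
  }

  void deallocate(T* p, std::size_t n)
  {
    byte_cache<Space>::deallocate(p, n * sizeof(T));
  }
};
```

In this sketch, a block released by a `double` allocator can be handed out again to a `char` allocator in the same space when the byte sizes match, while allocators in different spaces never share blocks.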
