Mateusz P. Nowak
Optimization failed: the critical loop (in the splitters() function in sort.hpp) seems impossible to parallelize for efficient GPU execution.
Suspended, passed to @lslusarczyk. The status:
- qsub + mpirun working, running the multinode benchmarks (single node temporarily disabled in the branch)
- plotter generating only part of the figures
-...
Problem with assert in intel_transport_send.h at line 2012 is solved in IMPI 2021.11 (tested on devcloud, with IMPI 2021.11 installed in home dir)
I_MPI_OFFLOAD=0 mpirun -n 2 ./build/benchmarks/gbench/mhp/mhp-bench --sycl --benchmark_filter=Sort_DR
-> Assertion failed in file ../../src/mpid/ch4/shm/posix/eager/include/intel_transport_recv.h at line 1175: cma_read_nbytes == size
However, with I_MPI_OFFLOAD=1 (which should be used with IMPI on GPU)...
Tests added in #300
- implementation of distributed_vector (#103) must be finished
- then test mhp::transform() on the above vector
- actually implement stencil-2d analogous to stencil-1d
With Robert's support: distributed_vector done in this task, and stencil-1d-array example done in this task. New tasks created to cover the implementation of a real distributed_dense_matrix.
@intel/llvm-reviewers-runtime this is a friendly invitation to review.
Also fails on the ARL integrated GPU; disabled in #20890.