Ben Corbett
I have a project where we compute a time step on the GPU and then asynchronously copy some data back to the host for later use. This copy overlaps with...
I'd like to use RAJA's atomics with CUDA's 16-bit half-precision values. Currently the implementation only supports 32- and 64-bit values.
When building 2.6.0 with CUDA 10.1.243 on Lassen I get the following error when linking the user application: ``` /usr/bin/ld: warning: libcupti.so.10.1, needed by /usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/caliper-2.6.0-olcrcfanb2ezov4spvbsu732we7f7qc5/lib64/libcaliper.so.2.6.0, not found (try using...
I figured out a way to greatly reduce the extra test output when running gtest with MPI. All the common stuff like ``` [ RUN ] BoundaryID.info [ OK ]...
**Is your feature request related to a problem? Please describe.** Duplicate (ghosted) nodes are output in the Chombo coupling file. **Describe the solution you'd like** No duplicate nodes. The proposed...
Improve packing tests using `TYPED_TEST` to test multiple policies and data types.
When we're not using pinned memory, our packing buffers are standard host allocations. They are written to directly from the GPU, and this only works on Lassen because of the...
On Quartz with clang 10.0.0, though I imagine it's a problem on other platforms too. I tried with 6-10 ranks; the problem is 10^3, so you'd at least hope that 10...
As of https://github.com/GEOSX/GEOSX/pull/1143 we are pre-allocating 200 edges per node among other things. This is a huge waste of space for internally generated meshes as well as many if not...
Host configs that disable CUDA use `set( CUDA_ENABLED OFF )` when they should be using `ENABLE_CUDA`. There are a few other places as well where `CUDA_ENABLED` is used. We should get...