Leonardo Solis V. comments

Results: 10 comments of Leonardo Solis V.

POCL testing on multicore CPU

POCL has been used successfully for multithreaded CPU executions (AMD and Intel). **So far, only Solis-Wets has been tested**. Documentation reporting this is still missing.

POCL testing on multicore CPU

A c5.18xlarge instance (72 vCPUs) running Ubuntu 18.04.1 was created on Jan 15th 2019.

Continuous Integration via Travis

Hi @atillack,

> `NUMWI=1` I think might still be buggy (although there shouldn't be any hard requirements for `NUMWI=4`)

Yes, `NUMWI=1` might be buggy. With the current CI configuration, such...

Continuous Integration via Travis

@atillack I forgot to answer:

> Do you have a working version of it somewhere?

Not sure if I understand your question, but I assume you want to test the...

Continuous Integration via Travis

I updated the PR so that CI is based on GitHub Actions instead of Travis. @atillack @diogomart @jeeberhardt, please have a look and provide feedback :)

Error Compiling Autodock

Hi @jssantiagojr,

> nvcc fatal : Unsupported gpu architecture 'compute_80'

Which GPU are you targeting?

Speeding up sum reductions in ADADELTA by using Tensor Cores

> @L30nardoSV Thank you very much, I am currently testing. Please encapsulate the code a bit and make it a compile option so older Cuda versions and cards still compile...

Speeding up sum reductions in ADADELTA by using Tensor Cores

@atillack Can you please check commit [b2ab3fe](https://github.com/ccsb-scripps/AutoDock-GPU/pull/252/commits/b2ab3fe6b79e08b7c1af5f11233070ac6c42d025) that incorporates a WMMA extension for single-precision matmul on Tensor Cores + error correction (TCEC)? `make DEVICE=GPU TESTLS=ad NUMWI=64 TARGETS=80 TENSOR=ON TCEC=ON...

Speeding up sum reductions in ADADELTA by using Tensor Cores

> While it looks like the search efficiency (@diogomart please test) might be OK now, overall there does not seem to be an actual speedup (if you normalize by the...