Artyom Beilis
# Summary Also, oneDNN has good performance in channels-last format (for example, ./tests/benchdnn/benchdnn --engine=gpu:0 --mode=P --conv --stag=nhwc --cfg=f32 --dir=FWD_B mb64ic64ih56oc64oh56kh3ph1n gives around 295.582 ~ 75% of GFlops for `Intel(R)...
Support of cuDNN 8. Some of the API that was used by Caffe was removed in cuDNN 8. Without this support it is impossible to run Caffe on the Ampere architecture. It required: -...
### Issue summary The performance of the OpenCL Caffe branch has dropped dramatically between 73221fd37a5499f809796fac2ea95daba1a8ce02 and the latest 3f2b97e93ed5ab612b6d00995294e37a422f0931. I can't say at exactly what point, but the difference is significant: These...
**Describe the bug** When using trigger capture and waiting for events, I receive `FILE_ADDED` but never receive the `GP_EVENT_CAPTURE_COMPLETE` event when capturetarget = Memory Card. When capturetarget is set to internal...
The option `USE_PYDLPRIM` is defined but not used.
It is a very popular option for simulators.
I'm using an RX 560 (16 CU, 4 GB, gfx803). I run into a performance issue when working with matrices of this specific size: M=4096, N=4096, K=16. If I modify N to 4097 or 4095...
I noticed that in the code you use `register_privateuse1_backend('foo')`, which didn't work in nightly, while `torch.utils.rename_privateuse1_backend('foo')` does work. Is it going to be the future API? I've updated my OpenCL...
Several issues: 1. Issues with changed const and a missing `copy_data` from the Allocator class. 2. `torch.register_privateuse1_backend` -> `torch.utils.rename_privateuse1_backend`. However, even with all that I still get a failure at runtime: ``` Traceback...