Artyom Beilis issues

Results: 26 issues of Artyom Beilis

Channel First Convolution Performance on OpenCL driver

# Summary Also oneDNN has good performance in channels last format (for example ./tests/benchdnn/benchdnn --engine=gpu:0 --mode=P --conv --stag=nhwc --cfg=f32 --dir=FWD_B mb64ic64ih56oc64oh56kh3ph1n Gives around 295.582 ~ 75% of GFlops for `Intel(R)...
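For reference, the amount of work in the convolution named by that benchdnn descriptor can be estimated with the usual multiply-add count. This is only a sketch: the helper names are hypothetical, and it assumes that the omitted width values equal the height values (`ow=56`, `kw=3`) and that the bias term is negligible.

```python
def conv2d_forward_flops(mb, ic, oc, oh, ow, kh, kw):
    """Approximate FLOPs of a dense 2D forward convolution.

    Each output element needs ic * kh * kw multiply-adds,
    and one multiply-add is counted as 2 floating-point operations.
    """
    return 2 * mb * oc * oh * ow * ic * kh * kw


def achieved_gflops(flops, seconds):
    """Convert a FLOP count and a measured run time into GFLOPS."""
    return flops / seconds / 1e9


# Shape taken from the descriptor mb64ic64ih56oc64oh56kh3ph1;
# ow and kw are ASSUMED equal to oh and kh because they are not given.
flops = conv2d_forward_flops(mb=64, ic=64, oc=64, oh=56, ow=56, kh=3, kw=3)
print(flops)  # 14797504512, i.e. about 14.8 GFLOP per forward pass
```

Dividing this count by a measured run time with `achieved_gflops` gives a figure that can be compared against the throughput quoted in the issue.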

documentation

Support of CuDNN8

Support of CuDNN8 Some of the APIs used by Caffe were removed in cuDNN 8. Without this support it is impossible to run Caffe on the Ampere architecture. It required: -...

Significant Performance reduction in latest OpenCL branch

### Issue summary The performance of the OpenCL Caffe branch has dropped dramatically from 73221fd37a5499f809796fac2ea95daba1a8ce02 to the latest 3f2b97e93ed5ab612b6d00995294e37a422f0931. I can't say at exactly what point, but the difference is significant: These...

Not receiving `GP_EVENT_CAPTURE_COMPLETE` event when capturetarget = Memory Card

**Describe the bug** When using trigger capture and waiting for events, I receive `FILE_ADDED` but never receive the `GP_EVENT_CAPTURE_COMPLETE` event when capturetarget = Memory Card. When capturetarget is set to internal...

the option USE_PYDLPRIM is defined but not used

the option `USE_PYDLPRIM` is defined but not used

clamp and clamp_min need to use self_c for X instead of self

bug

Why was x86 removed?

It is a very popular option for the simulator.

Performance drop for specific tile size M=4096 N=4096 K=16

I'm using an RX 560 (16 CU, 4 GB, gfx803). I ran into a performance issue when working with matrices of this specific size, M=4096, N=4096, K=16; if I modify N to 4097 or 4095...

register vs rename privateuse1_backend

I noticed that in the code you use `register_privateuse1_backend('foo')`, which didn't work in nightly, while `torch.utils.rename_privateuse1_backend('foo')` does work. Is it going to be the future API? I've updated my OpenCL...
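One way a backend could cope with both names mentioned above is a small compatibility shim. The sketch below is an assumption, not the approach used in the issue: `rename_backend` is a hypothetical helper that only checks which of the two functions named in the issue is present on the module object it is given, and calls that one.

```python
def rename_backend(torch_module, name):
    """Call whichever PrivateUse1 naming hook the given module exposes.

    Returns the dotted name of the function that was used.
    Hypothetical helper: it relies only on the two function names
    quoted in the issue and makes no claim about which PyTorch
    versions provide them.
    """
    utils = getattr(torch_module, "utils", None)
    rename = getattr(utils, "rename_privateuse1_backend", None)
    if rename is not None:
        rename(name)
        return "torch.utils.rename_privateuse1_backend"

    register = getattr(torch_module, "register_privateuse1_backend", None)
    if register is not None:
        register(name)
        return "torch.register_privateuse1_backend"

    raise RuntimeError("no PrivateUse1 naming function found on this module")
```

With PyTorch installed, this would be called as `rename_backend(torch, "foo")` after `import torch`.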

Failure to run and build in nightly 2.4

Several issues: 1. Issues with changed const and missing `copy_data` from the Allocator class. 2. `torch.register_privateuse1_backend` -> `torch.utils.rename_privateuse1_backend`. However, even with all that I still get a failure at runtime: ``` Traceback...