Roma Dubtsov

Results: 12 comments by Roma Dubtsov

Thanks! I'll try to work around it then.

Hi, I'm one of the developers who worked on this package. I've looked at the [run.sh](https://github.com/soumith/convnet-benchmarks/blob/cpu/intel_optimized_technical_preview_for_multinode_caffe_1.0/run.sh) and the only suggestion I have is to enable OpenMP thread affinity by setting...

@andravin, > Are other processors (e.g. i7) affected by AVX2 frequencies, if so where can we find documentation of the AVX2 frequencies for those processors? Probably the CPU support...

@ozabluda, Here's some data from a 2xE5-2697v3 machine (sorry, I did not have a desktop machine with a proper OS handy). My colleague timed IntelCaffe on 14 and 28 cores...

@andravin > > Are other processors (eg i7) affected by AVX2 frequencies, if so where can we find documentation of the AVX2 frequencies for those processors? > > Probably the...

Reproduced for 274be82 but not for tip of master. Is there a particular reason you cannot use the latest revision?

We certainly want to resolve this. We are just still bikeshedding the solution :) In DNNL we have scratchpads as well, so we are not sure if we need a...

Hello @ww5862. Thanks for the report. A few questions: 1. Can you please post the output of the test run after setting the environment variable `CUBLASLT_LOG_MASK=64` (e.g. `export CUBLASLT_LOG_MASK=64` in bash)? ([documentation](https://docs.nvidia.com/cuda/cublas/#cublaslt-logging))....

Hi @gcomes. Thanks for the report. I will certainly look into this, but unfortunately I am not able to dedicate any serious time to this project. The only thing I...

This has been fixed starting from cuBLAS 12.6 Update 2 (https://developer.nvidia.com/cuda-12-6-2-download-archive). Closing.