rocking comments

Results: 24 comments of rocking

Low GPU and CPU Usage while Inference / realtime detection

I also have a similar issue. In TensorFlow 1.5, GPU utilization is very low and inference runs slower than on the CPU. ![screenshot from 2018-03-22 10 46 38](https://user-images.githubusercontent.com/9115697/37748532-238d1e64-2dbf-11e8-9a4e-e444708daeae.png) However, in TensorFlow 1.4, the GPU...

Pointwise kernel choose grid size based on number of CU

The elementwise and maxpool backward kernels suffer from this issue. As discussed with @qianfengz, fixing it might require modifying StreamConfig.
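To illustrate the idea in the issue title, here is a minimal sketch of choosing a grid size from the number of compute units (CUs) instead of launching one thread per element. It is not the composable_kernel implementation: the helper names, the `blocks_per_cu` occupancy factor, and the assumption that the kernel uses a grid-stride loop are all hypothetical.

```python
import math


def choose_grid_size(num_elements, block_size, num_cu, blocks_per_cu=4):
    """Hypothetical helper: pick a block count for a grid-stride pointwise kernel."""
    if num_elements <= 0:
        return 0
    # Blocks needed if every thread processed exactly one element.
    blocks_needed = math.ceil(num_elements / block_size)
    # Cap the grid at a small multiple of the CU count (assumed occupancy factor).
    # Elements beyond the cap are covered by the grid-stride loop in the kernel.
    max_blocks = num_cu * blocks_per_cu
    return min(blocks_needed, max_blocks)


def elements_per_thread(num_elements, block_size, grid_size):
    """Upper bound on grid-stride loop iterations per thread for the chosen grid."""
    total_threads = block_size * grid_size
    return math.ceil(num_elements / total_threads)
```

With this scheme a small tensor still gets only as many blocks as it needs, while a large tensor is capped at `num_cu * blocks_per_cu` blocks and each thread loops over several elements. In a real launch the CU count would come from the device properties rather than being passed in by hand.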

elementwise operation Support 1D~5D

https://ontrack-internal.amd.com/browse/LWPCK-190

elementwise operation Support 1D~5D

@ppanchad-amd As mentioned in https://ontrack-internal.amd.com/browse/LWPCK-190, we can close this ticket.

add support for AMD / ROCm / HIP

I just submitted a PR to support AMD / ROCm on FlashAttention 2: https://github.com/Dao-AILab/flash-attention/pull/1010 This PR uses [composable_kernel](https://github.com/ROCm/composable_kernel) as the backend.

add support for AMD / ROCm / HIP

@wsippel Yes, the new PR only works for MI200 and MI300 for now.

add support for AMD / ROCm / HIP

> I have mi100s, would love to be able to use them

We found MI100 may fail in some of the bf16 test cases. Hence, MI100 is not officially support...

add support for AMD / ROCm / HIP

> I would like to look into this bf16 issue. Is the cause well understood or in need of research?

We have been focusing on MI300 improvements recently, but MI100 is still...

add support for AMD / ROCm / HIP

> I would like to concur with ehartford. I'm trying to get the AMD folks to provide more info on the cause of a page fault during the tests which...

add support for AMD / ROCm / HIP

> I have 24 mi100s, I would much want to add support for mi100s, Is there anything I can do to help?

@ehartford You should ask your AMD sales to...