w1005444804
> OK, Sir, check which operations your net uses; for some hardware-sensitive operations like convolution, ncnn may use packing, which means changing the memory layout to fit the hardware's L1 cache, or other...
Testing shows that ConvolutionDepthWise takes noticeably longer than Convolution???
@lyogavin
```cpp
void convolution_node::backward2data(const dnnl::memory& diff_dst) {
    m_src_diff_md = dnnl::memory::desc(m_src_dims, dt::f32, tag::any);
    m_weights_diff_md = dnnl::memory::desc({ m_weights_dims }, dt::f32, tag::any);
    m_dst_diff_md = dnnl::memory::desc({ m_dst_dims }, dt::f32, tag::any);
    // ...
    // std::cout
```
dnnl::convolution_backward_data is quite time-consuming:
- infer cost (ms): 10
- backward2data cost (ms): 232 (however, PyTorch/LibTorch costs 30~50 ms)
- backward2weights cost (ms): 12
@igorsafo thanks, activating ONEDNN_VERBOSE does have some effect, but the result is very unstable: the time has changed from the previous 230 ms to a dynamic range of...
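For reference, oneDNN's verbose tracing is controlled by an environment variable set before launching the program; a minimal sketch (the binary name `./conv_benchmark` is a placeholder, not from this thread):

```shell
# 1 = trace primitive execution; 2 = also trace primitive creation
export ONEDNN_VERBOSE=1
./conv_benchmark   # placeholder for the benchmark binary
```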
Hi @igorsafo, is the problem caused by something on my side?
@igorsafo Yes, it is the first layer; my model is a single conv layer. I just wanted to test the speed of forward and backward propagation of convolutions, and then found this...
The code is roughly as follows:

```cpp
// ...
dnnl::memory::dims conv1_src_tz = { 10, 3, 160, 160 };
auto conv1_src_memory = dnnl::memory({ {conv1_src_tz}, dt::f32, tag::nchw }, engine);
convolution_node conv1(engine, 3, 6, 5, ...
```
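For comparison, here is a minimal PyTorch timing sketch with the same shapes as above (10×3×160×160 input, 3→6 channels, 5×5 kernel). The warm-up pass and the `sum()` loss are my additions, not from the thread; single-run `perf_counter` timings are only a rough proxy for a proper benchmark.

```python
import time
import torch

x = torch.randn(10, 3, 160, 160, requires_grad=True)
conv = torch.nn.Conv2d(3, 6, 5)  # in=3, out=6, kernel=5x5

# Warm-up pass so lazy initialization doesn't pollute the timing
conv(x).sum().backward()
x.grad = None

t0 = time.perf_counter()
y = conv(x)
fwd_ms = (time.perf_counter() - t0) * 1000

t0 = time.perf_counter()
y.sum().backward()
bwd_ms = (time.perf_counter() - t0) * 1000

print(f"forward: {fwd_ms:.2f} ms, backward: {bwd_ms:.2f} ms")
```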
@heyoeyo You're right, but I want to know whether it's possible to choose different shapes for the input of the ViT (encoder).