kawakami-k comments

Results: 37 comments by kawakami-k

Why get_num_cores() in aarch64 only return 1?

Sorry for the inconvenience. I'll check the issue this weekend. If you have any additional information, as nSircombe points out, could you please share it?

Why get_num_cores() in aarch64 only return 1?

> Hi sir, I'm trying to use my own thread_pool to replace the omp_thread_pool. During debugging, I found that `get_max_threads_to_use()` in `oneDNN-2.2\src\cpu\platform.cpp` always returns 1 on aarch64. Is that any...

Why get_num_cores() in aarch64 only return 1?

I am aware that this Issue has been assigned to me. Please give me some more time so I can add the feature.

rfc: quantization: extending scaling support

Hi @igorsafo, thanks for the heads-up. It looks good to me. Are there any other RFCs or documents regarding API changes in version 3.0?

ld4 on (v0 - v3)

Thank you for your comment. README.md may be incorrect. The following may be correct:
```
ld4((v0.b - v3.b), ptr(x0));
```
I'll check and fix it.

ld4 on (v0 - v3)

Sorry, `ld4((v0.b - v3.b), ptr(x0));` is a typo. `ld4((v0.b16 - v3.b16), ptr(x0));` and `ld4((v0.b8 - v3.b8), ptr(x0));` are correct. The syntax of Xbyak_aarch64 is defined to be as close as possible to...

How to use mnemonics to express "prfm pldl1keep, [x1, #256]".

Although the default branch of xbyak_aarch64 is set to fjmaster, please use the main branch. We plan to make the main branch the mainline. Please try the following...

How to use mnemonics to express "prfm pldl1keep, [x1, #256]".

It takes less than a second to generate the code. The overhead of code generation is negligible, since deep learning workloads run on the order of hours to days.

How to use mnemonics to express "prfm pldl1keep, [x1, #256]".

Please use the "main" branch. We plan to overwrite the "fj_master" branch with the "main" branch in the near future. You can use 'xbyak_aarch64_code_array.h:dd(uint32_t code)' to write a single 32-bit value at the current...
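
As a rough illustration of the kind of 32-bit word one might hand to `dd()` for the instruction in the issue title, the sketch below computes the A64 encoding of `prfm <prfop>, [Xn, #offset]` (unsigned-offset form) by hand. The helper `encode_prfm_uimm` and the constant `PLDL1KEEP` are hypothetical names introduced here, not part of Xbyak_aarch64. The bit layout assumed is the one given in the Arm architecture reference: a fixed prefix, a 12-bit offset scaled by 8, the base register, and the prefetch operation.

```cpp
#include <cassert>
#include <cstdint>

// Prefetch operation field for PLDL1KEEP (type=PLD, target=L1, policy=KEEP).
constexpr uint32_t PLDL1KEEP = 0;

// Hypothetical helper (not part of Xbyak_aarch64): encodes
// "prfm <prfop>, [Xn, #offset]" in the unsigned-offset form.
// Assumed layout: 1111100110 | imm12 | Rn | Rt, where Rt carries prfop
// and imm12 is the byte offset divided by 8.
inline uint32_t encode_prfm_uimm(uint32_t prfop, uint32_t rn, uint32_t offset) {
    assert(prfop < 32);         // 5-bit prefetch operation
    assert(rn < 32);            // 5-bit base register number
    assert(offset % 8 == 0);    // offset must be a multiple of 8
    assert(offset / 8 < 4096);  // scaled offset must fit in 12 bits
    return 0xF9800000u | ((offset / 8) << 10) | (rn << 5) | prfop;
}
```

Under these assumptions, `encode_prfm_uimm(PLDL1KEEP, 1, 256)` evaluates to `0xF9808020`, which could then be emitted from inside a code generator with `dd(0xF9808020)`.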

How to use mnemonics to express "prfm pldl1keep, [x1, #256]".

Yes, it does. Please be careful not to overflow the code size. The initial code size can be given in the constructor. ```C++ #include "xbyak_aarch64.h" using namespace Xbyak_aarch64; class Generator...