BitNet
Why doesn't ARM support the TL2 kernel?
Hi,
May I ask why ARM doesn't support the TL2 kernel? My guess is that the required SIMD instructions aren't supported?
Because TL2 is optimized for model file size in order to reduce memory I/O, whereas on ARM CPUs memory I/O is not a bottleneck for 2B-scale models.
Thanks for your response!!
So using I2_s and TL1 can decrease latency, but TL2 can't improve it? Would it help if the SIMD lane width were 32 or 64?
In our experiments, TL2 is faster than the others for larger b1.58 models (e.g. 70B or 100B); however, we do not currently have a checkpoint of that size.
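For anyone reading later, here is a rough back-of-the-envelope sketch of why the size argument only starts to matter at larger scales. The bits-per-weight figures are assumptions based on my understanding of the kernels (I2_s storing 2 bits per weight, TL1 packing 2 weights into a 4-bit index, TL2 packing 3 weights into a 5-bit index) and are not stated in this thread. The function name `weight_bytes` is made up for illustration, and the estimate ignores scales, embeddings, and any other non-ternary tensors.

```python
from fractions import Fraction

# Assumed storage cost per ternary weight for each kernel (not from this thread).
BITS_PER_WEIGHT = {
    "I2_s": Fraction(2),     # assumed: 2 bits per weight
    "TL1": Fraction(4, 2),   # assumed: 2 weights per 4-bit index
    "TL2": Fraction(5, 3),   # assumed: 3 weights per 5-bit index
}


def weight_bytes(num_params: int, kernel: str) -> float:
    """Approximate bytes needed to store num_params ternary weights."""
    return float(num_params * BITS_PER_WEIGHT[kernel] / 8)


# Compare the weight footprint at the model sizes mentioned in the thread.
for num_params in (2_000_000_000, 70_000_000_000, 100_000_000_000):
    i2s = weight_bytes(num_params, "I2_s")
    tl2 = weight_bytes(num_params, "TL2")
    print(
        f"{num_params / 1e9:.0f}B params: "
        f"I2_s {i2s / 1e9:.2f} GB, TL2 {tl2 / 1e9:.2f} GB, "
        f"saved {(i2s - tl2) / 1e9:.2f} GB"
    )
```

Under these assumptions the relative saving is the same at every size (about one sixth), but the absolute number of bytes that must be read per token grows with the model, which is consistent with TL2 paying off only once memory I/O becomes the bottleneck.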