Lain

Results 18 comments of Lain

I wrapped the structures behind the `featureEncoder` in `with tf.control_dependencies([feature]):` and now the timeline result seems fine. It's nearly the same as the result from `sbnet_module.cuda_timer`. However, the time cost of the...

Is there any other image source for English words?😨

I happened to find an image crawler capable of downloading images from bing/google/baidu/flickr etc. https://pypi.org/project/icrawler/0.1.1/

I also noticed that Youdao offers images for vocabulary words, so I wrote a crawler to fetch the links to those images.

> The unittest should cover both cases (using scalar bmm scale or device bmm scale),

@yzh119 At least [fp8](https://github.com/flashinfer-ai/flashinfer/blob/main/tests/attention/test_trtllm_gen_attention.py#L153-L154) is using a tensor for bmm1_scale and bmm2_scale in the unit test...

@azhurkevich No, `group_size` is still a runtime argument at this stage. @NihalPotdar The condition `(args.group_size % size(TileShape{})) == 0` is too strong. I agree with you. It forces the whole...
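
To make the effect of that divisibility check concrete, here is a minimal Python sketch of the same predicate. The names `passes_group_size_check` and `TILE_EXTENT`, and the value 64, are hypothetical and chosen only for illustration; they are not taken from the library.

```python
def passes_group_size_check(group_size: int, tile_extent: int) -> bool:
    """Mirror of the quoted condition: group_size % tile_extent == 0."""
    return group_size % tile_extent == 0


# Hypothetical tile extent, used only to illustrate the check.
TILE_EXTENT = 64

# Candidate runtime group sizes; only exact multiples of the extent pass.
candidates = (32, 64, 96, 128, 256)
accepted = [g for g in candidates if passes_group_size_check(g, TILE_EXTENT)]
rejected = [g for g in candidates if not passes_group_size_check(g, TILE_EXTENT)]
```

Under this assumed extent, a group size smaller than the extent (32) and one that is not a multiple of it (96) are both rejected, which is the sense in which the condition may be stricter than needed.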

@NihalPotdar Oh sorry, I made a mistake. Setting the second dimension of `SmemLayoutAtomScale` and `ScaleTileShape` to 1 broadcasts one scale value across the whole K dimension of a tile. To make the...
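
The broadcasting described above can be sketched in plain Python, assuming a tile stored as rows of K values and a scale block of shape (rows, 1). The function `apply_broadcast_scale` and the sample data are hypothetical; this only illustrates the idea and is not the library's implementation.

```python
def apply_broadcast_scale(tile, scales):
    """Scale each row of `tile` (rows x K) by the single value in the
    matching row of `scales` (rows x 1), i.e. broadcast along K."""
    if len(tile) != len(scales):
        raise ValueError("tile and scales must have the same number of rows")
    scaled = []
    for row, scale_row in zip(tile, scales):
        if len(scale_row) != 1:
            raise ValueError("each scale row must hold exactly one value")
        s = scale_row[0]
        # The one scale value is reused for every element along K.
        scaled.append([value * s for value in row])
    return scaled


# Hypothetical 2 x 4 tile (K = 4) with one scale per row.
tile = [[1, 2, 3, 4], [5, 6, 7, 8]]
scales = [[0.5], [2.0]]
scaled = apply_broadcast_scale(tile, scales)
```

Each row ends up multiplied by a single scale, so finer-grained scaling along K would need a scale block whose second dimension is larger than 1.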