Harry Huang
Results: 2 issues of Harry Huang
Hello, while using the `flashinfer_all2allv` backend with vLLM, I noticed that the `mnnvl_moe_alltoallv_prepare_without_allgather()` function returns the `prepared_local_scales` tensor with a `torch.int32` data type ([vLLM issue](https://github.com/vllm-project/vllm/issues/27655)). I was wondering if...
needs-triage