Leon Lu comments


Results: 3 comments by Leon Lu

[Tracking Issue] UMA: Universal Modular Accelerator Interface

Hi, thanks for the work. Is there a rough schedule for the development of these features?

[Bug] gemma init crash: AttentionKVCache expected 19 arguments but 18 were provided

I hit a similar issue with llama3.1, which reports the error below: TVMError: Function flashinfer.attention_kernel_prefill_with_ragged_kv_cache_begin_forward(0: DLTensor*, 1: DLTensor*, 2: int64_t, 3: int64_t, 4: int64_t, 5: int64_t, 6: void*) -> void expects 7...

[Bug] gemma init crash: AttentionKVCache expected 19 arguments but 18 were provided

@mengshyu, I am not using the APK; I only use the command line on Ubuntu 20.04. The same model worked before I upgraded the mlc-llm code base.