Rick Zhou
cc: @MasterJH5574 Will need https://github.com/flashinfer-ai/flashinfer/pull/126 to be merged first:
@beaufortfrancois @tqchen Thanks a lot for putting in the effort to support WebGPU in service workers. I was able to put up a sample Chrome extension running an LLM in the service...
> please fix the jenkins here

Should be addressed by https://github.com/mlc-ai/mlc-llm/pull/2292. I'm triggering a rebuild now.
To fix the CUDA error: https://github.com/apache/tvm/pull/16982
https://github.com/mlc-ai/mlc-llm/pull/2178
@tqchen PR to change this in JSONFFIEngine: https://github.com/mlc-ai/mlc-llm/pull/2225
Please take a look at the existing operators at https://github.com/apache/tvm/blob/main/python/tvm/relax/frontend/nn/op.py. For example, `torch.stack` can be implemented with `nn.op.unsqueeze` + `nn.op.concat`.
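For illustration, here is the unsqueeze-plus-concat composition sketched in plain NumPy rather than the TVM `nn.op` API — `stack_via_concat` is a hypothetical helper, just to show the pattern:

```python
import numpy as np

def stack_via_concat(tensors, axis=0):
    """Emulate stack by inserting a new axis on each tensor, then concatenating.

    Mirrors the nn.op.unsqueeze + nn.op.concat composition suggested above,
    written in NumPy purely for illustration.
    """
    expanded = [np.expand_dims(t, axis=axis) for t in tensors]  # unsqueeze step
    return np.concatenate(expanded, axis=axis)                  # concat step

a = np.ones((2, 3))
b = np.zeros((2, 3))
out = stack_via_concat([a, b], axis=0)
assert out.shape == (2, 2, 3)
assert np.array_equal(out, np.stack([a, b], axis=0))
```

The same two-step composition maps directly onto the Relax frontend ops when lowering a `torch.stack` call.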
Hi @pchalasani @shahizat I was not able to reproduce the same error on my Mac. I suspect this is due to a git configuration issue. Can you try directly running: ```...
@kidhan1234 Please make sure you're using the correct Python interpreter. Compare `pip show mlc-llm-nightly-cu121` and `python -c "import sys; print(sys.path)"` to make sure that the correct Python path is included.