Question: Adapting react-llm to other Llama 1/2 variants
Hi,
Thanks for the amazing project!
I'm currently trying to adapt react-llm to work with a Llama 2 variant I've quantized and compiled to WebAssembly with the MLC LLM library (q4f32_1). I've updated background/index.js to point at my model:
wasmUrl: "/models/Llama-2-7b-f32-q4f32_1/Llama-2-7b-f32-q4f32_1-webgpu.wasm",
cacheUrl: "https://huggingface.co/maryxxx/Llama-2-7b-f32_q4f32_1/resolve/main/params/",
tokenizerUrl: "/models/Llama-2-7b-f32-q4f32_1/tokenizer.model",
The extension loads the params from Hugging Face successfully, but then logs the following error (it still shows the prompt popup form):
Uncaught (in promise) Error: Cannot find function encoding
Then, when I submit a prompt, I receive the following error:
Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'generate')
It seems to me like something is off in the loading process, but I'm not sure where to start with debugging, or whether the problem is my compiled wasm or additional parameters I need to change in the codebase.
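
In case it helps narrow things down, this is the kind of sanity check I can run from the extension console. It uses only standard extension and WebAssembly APIs and assumes the same file paths as the config above, so it only confirms the artifacts are reachable and the wasm compiles, nothing react-llm-specific:

```js
// Sanity check, run from the extension's console.
// Paths match the background/index.js config above.
const wasmUrl = chrome.runtime.getURL(
  "models/Llama-2-7b-f32-q4f32_1/Llama-2-7b-f32-q4f32_1-webgpu.wasm"
);
const tokenizerUrl = chrome.runtime.getURL(
  "models/Llama-2-7b-f32-q4f32_1/tokenizer.model"
);

// 1. Both artifacts should resolve (HTTP 200) from inside the extension.
for (const url of [wasmUrl, tokenizerUrl]) {
  const res = await fetch(url);
  console.log(url, res.status);
}

// 2. The wasm should compile; log its raw exports for reference.
// (As far as I understand, "encoding" is a TVM packed function looked up
// through the runtime, so it wouldn't necessarily appear in this list.)
const buf = await (await fetch(wasmUrl)).arrayBuffer();
const mod = await WebAssembly.compile(buf);
console.log(WebAssembly.Module.exports(mod).map((e) => e.name));
```

Both checks look fine on my end, which is why I suspect the issue is in how the wasm was built rather than in how it's being served.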
If possible, could you please provide an overview of how you prepared the example Vicuna model (in case I am missing a step during compilation), plus any hints about other parameters that might need to be changed in the app?
Thank you!