Question: Adapting react-llm to other Llama 1/2 variants
Hi,
Thanks for the amazing project!
I'm currently trying to adapt react-llm to work with a Llama 2 variant I've quantized and compiled to WebAssembly with the MLC LLM library (q4f32_1). I've updated background/index.js to point at my model:
wasmUrl: "/models/Llama-2-7b-f32-q4f32_1/Llama-2-7b-f32-q4f32_1-webgpu.wasm",
cacheUrl: "https://huggingface.co/maryxxx/Llama-2-7b-f32_q4f32_1/resolve/main/params/",
tokenizerUrl: "/models/Llama-2-7b-f32-q4f32_1/tokenizer.model",
The extension loads the params from Hugging Face successfully, but then logs the following error (it still shows the prompt popup form):
Uncaught (in promise) Error: Cannot find function encoding
Then, when I submit a prompt, I receive the following error:
Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'generate')
It seems to me like something is off in the loading process, but I'm not sure where to start with debugging, or whether the problem is my compiled wasm or additional parameters I need to change in the codebase.
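
In case it helps narrow things down, this is the kind of sanity check I can run from the extension console. It uses only standard extension and WebAssembly APIs and assumes the same file paths as the config above, so it only confirms the artifacts are reachable and the wasm compiles, nothing react-llm-specific:

```js
// Sanity check, run from the extension's console.
// Paths match the background/index.js config above.
const wasmUrl = chrome.runtime.getURL(
  "models/Llama-2-7b-f32-q4f32_1/Llama-2-7b-f32-q4f32_1-webgpu.wasm"
);
const tokenizerUrl = chrome.runtime.getURL(
  "models/Llama-2-7b-f32-q4f32_1/tokenizer.model"
);

// 1. Both artifacts should resolve (HTTP 200) from inside the extension.
for (const url of [wasmUrl, tokenizerUrl]) {
  const res = await fetch(url);
  console.log(url, res.status);
}

// 2. The wasm should compile; log its raw exports for reference.
// (As far as I understand, "encoding" is a TVM packed function looked up
// through the runtime, so it wouldn't necessarily appear in this list.)
const buf = await (await fetch(wasmUrl)).arrayBuffer();
const mod = await WebAssembly.compile(buf);
console.log(WebAssembly.Module.exports(mod).map((e) => e.name));
```

Both checks look fine on my end, which is why I suspect the issue is in how the wasm was built rather than in how it's being served.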
If possible, could you please provide an overview of how you prepared the example Vicuna model (in case I am missing a step during compilation), plus any hints about other parameters that might need to be changed in the app?
Thank you!