Rahul D Shetty

Results: 9 comments by Rahul D Shetty

The LangChain team has already built this integration: https://python.langchain.com/en/latest/modules/models/llms/integrations/huggingface_textgen_inference.html
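For reference, a minimal sketch of using that wrapper against a text-generation-inference server (the endpoint URL and generation parameters below are illustrative assumptions, not from the linked docs):

```python
# Sketch: point the (older) LangChain TGI wrapper at a running
# text-generation-inference server. URL and parameters are illustrative.
from langchain.llms import HuggingFaceTextGenInference

llm = HuggingFaceTextGenInference(
    inference_server_url="http://localhost:8080/",  # assumed TGI endpoint
    max_new_tokens=256,
    temperature=0.7,
)

print(llm("Write a haiku about GPUs."))
```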

This could be an issue related to bitsandbytes quantization. There is a whole thread with similar linked issues here: https://github.com/TimDettmers/bitsandbytes/issues/538

I've tried the approach suggested by @lukestanley and @loretoparisi and got starcoder.cpp to run in the browser. I've published a demo project here: https://github.com/rahuldshetty/starcoder.js I tried with the [tiny_starcoder_py](https://huggingface.co/bigcode/tiny_starcoder_py) model as...

I was getting a similar issue, then I rolled back the Docker image to an older version and the model started working. Image where it's working: ghcr.io/huggingface/text-generation-inference@sha256:f4e09f01c1dd38bc2e9c9a66e9de1c2e3dc9912c2781440f7ac1eb70f6b1479e Model: tiiuae/falcon-7b-instruct NUM_SHARD: 1 No...
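As a quick sanity check once the container is up, something like this works against TGI's REST API (a sketch; it assumes the container port is mapped to localhost:8080):

```python
# Sketch: verify a running text-generation-inference container responds.
# Assumes the container's port 80 is mapped to localhost:8080.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "What is the capital of France?",
        "parameters": {"max_new_tokens": 32},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```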

Hello @gitknu, you can find the source code for the playground and other examples here: https://github.com/rahuldshetty/ggml.js-examples You just need to provide the relative path to the model file in...

Unfortunately, it's not possible to run the original phi-2 model in the browser with llm.js, mainly due to the memory limitation of the WASM engine.

I haven't tested it, but it might be too buggy (and slow) to run models larger than 2GB with llm.js. It directly leverages the CPU via the WASM engine in the browser without...

Could you share more context on what you mean by distributed? At the moment, what LLM.js does is use the WebAssembly VM running in the browser...

@Joinhack, if you're using Emscripten, then try adding these flags and values during compilation: `-s INITIAL_MEMORY=1000MB -s MAXIMUM_MEMORY=4GB -s STACK_SIZE=11524288 -s ALLOW_MEMORY_GROWTH` This should get around the memory limitations....