Paolo Rechia
@BIGPPWONG how much prompting did you do? I've just hacked a local setup for Vicuna-7B running on GPU (based on the Hugging Face implementation, not llama.cpp) to work with Langchain ReAct...
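For context, the wiring looks roughly like the sketch below, not my exact code: the checkpoint name, tools, and generation settings are placeholder assumptions. The idea is to wrap a local transformers pipeline in LangChain's `HuggingFacePipeline` and hand it to the zero-shot ReAct agent.

```python
# Hedged sketch: local Vicuna-7B on GPU driving a LangChain ReAct agent.
# The model id, tools, and generation settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms import HuggingFacePipeline
from langchain.agents import AgentType, initialize_agent, load_tools

model_id = "lmsys/vicuna-7b-v1.3"  # assumed checkpoint; a local path works too
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

# Plain text-generation pipeline; greedy decoding keeps the ReAct traces stable.
generate = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    do_sample=False,
)
llm = HuggingFacePipeline(pipeline=generate)

# Zero-shot ReAct agent with a single example tool.
tools = load_tools(["llm-math"], llm=llm)
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run("What is 7 raised to the power of 0.5?")
```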
@vowelparrot Yes, that’s it! I actually only noticed I could override the function after I opened this PR. So I think some additional documentation would be helpful (e.g., a notebook...
This PR was squashed today. Added a couple of modifications that made it easier to integrate into guidance. Here's the PR in the guidance repository: https://github.com/microsoft/guidance/pull/298 I wouldn't be surprised if...
@tensiondriven Agreed, hopefully someone more knowledgeable than me will pick up the work so far and help complete the integration. Let’s see 🙂
Hey, @qeternity, my first attempt to support the guidance ‘select’ command was quite hacky. It uses a second trie instead of the inference logprobs, which is different from the transformers...
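To give a rough picture of what I mean by the second trie, here's a minimal sketch (not the code from the PR; the class and variable names are made up): the candidate strings are tokenized into a prefix trie, and a `LogitsProcessor` masks every token that doesn't continue one of the candidates, instead of reading the choice back from the inference logprobs.

```python
# Hedged sketch of a trie-constrained 'select'; all names are hypothetical.
import torch
from transformers import LogitsProcessor


class TrieSelectProcessor(LogitsProcessor):
    """Restrict generation to one of a fixed set of option strings."""

    def __init__(self, tokenizer, options, prompt_len):
        self.prompt_len = prompt_len
        self.eos_token_id = tokenizer.eos_token_id
        # Trie flattened into a dict: partial token path -> allowed next token ids.
        self.allowed = {}
        for option in options:
            ids = tokenizer.encode(option, add_special_tokens=False)
            for i in range(len(ids)):
                self.allowed.setdefault(tuple(ids[:i]), set()).add(ids[i])

    def __call__(self, input_ids, scores):
        # Tokens generated after the prompt form the current path in the trie.
        path = tuple(input_ids[0, self.prompt_len:].tolist())
        next_ids = self.allowed.get(path, set())
        if not next_ids:
            # A complete option has been emitted: only EOS stays legal.
            next_ids = {self.eos_token_id}
        mask = torch.full_like(scores, float("-inf"))
        mask[0, list(next_ids)] = 0.0
        return scores + mask
```

It would be plugged into `model.generate(..., logits_processor=LogitsProcessorList([processor]))` with `prompt_len` set to the prompt's token count; batch size 1 and greedy decoding are assumed, and overlapping options (one being a prefix of another) aren't handled.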
> I don't think Microsoft feedback is necessary, this will be incredibly useful to a lot of people even if it lives on as a fork.
>
> I think...
@fblissjr thanks for the interest! I believe @qeternity made some good progress in his fork. I’d recommend reading the discussion on https://github.com/microsoft/guidance/pull/298 to pick up some ideas and understand...
@zmarty I can’t answer that, as I’m not a maintainer of this repo. Guidance integration ended up being harder than we expected; we all kind of gave up for...
I see your problem, @zmarty. I hadn’t thought about your use case when I wrote this bias example, interesting. Unfortunately, I’m also not familiar with how the exllama cache...
Stale PR, closing.