Ambrose Robinson
### Checklist - [X] I have searched the [existing issues](https://github.com/streamlit/streamlit/issues) for similar feature requests. - [X] I added a descriptive title and summary to this issue. ### Summary I love...
Hi, I'm trying to decide between utilising LMQL and guidance for a project I'm working on (I'm sure you guys get this a lot), and it seems like LMQL is...
Are there any plans to integrate image input into LMQL? With the new GPT-4V and open-source lightweight vision-language models such as [MPlug-Owl](https://github.com/X-PLUG/mPLUG-Owl), it would be incredibly useful. I work...
I'm using LMQL as the front-end for a big project that requires a lot of inference. I am using an A100 80GB but finding inference to be incredibly slow. I...
Here is my original comment from a recently closed (but unresolved) issue: >This issue is not completed! I still think this is very, very necessary functionality that ONNX is completely...
### Describe the issue as clearly as possible: I'm a heavy user of `outlines.models.Transformers` and use the `stream` function after converting to a regex generator via `outlines.generate.regex`; however, when testing...
I made some minor adjustments to the code to try to quantize Minitron-4B-Base (the nemotron architecture has no gate_proj in the MLP), but the resulting model is completely unusable. I think...
This is not so much an issue with AutoAWQ as with transformers, but the `prepare_inputs_for_generation` functions are not being updated to include `position_embeddings`, which models need for decoder layer inference...
I'm new to torchao and QAT but I'm pretty comfortable with PTQ techniques like AWQ and GPTQ. My deployment pipeline requires AWQ format (safetensors supported by autoawq or gptqmodel's new...