Ouna-the-Dataweaver
Either I'm going insane, or with V1 the qwen 8b instruct LLM just breaks in fp8: around 25% of generations are pure gibberish, with the same running code and everything. Do...
I checked this out using `gh pr checkout 6869` on latest vllm, and it looks like there's a bug: input processing is broken. When I add `print(f'inputs {inputs}\n preprocessed_inputs {preprocessed_inputs} \n...
I know this is still a work in progress, but I tried it both before and after the recent merges, and both times I got errors of more or less the...
Oh, I found the mistake I made. Basically, this PR expects embeds as a tensor without a batch dimension, but transformers LLMs use batched input.

```python
print(f'embeddings shape: {embeds.shape}')
output = self.llama_model.generate(
    inputs_embeds=embeds,
    ...
)
...
```
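To illustrate the mismatch, here is a minimal sketch (the shapes and variable names are assumptions, not taken from the PR): transformers' `generate()` expects `inputs_embeds` of shape `(batch, seq_len, hidden)`, so an unbatched `(seq_len, hidden)` tensor needs a batch dimension prepended with `unsqueeze(0)`.

```python
import torch

# Hypothetical unbatched embeddings, e.g. as this PR produces them:
# shape (seq_len, hidden)
embeds = torch.randn(10, 4096)

# transformers' generate() wants (batch, seq_len, hidden),
# so prepend a batch dimension of size 1
batched = embeds.unsqueeze(0)
print(batched.shape)  # torch.Size([1, 10, 4096])
```

Passing `batched` instead of `embeds` to `generate(inputs_embeds=...)` resolves the shape mismatch in this sketch.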