m-harmonic
Hello, I am seeing the same error as others have mentioned. I am using `deepspeed_stage_3` with PyTorch Lightning, and all DeepSpeed settings are left at their defaults: ``` trainer = lightning.Trainer( strategy...
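For context, a minimal sketch of the kind of default setup being described (the exact trainer arguments above are truncated, so everything beyond the strategy name is an assumption):

```python
# Hypothetical minimal setup, assuming pytorch-lightning and deepspeed are
# installed; all DeepSpeed-specific knobs are left at their defaults.
import lightning

trainer = lightning.Trainer(
    strategy="deepspeed_stage_3",  # ZeRO stage 3: shards optimizer state, gradients, and parameters
    accelerator="gpu",
    devices=4,            # assumed value for illustration
    precision="16-mixed",  # assumed value for illustration
)
```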
Allow Stream's `repeat` option to cycle through entire dataset before repeating, when `shuffle=True`
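To make the requested semantics concrete, here is a toy sketch (not the streaming library's actual implementation): with shuffling on, each repeat should be a complete, independently shuffled pass over the dataset, so no sample appears a second time until every sample has appeared once.

```python
import random

def epoch_order(num_samples: int, repeat: int, seed: int = 0) -> list[int]:
    """Toy sketch of the requested behavior: emit a full shuffled pass over
    every sample before starting the next repeat, rather than interleaving
    repeated copies of samples within a single pass."""
    rng = random.Random(seed)
    order = []
    for _ in range(repeat):
        epoch = list(range(num_samples))
        rng.shuffle(epoch)  # reshuffle independently for each pass
        order.extend(epoch)
    return order

order = epoch_order(5, repeat=2)
# Each half of the order is a complete permutation of the dataset.
assert sorted(order[:5]) == list(range(5))
assert sorted(order[5:]) == list(range(5))
```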
@karan6181 Yes, exactly: we have cases with multiple streams, some of which have multiple repeats. Separately, we are also experiencing a problem that is forcing us to...
Does anyone know about this bug with `n>1`? Thanks https://github.com/vllm-project/vllm/issues/12584
I'm running into the same issue. Does anyone know of a workaround? We don't need `best_of` or `use_beam_search`. We can reproduce using vLLM's provided `benchmark_throughput.py`. This runs ok: ``` python...
@comaniac Hi, just wondering if someone working on vLLM can provide an update on this. We want to use the multi-step scheduler because the throughput is much better for our needs,...
> @afeldman-nm has a WIP branch for this Thanks — are you referring to the branch linked above that disables the multi-step scheduler?
> > This modification makes the "fork" mechanism of vLLM completely unused. Previously, for a request with n > 1, its prompt was prefilled only once, and then the sequence...
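The fork mechanism described in the quote above can be illustrated with a toy sketch (this is not vLLM's actual code; the class and function names are invented for illustration): the expensive prompt prefill runs once, and the resulting state is then shared by reference across the n child sequences, which decode independently.

```python
class Sequence:
    """Toy stand-in for a per-request sequence with a prefilled prompt state."""
    def __init__(self, prompt_state: list[str]):
        self.prompt_state = prompt_state  # stand-in for the prompt's KV cache
        self.tokens: list[str] = []       # decode-side tokens, per child

def prefill(prompt: str) -> Sequence:
    # Expensive step: run the model over the prompt exactly once.
    return Sequence(prompt_state=list(prompt))

def fork(parent: Sequence, n: int) -> list[Sequence]:
    # Cheap step: each of the n children shares the prefilled prompt state
    # by reference (no recomputation); only the decode-side token list
    # is copied so children can diverge independently.
    children = []
    for _ in range(n):
        child = Sequence(parent.prompt_state)  # shared, not recomputed
        child.tokens = list(parent.tokens)
        children.append(child)
    return children

parent = prefill("hello")
children = fork(parent, n=3)
assert all(c.prompt_state is parent.prompt_state for c in children)
```

The point of the mechanism is exactly what the quote describes: for a request with n > 1, prefill cost is paid once, and only the comparatively cheap decode state is duplicated per child.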
> > It seems like this PR is implementing ideas similar to those implemented in PR #9302 for the V0 engine. That PR created some issues that were addressed in...
> @m-harmonic > > > Thanks for working on this. I haven't had a chance to look into the specifics of the new implementation but also wanted to ask about...
> > > @m-harmonic > > > > Thanks for working on this. I haven't had a chance to look into the specifics of the new implementation but also wanted...