m-harmonic
Hello, I am seeing the same error as others have mentioned. I am using `deepspeed_stage_3` with PyTorch Lightning, and all DeepSpeed settings are left at their defaults: ``` trainer = lightning.Trainer( strategy...
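For context, a minimal sketch of the kind of default setup being described (the exact trainer arguments above are truncated, so everything beyond the strategy name is an assumption):

```python
# Hypothetical minimal setup, assuming pytorch-lightning and deepspeed are
# installed; all DeepSpeed-specific knobs are left at their defaults.
import lightning

trainer = lightning.Trainer(
    strategy="deepspeed_stage_3",  # ZeRO stage 3: shards optimizer state, gradients, and parameters
    accelerator="gpu",
    devices=4,            # assumed value for illustration
    precision="16-mixed",  # assumed value for illustration
)
```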
Allow Stream's `repeat` option to cycle through entire dataset before repeating, when `shuffle=True`
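To make the requested semantics concrete, here is a toy sketch (not the streaming library's actual implementation): with shuffling on, each repeat should be a complete, independently shuffled pass over the dataset, so no sample appears a second time until every sample has appeared once.

```python
import random

def epoch_order(num_samples: int, repeat: int, seed: int = 0) -> list[int]:
    """Toy sketch of the requested behavior: emit a full shuffled pass over
    every sample before starting the next repeat, rather than interleaving
    repeated copies of samples within a single pass."""
    rng = random.Random(seed)
    order = []
    for _ in range(repeat):
        epoch = list(range(num_samples))
        rng.shuffle(epoch)  # reshuffle independently for each pass
        order.extend(epoch)
    return order

order = epoch_order(5, repeat=2)
# Each half of the order is a complete permutation of the dataset.
assert sorted(order[:5]) == list(range(5))
assert sorted(order[5:]) == list(range(5))
```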
@karan6181 Yes, exactly: we have cases with multiple streams, some of which have multiple repeats. Separately, we are also experiencing a problem that is forcing us to...
Does anyone know about this bug with `n>1`? Thanks https://github.com/vllm-project/vllm/issues/12584
I'm running into the same issue. Does anyone know of a workaround? We don't need `best_of` or `use_beam_search`. We can reproduce using vLLM's provided `benchmark_throughput.py`. This runs ok: ``` python...
@comaniac Hi, just wondering if someone working on vLLM can provide an update on this. We want to use the multi-step scheduler because the throughput is much better for our needs,...
> @afeldman-nm has a WIP branch for this Thanks — are you referring to the branch linked above that disables the multi-step scheduler?
> > This modification makes the "fork" mechanism of vLLM completely unused. Previously, for a request with n > 1, its prompt was prefilled only once, and then the sequence...
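The fork mechanism described in the quote above can be illustrated with a toy sketch (this is not vLLM's actual code; the class and function names are invented for illustration): the expensive prompt prefill runs once, and the resulting state is then shared by reference across the n child sequences, which decode independently.

```python
class Sequence:
    """Toy stand-in for a per-request sequence with a prefilled prompt state."""
    def __init__(self, prompt_state: list[str]):
        self.prompt_state = prompt_state  # stand-in for the prompt's KV cache
        self.tokens: list[str] = []       # decode-side tokens, per child

def prefill(prompt: str) -> Sequence:
    # Expensive step: run the model over the prompt exactly once.
    return Sequence(prompt_state=list(prompt))

def fork(parent: Sequence, n: int) -> list[Sequence]:
    # Cheap step: each of the n children shares the prefilled prompt state
    # by reference (no recomputation); only the decode-side token list
    # is copied so children can diverge independently.
    children = []
    for _ in range(n):
        child = Sequence(parent.prompt_state)  # shared, not recomputed
        child.tokens = list(parent.tokens)
        children.append(child)
    return children

parent = prefill("hello")
children = fork(parent, n=3)
assert all(c.prompt_state is parent.prompt_state for c in children)
```

The point of the mechanism is exactly what the quote describes: for a request with n > 1, prefill cost is paid once, and only the comparatively cheap decode state is duplicated per child.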
> > It seems like this PR is implementing ideas similar to those implemented in PR #9302 for the V0 engine. That PR created some issues that were addressed in...
> @m-harmonic > > > Thanks for working on this. I haven't had a chance to look into the specifics of the new implementation but also wanted to ask about...
> > > @m-harmonic > > > > Thanks for working on this. I haven't had a chance to look into the specifics of the new implementation but also wanted...