Joshua Rosenkranz comments

Results: 9 comments of Joshua Rosenkranz

Python lambda much slower than java class implementing same interface when calling from Java

Is there any way to speed that up (lambda x: x + 1), as the function being passed to java from python is stateless? Or, if not, is there a...

[WIP] MLPSpeculator speculative decoding support

> Do you have link to the actual models so we can try this out and see how is the performance ?
>
> I made a lot of comments...

[WIP] MLPSpeculator speculative decoding support

Closing as this has been merged via https://github.com/huggingface/text-generation-inference/pull/1865

[speculator training] Speculator training

> Hi! What's the status on this PR? I'd like to train a few speculator models, but I'm not sure how to get started, due to a lack of documentation......

[speculator training] Speculator training

> Is this expected to be merged soon?

@philschmid We are expecting to have speculator training merged sometime in the next 2 weeks.

[speculator training] Speculator training

@philschmid This has been finished and merged in #114. @philschmid The speculator training implementation is now available in main. Please let us know if you have any feedback or questions....

[speculator training] Speculator training

Closing in favor of #114

The model conversion to hf is broken with the latest Fused GatedLinearUnit Support in ibm-fms 0.0.6

> The latest changes in 0.0.6 [foundation-model-stack/foundation-model-stack@eccd602](https://github.com/foundation-model-stack/foundation-model-stack/commit/eccd6028cec75f84ce1834a3a18649f5d8fc0641) break the [model conversion code](https://github.com/foundation-model-stack/fms-fsdp/blob/starcoder/fms_to_hf.py)
>
> I am switching back to the foundation-model-stack commit d04def43e9eb8a4e0adf7285c59dd66274e1b724 that still works

Yes, looks like this...

Enable HF PretrainedModel loading for speculative model training

> Couldn't follow the reset logic. Rest everything looks good!

Resetting always occurs on prefill. past_key_value_states=None on every prefill (stage 1 always has past_key_value_states=None, stage 2 sets past_key_value_states to None...