bumblebee
Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Ideally we should use any layer names we want and then have an explicit name/pattern mapping from hf/transformers names. This way we can keep the models consistent, and also share...
Corollary to Axon issue
Currently there are some inputs applicable to most models (input embeds, head mask, position ids), but not all models accept them. We should add the missing inputs to make...
A wrapper model that uses arbitrary encoder and decoder models. See [this example](https://huggingface.co/docs/transformers/model_doc/encoder-decoder#transformers.EncoderDecoderModel.forward.example).
Hello! As Speech-to-Text models such as Whisper are added, having access to some of the impressive AI Text-to-Speech models would be a nice way to close...
It seems that bumblebee is not capable of loading Mixtral-8x7B models (base or instruct). I've checked the files and it should be able to load the model (in theory) since...
I'm working on adding LLaVA to bumblebee as a learning exercise. I need some guidance on a few things: 1. From the official implementation of LLaVA as seen [here](https://github.com/haotian-liu/LLaVA/blob/main/llava/model/multimodal_encoder/clip_encoder.py),...
Matches the LlamaCPP behavior. I finished the EBNF parser, which encodes the grammar in the same way as the implementation from: https://github.com/huggingface/transformers/pull/27557 Unfortunately, I think we may have to refactor...
A list of ideas to explore:
* [x] Lazy transfers (so we don't load data into the GPU at once)
* [x] FP16 on load
* [x] FP16 policies on...
The main challenge is concurrent ASCII progress bars; perhaps we should show progress in a different way in that case (perhaps a single progress bar with accumulated info).
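A minimal sketch of the "single progress bar with accumulated info" idea: fold the per-file `{bytes_done, bytes_total}` pairs into one overall fraction and render a single bar. The module name, tuple shape, and bar format here are all hypothetical, not part of Bumblebee's actual API.

```elixir
defmodule AggregateProgress do
  @moduledoc """
  Hypothetical sketch: combine the progress of several concurrent
  downloads into one accumulated ASCII progress bar.
  """

  # progresses is a list of {bytes_done, bytes_total} tuples, one per file.
  def fraction(progresses) do
    {done, total} =
      Enum.reduce(progresses, {0, 0}, fn {d, t}, {acc_d, acc_t} ->
        {acc_d + d, acc_t + t}
      end)

    if total == 0, do: 0.0, else: done / total
  end

  # Render a single bar summarizing all files, e.g.
  # "[==========          ] 50.0% (2 files)"
  def render(progresses, width \\ 20) do
    frac = fraction(progresses)
    filled = round(frac * width)
    bar = String.duplicate("=", filled) <> String.duplicate(" ", width - filled)
    "[#{bar}] #{Float.round(frac * 100, 1)}% (#{length(progresses)} files)"
  end
end
```

A caller would re-render this one line (e.g. with a carriage return) whenever any download reports progress, instead of interleaving one bar per file.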