Jan Ploski
> There is a check to stop processing if less than 1s of audio remains:
>
> https://github.com/ggerganov/whisper.cpp/blob/a750868428868abd437e228ae5cab763ef3dc387/whisper.cpp#L5271-L5277
>
> I've figured it helps in most situations, but obviously can...
> For everyone's convenience, I've uploaded **llama models converted with the latest transformers git head** here:
>
> **7B** - https://huggingface.co/yahma/llama-7b-hf
> **13B** - https://huggingface.co/yahma/llama-13b-hf

Unfortunately, unlike the decapoda-research/llama-7b-hf model the...
> So to sum it up, it would be nice to have a test configuration which can execute in the free Google Colab notebook - which I know is technically...
Try passing in `n_ctx=2048` as a parameter.
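For reference, a minimal sketch of where such a parameter could go, assuming the llama-cpp-python `Llama` wrapper is in use (the exact interface depends on which bindings the question was about, and the model path below is only a placeholder):

```python
# Minimal sketch, assuming llama-cpp-python; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-7b/ggml-model-q4_0.bin",  # hypothetical local path
    n_ctx=2048,  # enlarge the context window beyond the default
)

out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```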
> `llm_tokenizer_bpe::tokenize` seems to be subtly broken

I implemented an independent port of the [gpt2-tokenizer](https://github.com/openai/gpt-2/blob/master/src/encoder.py#L55-L101) (will share the code if someone is interested) and it shows the same...
> I could imagine this to be a hairy problem, because I'd assume a couple of models have been trained with the fast tokenizers?

Yes, I suppose everyone uses the fast...
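As an aside (not from the thread itself), one way to check whether the fast and slow Hugging Face tokenizers disagree for a given model is simply to encode the same string with both; the model id and test string below are only placeholders:

```python
# Compare fast vs. slow Hugging Face tokenizers; "gpt2" is only a placeholder
# model id, and the test string is arbitrary.
from transformers import AutoTokenizer

model_id = "gpt2"
text = "Hello  world... \tindentation and accents: éè"

fast = AutoTokenizer.from_pretrained(model_id, use_fast=True)
slow = AutoTokenizer.from_pretrained(model_id, use_fast=False)

fast_ids = fast.encode(text)
slow_ids = slow.encode(text)

print("fast:", fast_ids)
print("slow:", slow_ids)
print("identical:", fast_ids == slow_ids)
```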
Note: to be able to use this test script (or alpaca-lora training, really) in the free version of Google Colab, I had to split the yahma/llama-7b-hf model into more shards because of...
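For anyone wanting to do something similar, a minimal sketch of re-sharding a Hugging Face checkpoint with `transformers` (the output directory and the 2 GB shard size are illustrative values, not necessarily what was used here):

```python
# Sketch: re-save a Hugging Face checkpoint with a smaller max shard size so
# that each shard fits comfortably into RAM on the free Colab tier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "yahma/llama-7b-hf"
dst = "./llama-7b-hf-resharded"  # hypothetical output directory

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(src)

model.save_pretrained(dst, max_shard_size="2GB")  # smaller shards, more files
tokenizer.save_pretrained(dst)
```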
I should add that this is a very worthwhile PR, and something like this should be referenced from the front-page documentation. I wasted a LOT of time running into...
Thanks for getting back and for naming the customizations in particular. I started an experimental branch (https://github.com/jploski/ggml/tree/mpt-experiment/examples/mpt) and figured out that ALiBi would be needed in place of LLaMA's RoPE...
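For anyone following along, here is a rough sketch (not the ggml code) of what ALiBi does differently from RoPE: instead of rotating the query/key vectors by position, it adds a fixed per-head linear penalty to the attention scores before the softmax. The slope formula assumes a power-of-two head count, and the tensor shapes are purely illustrative:

```python
# Rough sketch of ALiBi attention biasing (illustrative shapes, not ggml code).
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8); assumes n_heads is
    # a power of two, as in the ALiBi paper.
    start = 2.0 ** (-8.0 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # bias[h, i, j] = slope[h] * (j - i): zero on the diagonal, increasingly
    # negative the further key j lies in the past relative to query i.
    slopes = alibi_slopes(n_heads)                    # (n_heads,)
    pos = torch.arange(seq_len)
    dist = pos[None, :] - pos[:, None]                # (seq, seq), value j - i
    return slopes[:, None, None] * dist[None, :, :]   # (n_heads, seq, seq)

# Usage: add to raw attention scores, then apply the causal mask and softmax.
n_heads, seq_len = 8, 16
scores = torch.randn(n_heads, seq_len, seq_len)
scores = scores + alibi_bias(n_heads, seq_len)
```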