StableLM
StableLM: Stability AI Language Models
Is it possible to support a larger context window, since that lets smaller models do more complicated things? A lot of the negatives of a smaller model can be rectified...
Does using a diffusion model in a language model increase the generality of the language model?
It would be great to get the instructions to run the 3B model locally on a gaming GPU (e.g. 3090/4090 with 24GB VRAM). ### Confirmed GPUs From this thread |...
Hi, I want to fine-tune the 7B model. Am I supposed to download the provided checkpoint and fine-tune it as shown in the GPT-NeoX repo (https://github.com/EleutherAI/gpt-neox#using-custom-data)? Would they be compatible...
As seen [in this popular spreadsheet](https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYpb63e1ZR3aePczz3zlbJW-Y4) by @lhl, StableLM-Alpha-7B currently scores below 5-year-old 1GB models with 700M parameters and well below its architectural cousin GPT-J-6B which is...
Thanks for your amazing work! We have simply extended StableLM for video question answering in our project [Ask-Anything](https://github.com/OpenGVLab/Ask-Anything/tree/main/video_chat_with_StableLM). In our attempts, it can generate longer content than ChatGPT, but without...
Hi there! First of all, thank you for the amazing work! The readme says the models were trained on "the new dataset based on The Pile" which is 3x the...
Use a set for stop_ids instead of a full list search.
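A minimal sketch of what this suggestion could look like, in plain Python so it runs without any model code. The name `should_stop` and the ID values in `STOP_IDS` are assumptions for illustration, not taken from the repository; substitute the stop-token IDs that match your tokenizer.

```python
from typing import FrozenSet, Sequence

# Placeholder stop-token IDs (assumed for illustration only).
STOP_IDS: FrozenSet[int] = frozenset({0, 1, 50277, 50278, 50279})


def should_stop(token_ids: Sequence[int],
                stop_ids: FrozenSet[int] = STOP_IDS) -> bool:
    """Return True when the most recently generated token is a stop token."""
    if not token_ids:
        # Nothing generated yet, so there is nothing to stop on.
        return False
    # Set membership is a hash lookup rather than a scan over every stop ID.
    return token_ids[-1] in stop_ids
```

With only a handful of stop IDs the speed difference is likely small; the main gain is that the check becomes a single membership test. If the last token comes from a tensor, convert it to a plain `int` first, since a tensor element will not match the integers in the set by hash.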