Marko Tasic
Fix: pex does not accept `extract-msg`
Pre-compiled Python binding for llama.cpp using **cffi**. Supports CPU and CUDA 12.5 execution. A binary distribution that does not require complex compilation steps. Installation is as simple and fast as `pip...
### Project URL

https://github.com/tangledgroup/llama-cpp-cffi

### Does this project already exist?

- [X] Yes

### New Limit

200 MB

### Update issue title

- [X] I have updated the title.

###...
According to the bnb documentation here:

https://huggingface.co/docs/bitsandbytes/main/optimizers
https://huggingface.co/docs/bitsandbytes/main/explanations/optimizers#stable-embedding-layer

This line could switch between `bnb.nn.StableEmbedding` and `torch.nn.Embedding`, or the choice could be made configurable in the config file:

https://github.com/Lightning-AI/litgpt/blob/a8aa4bae5043b81b0b5e54bed838d1b57e1e1fe7/litgpt/model.py#L28

There are also other places...
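A minimal sketch of what a configurable choice could look like (the `use_stable_embedding` flag and the `build_embedding` helper are hypothetical illustrations, not part of litgpt's actual `Config`):

```python
import torch.nn as nn

def build_embedding(vocab_size: int, n_embd: int, use_stable_embedding: bool) -> nn.Module:
    """Return either bitsandbytes' StableEmbedding or a plain torch Embedding.

    `use_stable_embedding` is a hypothetical config flag for illustration.
    """
    if use_stable_embedding:
        # StableEmbedding adds a LayerNorm and keeps 32-bit optimizer state
        # for the embedding weights; requires bitsandbytes to be installed.
        import bitsandbytes as bnb
        return bnb.nn.StableEmbedding(vocab_size, n_embd)
    return nn.Embedding(vocab_size, n_embd)
```

The import of `bitsandbytes` is kept inside the branch so the plain-`torch.nn.Embedding` path still works when bitsandbytes is not installed.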
Hi there, it would be great to have support for the **GrokAdamW** optimizer, but with low-bit quantization. You can check the reference implementation: https://github.com/cognitivecomputations/grokadamw It has already shown promising results.
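For context, "low-bit" here means storing optimizer state (e.g. the moment estimates) as 8-bit codes instead of fp32. A minimal sketch of blockwise absmax quantization, the general scheme bitsandbytes uses for its 8-bit optimizers (function names are illustrative, not an actual bitsandbytes or GrokAdamW API):

```python
import numpy as np

def quantize_blockwise_8bit(x: np.ndarray, block: int = 64):
    """Quantize a flat fp32 state tensor to int8 codes plus per-block scales."""
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True)  # absmax per block
    scale = np.where(scale == 0, 1.0, scale)      # avoid division by zero
    codes = np.round(x / scale * 127).astype(np.int8)
    return codes, scale.astype(np.float32)

def dequantize_blockwise_8bit(codes: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an fp32 approximation of the original state tensor."""
    return (codes.astype(np.float32) / 127.0) * scale
```

An optimizer keeping its moments this way trades a small per-block reconstruction error (at most half a quantization step, i.e. `scale / 254` per element) for roughly 4x memory savings over fp32 state.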
Hi there, great work! Do you have plans for a code dataset, and if yes, when can we expect it?
## 🐍 Package Request

- usearch 2.15.2
- https://github.com/unum-cloud/usearch
- https://pypi.org/project/usearch/
- Package dependencies that need to be resolved first: pybind11, numpy

## Checklists

- [x] I have tried to...
Is it possible to finetune (LoRA) a model with raw LitData, like the data used in pretraining? The main reason is that I want to perform "lightweight" continued pretraining on longer sequences but...
Hi, I pretrained the Qwen 2.5 0.5B base model with a single layer (on purpose). When I chat with the model, it "works." However, when I try to evaluate the model, it fails: ```bash...