Maxim Kurkin

4 issues by Maxim Kurkin

### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is this question answered in the FAQ? | Is there an...

### 🚀 The feature, motivation and pitch Often used to stabilize LM pretraining, e.g. in the recent [Chameleon](https://arxiv.org/abs/2405.09818) and [PaLM](https://www.jmlr.org/papers/v24/22-1144.html). ### Alternatives [flash-attn](https://github.com/Dao-AILab/flash-attention/blob/main/flash_attn/ops/triton/cross_entropy.py) implements the above-mentioned features; however,...

feature

### The model to consider. https://huggingface.co/IDEA-Research/ChatRex-7B ### The closest model vllm already supports. It is LLaVA-like, but with a more complicated architecture and support for more input types ### What's your difficulty...

new-model
stale