Yanan Xie issues

Results: 5 issues of Yanan Xie

FasterTransformer support for storywriter

## 🚀 Feature Request The script located under scripts/inference for converting an HF checkpoint to FT format doesn't work for MPT-7B-Storywriter because it has clip_qkv = 6, unlike other MPT-7B models...

enhancement

Any plan for supporting DPO?

## 🚀 Feature Request Support DPO (Direct Preference Optimization) loss and data loader. ## Motivation Many recent open LLMs have achieved promising results from using DPO instead of RL-style tuning...

enhancement

Per-stream processing

## 🚀 Feature Request When I use multiple Streams to create a StreamingDataset, I want to be able to use a different pre-processing function to process the data in each...

enhancement

Compatibility with transformers.Trainer

## 🚀 Feature Request Currently, StreamingDataset handles distributed data parallel training by itself. This makes it incompatible with Trainers that handle data distribution themselves, such as transformers.Trainer (which also distributes the...

enhancement

[Feature] Qwen2.5-VL SFT + DPO + PPO with Sequence Parallelism

### Feature request Add support for training Qwen2.5-VL with a long context window. ### Motivation / references Qwen2.5-VL is one of the best multimodal open-source models at this moment. The open...

Multimodal

Feature