Haocheng Xi

Results: 12 comments by Haocheng Xi

You can try WSL2, a Linux subsystem that runs on Windows :)

> There is a plan to support them, probably after triton-mlir is merged. As @Jokeren mentioned, we could probably get some slow support working pretty easily, but getting it right...

On my 4090, your code gives CUBLAS: 137.680676, Triton: 205.643462. My Triton is a nightly build from the end of November.
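For context, here is a minimal sketch of the kind of benchmark these numbers likely come from, modeled on Triton's matmul tutorial. The matrix shape and helper are illustrative, not the original script, and only the cuBLAS side is shown (the Triton side would time the tutorial's matmul kernel the same way):

```python
import torch
import triton

def tflops(ms, M, N, K):
    # A GEMM does 2*M*N*K floating-point ops; convert milliseconds to TFLOP/s.
    return 2 * M * N * K * 1e-12 / (ms * 1e-3)

M = N = K = 4096  # assumed shape
a = torch.randn((M, K), device="cuda", dtype=torch.float16)
b = torch.randn((K, N), device="cuda", dtype=torch.float16)

# torch.matmul dispatches to cuBLAS for fp16 GEMM on CUDA.
ms = triton.testing.do_bench(lambda: torch.matmul(a, b))
print(f"cuBLAS: {tflops(ms, M, N, K):.6f} TFLOPS")
```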

I am facing a similar problem: when I set num_gpu=2 and add gradient_accumulation_steps=4 (which keeps the effective batch size at 32), the average over 5 random seeds on CoLA with roberta-large...
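For reference, the effective-batch-size arithmetic implied here (only num_gpu=2, gradient_accumulation_steps=4, and the total of 32 are stated; the per-device batch size of 4 is inferred from them):

```python
per_device_batch = 4   # inferred, not stated
num_gpus = 2           # stated
grad_accum_steps = 4   # stated
effective_batch = per_device_batch * num_gpus * grad_accum_steps
assert effective_batch == 32  # matches the "still 32" claim
```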

If you use `ln -s /.cache /root/.cache` since the space under /root/ is limited, you also need to export HF_HOME accordingly. This solved my problem.
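As an illustration, the same workaround from Python; the cache path used here is an assumption, not the original value:

```python
# Point the Hugging Face cache at a disk with free space before any HF
# library is imported, since cache paths are resolved from HF_HOME.
import os
os.environ["HF_HOME"] = "/data/hf_cache"  # assumed path with free space

# Imports after this point pick up the new cache location.
from huggingface_hub import snapshot_download
snapshot_download("roberta-large")  # downloads under /data/hf_cache
```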

I have exactly the same problem: 1 node works, but 2 nodes fail. I think this is a problem on the huggingface side.

I ran into a similar issue. My model gives:

########## First turn ##########
model                         turn  score
zephyr-7b-dpo-full-self-ref   1     7.79375
zephyr-7b-dpo-full-self       1     7.43750
zephyr-7b-sft-full-self-ref   1     6.63125
zephyr-7b-sft-full-self       1     6.39375
########## Second...

Models ending with '-ref' are the official checkpoints from huggingface, and models ending with '-self' are my own models from reproducing the experiment.

I also ran evaluations on some other datasets: the alignment-handbook/zephyr-7b-dpo-full model still performs worse than HuggingFaceH4/zephyr-7b-beta. ![image](https://github.com/huggingface/alignment-handbook/assets/87399272/b030a1d8-e414-4b8e-95c5-e818229b4bfc)

For scripts/sampling/simple_video_sample.py, VS Code also cannot play the output mp4 file. After downgrading to imageio[ffmpeg]==2.26.1, it works fine.
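A minimal repro sketch of writing an mp4 through imageio's ffmpeg backend (not the original sampling script; the frame data is synthetic), useful for checking whether the files produced under a given imageio pin can be opened by a player:

```python
import imageio
import numpy as np

# One second of random 256x256 RGB frames at 24 fps.
frames = [np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
          for _ in range(24)]

# With imageio[ffmpeg]==2.26.1 pinned, the resulting mp4 opened fine
# per the comment above.
with imageio.get_writer("sample.mp4", fps=24) as writer:
    for frame in frames:
        writer.append_data(frame)
```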