Cannot reproduce the results shown in the paper?

Open VoyageWang opened this issue 1 year ago • 1 comment

Describe the issue

Hi there, nice work! When I tried to reproduce the STIC results, I did not see any improvement from the STIC-stage1 preference optimization. The training settings are the same as yours. I tried two versions of LLaVA 1.6 (vicuna-7b and mistral-7b) on the Science QA test set. Here are the results:

llava-mistral-7b on Science-QA test without STIC-stage1

llava-mistral-7b on Science-QA test with STIC-stage1 (use your provided weight of lora)

Here we did get results, but they are not consistent with the paper (approximately 60).

I also tried STIC-stage1 with llava-vicuna-7b and saw no improvement. We did not change the training data or settings.

llava-vicuna-7b (original)

llava-vicuna-7b after STIC-stage1 (trained on 4× 48 GB L20 GPUs in our environment)

I am also sharing the training loss log and LoRA settings here. They look normal.

What should I do to get an improvement from preference optimization? It would really help me. Thank you.

Jun 14 '24 07:06 VoyageWang

Hi, have you solved this?

Nov 25 '24 12:11 ZihaoZheng98