Cannot reproduce the results shown in the paper?
Describe the issue
Hi there. Nice work! When I tried to reproduce the STIC results, I did not see the improvement from STIC-stage1 preference optimization. The training settings are the same as yours. I tried two versions of LLaVA 1.6 (vicuna-7b and mistral-7b) on the Science QA test set. Here are the results:
llava-mistral-7b on Science-QA test without STIC-stage1
llava-mistral-7b on Science-QA test with STIC-stage1 (using the LoRA weights you provided)
Here we do get results, but they are not consistent with the paper (approximately 60).
I also tried STIC-stage1 on llava-vicuna-7b. Here I saw no improvement. We did not change the training data or settings.
llava-vicuna-7b (original)
llava-vicuna-7b after STIC-stage1 (trained on 4× 48 GB L20 GPUs in our environment)
I am also sharing the training loss log and LoRA settings here. They look normal.
What should I do to get an improvement from preference optimization? Any advice would really help. Thank you.
Hi, have you solved this?



