MEMIT for LLaVA
Hello, can MEMIT be applied to editing multimodal models, for example editing the LLaVA model? Do you have any plans to support this?
Hi there,
MEMIT can be applied to the editing of multimodal models. However, in our experience, the editing performance is subpar. MEMIT needs a last_subject_token, taken from a (subject, relation, object) triple, to locate the edit, but VQA and caption data (caption data especially) do not contain such triples. As a workaround, we used the last_token instead, but this approach was unsuccessful.
If you have any suggestions or ideas on how to improve this, we welcome PRs to EasyEdit!
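To make the difference between last_subject_token and last_token concrete, here is a minimal sketch of how the edit position could be chosen. It is not EasyEdit's or MEMIT's actual code: the function names `find_sublist` and `edit_token_index` are hypothetical, whitespace splitting stands in for a real tokenizer, and falling back to the last token of the text prompt is an assumption about what the workaround does. Image tokens are ignored here.

```python
from typing import Optional, Sequence


def find_sublist(haystack: Sequence[str], needle: Sequence[str]) -> int:
    """Return the start index of the last occurrence of needle in haystack, or -1."""
    n = len(needle)
    if n == 0:
        return -1
    # Scan from the right so that the last occurrence wins.
    for start in range(len(haystack) - n, -1, -1):
        if list(haystack[start:start + n]) == list(needle):
            return start
    return -1


def edit_token_index(
    prompt_tokens: Sequence[str],
    subject_tokens: Optional[Sequence[str]] = None,
) -> int:
    """Pick the position whose hidden state would be edited (hypothetical helper).

    With a subject, return the index of its last token (last_subject_token).
    Without one, or if it is not found, fall back to the last prompt token
    (last_token), which is an assumed reading of the workaround.
    """
    if subject_tokens:
        start = find_sublist(prompt_tokens, subject_tokens)
        if start >= 0:
            return start + len(subject_tokens) - 1
    return len(prompt_tokens) - 1


# Toy tokenization by whitespace; a real model would use its own tokenizer.
fact_prompt = "The Eiffel Tower is located in".split()
caption_prompt = "Describe the image in one sentence".split()

subject_idx = edit_token_index(fact_prompt, "Eiffel Tower".split())
fallback_idx = edit_token_index(caption_prompt)
print(fact_prompt[subject_idx], caption_prompt[fallback_idx])
```

For the factual prompt the chosen position is the token "Tower", the end of the subject. For the caption prompt there is no subject to locate, so the position falls back to the final token "sentence", which has no particular link to the fact being edited.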
Hi, do you have any further questions?
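@tbozhong By last_token, do you mean the last token of the image? If I want to implement and test the case you describe, how should I modify the current MEMIT so that it can perform multimodal editing? I would like to try this out.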