
[Question]: Failed to reproduce LLMLingua on MeetingBank

Open · jzhang538 opened this issue 1 year ago · 1 comment

Describe the issue

Thanks for the interesting work. I tried to reproduce the results of LLMLingua on the MeetingBank QA dataset with Mistral-7B as the target LLM.

The small LLM I use is https://huggingface.co/NousResearch/Llama-2-7b-hf

However, the results are much lower than those reported in Table 4 of the LLMLingua-2 paper (around 20 vs. 50.45). Here is my implementation:

    from llmlingua import PromptCompressor

    compressor = PromptCompressor(
        model_name=args.model_name,
        model_config={},
        use_llmlingua2=False,
    )

    iterative_size = 200
    comp_dict = compressor.compress_prompt(
        context=origin,
        instruction="",
        question="",
        rate=args.compression_rate,
        iterative_size=iterative_size,
        context_budget="*2.0",
    )
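
The compressed text is then read from the returned dictionary and sent to Mistral-7B together with the question. The snippet below is only a rough sketch of that step; the prompt template and the `question` placeholder are illustrative, not my exact script:

    # compress_prompt returns a dict; the compressed text is under "compressed_prompt".
    compressed_context = comp_dict["compressed_prompt"]

    # Build the QA prompt for the target LLM (template here is illustrative only).
    question = "..."  # the MeetingBank QA question for the current sample
    qa_prompt = f"{compressed_context}\n\nQuestion: {question}\nAnswer:"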

I'm wondering if there is any issue with my implementation?

jzhang538 · Jul 27 '24

Hi, @jzhang538, thank you for raising the question!

I think there are two possible causes. The first is the LLMLingua parameters, such as iterative_size or context_budget. The second is the evaluation. Note that we do not use the instruct version of Mistral-7B in our experiments; the base model may generate lengthy responses and even raise similar questions in its output, which drags the score down. It is therefore necessary to truncate the responses at an appropriate point before evaluation.
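
For example, a simple post-processing step could look like the sketch below. The stop markers are only an illustration, not the exact script we use; pick markers that match your prompt format:

    def truncate_response(response: str) -> str:
        # Cut the generated answer before the model drifts into asking new
        # questions or producing extra paragraphs. Markers below are examples only.
        stop_markers = ["\nQuestion:", "\nQ:", "\n\n"]
        cut = len(response)
        for marker in stop_markers:
            idx = response.find(marker)
            if idx != -1:
                cut = min(cut, idx)
        return response[:cut].strip()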

pzs19 · Jul 30 '24