Open-Assistant Few-shot prefix an LLM to prompt it into an instruction-following mode

One of the baselines presented in the InstructGPT paper is a "properly" prompted GPT-3 model (see footnote 6 and section 3.5). A fixed prefix is prepended to the user's instruction before the combined text is passed to the model.
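As a rough illustration of what prepending a prefix could look like, here is a minimal sketch. The prefix wording, the two worked examples, and the name `build_prompt` are all assumptions made up for illustration; they are not the prefix used in the InstructGPT paper.

```python
# Hypothetical few-shot prefix: the wording and examples are illustrative only.
FEW_SHOT_PREFIX = (
    "Below are instructions, each followed by a helpful response.\n\n"
    "Instruction: Translate 'good morning' into French.\n"
    "Response: Bonjour.\n\n"
    "Instruction: Give one synonym for 'quick'.\n"
    "Response: Fast.\n\n"
)


def build_prompt(instruction: str, prefix: str = FEW_SHOT_PREFIX) -> str:
    """Prepend the prefix and format the user's instruction like the examples."""
    return f"{prefix}Instruction: {instruction.strip()}\nResponse:"


if __name__ == "__main__":
    # The base model would be asked to continue this text after "Response:".
    print(build_prompt("Summarize the plot of Hamlet in one sentence."))
```

Passing `prefix=""` gives the unprompted baseline in the same format, which keeps the two conditions comparable.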

Requirements:

A base model (eventually the same model we are actually going to use).
A trained reward model.

After some fine-tuning of the prefixes, compare the base model and the prompted model based on the reward each attains from the reward model.
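The comparison could be sketched roughly as below. `generate` and `reward` are hypothetical stand-ins for the base model and the trained reward model, and scoring the completion against the bare instruction (so the prefix itself is not rewarded) is an assumption of this sketch, not something the issue specifies.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical interfaces: generate(prompt) -> completion text,
# reward(instruction, completion) -> scalar score.
GenerateFn = Callable[[str], str]
RewardFn = Callable[[str, str], float]


def mean_reward(prefix: str, instructions: List[str],
                generate: GenerateFn, reward: RewardFn) -> float:
    """Average reward over the instructions when `prefix` is prepended."""
    scores = []
    for instruction in instructions:
        completion = generate(prefix + instruction)
        # Assumption: score against the bare instruction, not the prefixed prompt.
        scores.append(reward(instruction, completion))
    return mean(scores)


def compare_prefixes(prefixes: Dict[str, str], instructions: List[str],
                     generate: GenerateFn, reward: RewardFn) -> Dict[str, float]:
    """Return the mean reward for each named prefix ("" is the plain base model)."""
    return {name: mean_reward(prefix, instructions, generate, reward)
            for name, prefix in prefixes.items()}
```

A call such as `compare_prefixes({"base": "", "prompted": some_prefix}, instructions, generate, reward)` would then give one number per condition, and more candidate prefixes can be added to the dictionary.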

Jan 03 '23 21:01 sanagno

Hi, I am an ML student at Copenhagen University, and this would be a good opportunity for me to try out some of the theory I have learned, particularly upcoming and contemporary research I have encountered during my time at university. BH Abubakar

Jan 21 '23 17:01 Abubakar115e

Feel free to join the Discord and contact me from there :)

Jan 21 '23 19:01 sanagno

Sadly, my GPU is not powerful enough, but this Python script runs and should work as a backbone for the two requirements: model-vs-reward.txt

Feb 08 '23 20:02 Abubakar115e

Thanks for the effort @Abubakar115e. The idea of this issue is to experiment with different prefixes. If you cannot run these models it would be difficult to proceed. Perhaps you can try using some quantization tricks. You should be able to run decent models for inference even on a regular GPU.
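As a back-of-the-envelope check on what quantization buys, the memory needed for the weights alone is roughly the parameter count times the bytes per parameter. The sketch below assumes a hypothetical 6-billion-parameter model and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Bits used to store one parameter at each precision.
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(n_params: float, precision: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BITS_PER_PARAM[precision] / 8 / 1e9


if __name__ == "__main__":
    # Hypothetical model size, chosen only for illustration.
    for precision in BITS_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(6e9, precision):.1f} GB")
```

Under these assumptions, the weights of such a model would take about 12 GB in fp16 and about 6 GB in int8.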

Feb 11 '23 12:02 sanagno

Yes, I know that you can, using both the GPU (RTX 2070 Max-Q) and RAM, but I only have one laptop, and this work could take anywhere from a few hours to several days. I have corrected the code, and it should now experiment with different prefixes. I ran it for 10 hours and it had still not finished on my system. model-vs-reward.txt

Feb 11 '23 21:02 Abubakar115e

Hi, I should be able to help with an AWS EC2 G5 instance with 24GB RAM. Is that enough? @sanagno How could I find you on Discord?

Feb 27 '23 01:02 wangrui6

That's me: Sotiris#3996 :)

Feb 27 '23 08:02 sanagno