Using BigScience BLOOM 176B or BLOOMZ 176B instead of GPT-J 6B
Would it be possible to take this software and substitute BigScience's BLOOM 176B or BLOOMZ 176B for the current GPT-J 6B model as a simple drop-in replacement in the code? If so, would running such a fine-tune be expected to take a correspondingly large amount of time and/or GPU resources? Thanks.
Unofficial comment: generally yes, but the real premise here is that you can achieve something near the state-of-the-art performance of models of that size with a much smaller model. Using a 176B-parameter model rather defeats that purpose.
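For illustration, if the repo loads its base model through Hugging Face transformers (an assumption; the actual loading path in this codebase may differ), the substitution itself could be as small as changing the model identifier. The resource cost is the real obstacle: BLOOM's full-precision weights are roughly 350 GB, so just loading the model requires sharding across many GPUs, and fine-tuning requires far more memory still. A minimal sketch:

```python
# Minimal sketch, assuming the repo loads its base model via Hugging Face
# transformers. The actual loading code here may differ; the model names
# below are the public Hub identifiers.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Current base model (~6B parameters):
# model_name = "EleutherAI/gpt-j-6B"

# Hypothetical drop-in substitution (~176B parameters). Full-precision
# weights are ~350 GB, so expect to shard across many GPUs.
model_name = "bigscience/bloom"  # or "bigscience/bloomz"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # shard across available GPUs (requires `accelerate`)
    torch_dtype="auto",  # load in the checkpoint's native dtype (bfloat16)
)
```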
Is the difference between this repo and Alpaca that the model used is GPT-J instead of LLaMA?
Yes, at the moment that is the main substantive difference.