baseline score
A reference point for the baseline model's scores would help my team and me as we develop our approach and compare its performance. Could you please provide the baseline model's scores on the validation dataset, or direct us to where we can find them?
Please see here
Thank you
Hi,
I've been looking at the language scores reported in your results and noticed that the baseline's language score is much lower than GPT's. Additionally, the GPT scores for the sampled data and the test data are almost identical, which leads to very similar final scores.
Could you shed some light on the following?
Why is there such a big difference in language scores between the baseline and GPT?
How can the GPT scores be so close for the sampled and test data?
It seems a bit odd, and I'm trying to make sense of it. Any clarification would be appreciated.
Thanks!