LongBench
What are the exact scores behind the task-level radar graph?

Could you please share the exact scores (before normalization) for the radar graph in the leaderboard? This would help people compare task-level performance across these models.
Thanks