Fanrong Li

4 issues by Fanrong Li

Waive twoshot to fix acc issue

- Add gpqa accuracy test script
- Add gpqa accuracy tests
- Update DeepSeek-v3 doc
- Update qa test list

## Summary by CodeRabbit

* **Tests**
  * Re-enabled previously skipped DeepSeekV32 test cases.
* **Chores**
  * Updated accuracy reference values for DeepSeek-V3.2-Exp model.

## Summary by CodeRabbit

* **Bug Fixes**
  * Fixed configuration handling for PyTorch backend to improve compatibility and correct behavior during evaluation.