Fanrong Li
Waive twoshot to fix accuracy issue
- Add gpqa accuracy test script
- Add gpqa accuracy tests
- Update DeepSeek-v3 doc
- Update qa test list
## Summary by CodeRabbit

* **Tests**
  * Re-enabled previously skipped DeepSeekV32 test cases.
* **Chores**
  * Updated accuracy reference values for DeepSeek-V3.2-Exp model.
## Summary by CodeRabbit

* **Bug Fixes**
  * Fixed configuration handling for PyTorch backend to improve compatibility and correct behavior during evaluation.