Results: 1 repository owned by SORRY-Bench

sorry-bench

Stars: 70
Forks: 6
Watchers: 70

Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025)