safety-evaluation topic

Repositories tagged safety-evaluation

sorry-bench

Stars: 70, Forks: 6, Watchers: 70

Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025)
