displayName
JailbreakBench
homepageUrl
https://jailbreakbench.github.io/
kind
model-only
targetsKind
ModelVersion
description
JailbreakBench (Chao et al., 2024) is an open-source benchmark and
leaderboard tracking adversarial jailbreak attacks against aligned
LLMs, with a curated set of 100 misuse behaviors and
standardized scoring of attack success rate.