Red Team Language Model Arena

Adversary-grade AI, tested in the arena. Compare models head-to-head on real offensive security tasks. RTLM is purpose-built for red teamers. See how it stacks up.

Multi-Model Streaming

Query RTLM, GPT-4o, Claude, Gemini, and Grok simultaneously. All responses stream in real time, side by side.

Community Ranked

Vote on the best responses. The leaderboard reflects real-world performance on red team tasks, not synthetic benchmarks.

Built for Practitioners

RTLM is purpose-built for offensive security. The model improves as the community contributes.