NAAMSE

Neural Adversarial Agent Mutation-based Security Evaluator

AI agents are increasingly deployed in production, yet security evaluations remain stuck in the past, relying on manual red-teaming and static benchmarks that cannot model adaptive, real-world adversaries. NAAMSE closes that gap by treating agent security as a feedback-driven optimization problem. Rather than running a fixed set of attacks, it evolves them, using each generation's results to compound pressure on your agent's weak points and surface jailbreaks, prompt injections, and PII leakage that one-shot methods routinely miss. Every run produces a comprehensive report with vulnerability analysis, attack effectiveness metrics, and cluster-based categorization of discovered exploits, giving your team a clear and actionable picture of where your agent breaks before it ever reaches production.

Get started

🥈

2nd Place in Agent Safety Track

Out of 1,500+ submissions at Berkeley RDI's AgentX - AgentBeats

NAAMSE demonstrated cutting-edge capabilities in agent security evaluation, earning recognition as one of the top solutions among thousands of competing teams.

Read Our Paper on arXiv View Competition Details View Certificate

Leaderboard

Comprehensive evaluation of AI models showing adversarial and benign scores across different test configurations

Hover over rows to see seed values

Adversarial Score (Lower = More Secure)

Benign Score (Lower = More Usable)

Average Score (Lower = Better Overall)

Rank	Model	Average Score	Adversarial Score	Benign Score