Zeming Wei

2 papers · Latest: May 12, 2026

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

SkillSafetyBench evaluates how reusable skills in LLM agents create new attack surfaces, revealing vulnerabilities beyond model-level alignment.

2605.12015May 12, 2026

Cryptography & Security

The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems

This paper introduces Salami Slicing, a novel multi-turn jailbreak attack that exploits cumulative low-risk inputs to bypass LLM safety, achieving high success rates.

2604.11309Apr 13, 2026

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.