Zhenxing Niu

3 papers · Latest: May 11, 2026

Re-Triggering Safeguards within LLMs for Jailbreak Detection

This paper introduces an embedding disruption method to re-trigger LLM safeguards, effectively detecting and defending against jailbreak attacks.

2605.10611 · May 11, 2026

Cryptography & Security

Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing

DR-Smoothing offers a guaranteed defense against LLM jailbreaking attacks by disrupting and rectifying prompts, balancing safety and helpfulness.

2605.10582 · May 11, 2026

Cryptography & Security

A Systematic Security Evaluation of OpenClaw and Its Variants

This paper systematically evaluates OpenClaw-series AI agents, revealing substantial security vulnerabilities that go beyond those of the underlying models and emphasizing the need for lifecycle-wide governance.

2604.03131 · Apr 3, 2026
