Zhen Xiang

2 papers · Latest: May 7, 2026

Crafting Reversible SFT Behaviors in Large Language Models

This paper introduces LCDD to create sparse, controllable "carriers" for SFT behaviors in LLMs, enabling their selective reversal with SFT-Eraser.

Green Shielding proposes a user-centric approach to building trustworthy AI by characterizing how benign input variations affect LLM behavior.
