Ying Zhang
6 papers · Latest:
Beyond "I cannot fulfill this request": Alleviating Rigid Rejection in LLMs via Label Enhancement
LANCE introduces a label enhancement method using variational inference to enable LLMs to provide safe yet flexible and natural responses, avoiding rigid rejections.
Generating Proof-of-Vulnerability Tests to Help Enhance the Security of Complex Software
PoVSmith automates generating proof-of-vulnerability tests for software supply chain attacks using LLMs, significantly improving test quality and reducing manual effort.
Seek-and-Solve: Benchmarking MLLMs for Visual Clue-Driven Reasoning in Daily Scenarios
DailyClue is a new benchmark for MLLMs that evaluates their ability to perform visual clue-driven reasoning in complex, real-world daily scenarios.
Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems
A new supply-chain attack, DDIPE, poisons LLM coding agent skills by hiding malicious logic in documentation examples, bypassing strong defenses.
Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study
A large-scale study finds widespread credential leakage in LLM agent skills, identifying 520 vulnerable skills and 10 leakage patterns, primarily via debug logs.
The Llama 3 Herd of Models
Llama 3 is a new family of large multilingual foundation models excelling in language, coding, reasoning, and multimodal tasks, rivaling GPT-4 in quality and offering extensive public releases.