ArXiv TLDR

Ying Zhang

6 papers · Latest:

Natural Language Processing

Beyond "I cannot fulfill this request": Alleviating Rigid Rejection in LLMs via Label Enhancement

LANCE introduces a label enhancement method using variational inference to enable LLMs to provide safe yet flexible and natural responses, avoiding rigid rejections.

2605.07883
Cryptography & Security

Generating Proof-of-Vulnerability Tests to Help Enhance the Security of Complex Software

PoVSmith automates generating proof-of-vulnerability tests for software supply chain attacks using LLMs, significantly improving test quality and reducing manual effort.

2605.03956
Computer Vision

Seek-and-Solve: Benchmarking MLLMs for Visual Clue-Driven Reasoning in Daily Scenarios

DailyClue is a new benchmark for MLLMs that evaluates their ability to perform visual clue-driven reasoning in complex, real-world daily scenarios.

2604.14041
Cryptography & Security

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

A new supply-chain attack, DDIPE, poisons LLM coding agent skills by hiding malicious logic in documentation examples, bypassing strong defenses.

2604.03081
Cryptography & Security

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

Study reveals widespread credential leakage in LLM agent skills, identifying 520 vulnerable skills and 10 leakage patterns, primarily via debug logs.

2604.03070
Artificial Intelligence

The Llama 3 Herd of Models

Llama 3 is a new family of large multilingual foundation models excelling in language, coding, reasoning, and multimodal tasks, rivaling GPT-4 in quality and offering extensive public releases.

2407.21783
