Zhenyu Chen

3 papers · Latest: May 7, 2026

Breaking, Stale, or Missing? Benchmarking Coding Agents on Project-Level Test Evolution

TEBench is the first project-level benchmark for evaluating coding agents on test evolution, revealing limitations in handling stale and missing tests.

2605.06125 · May 7, 2026

Cryptography & Security

Train in Vain: Functionality-Preserving Poisoning to Prevent Unauthorized Use of Code Datasets

FunPoison introduces a functionality-preserving poisoning method to prevent unauthorized use of code datasets for training CodeLLMs, maintaining compilability.

2604.22291 · Apr 24, 2026

Software Engineering

Log-based, Business-aware REST API Testing

LoBREST is a log-based, business-aware REST API testing technique that uses historical request logs to test complex functionalities.

2604.08007 · Apr 9, 2026
