Yiran Zhang
3 papers · Latest:
Software Engineering
RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices
RealBench is a new benchmark for repo-level code generation, using structured designs (UML) to better align LLM evaluation with real-world software development.
2604.22659
Software Engineering
Bridging the Gap between User Intent and LLM: A Requirement Alignment Approach for Code Generation
REA-Coder improves LLM code generation by iteratively aligning user requirements, addressing the common issue of LLMs misunderstanding prompts.
2604.16198
Cryptography & Security
RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience
RLSpoofer is a lightweight, black-box, RL-based attack that exposes the fragility of LLM watermarking with minimal data, achieving a high spoofing success rate.
2604.11546