David Lo
6 papers · Latest:
Tail-aware N-version Machine Learning Models for Reliable API Recommendation
NvRec uses N-version ML models to improve the reliability of API recommendations, especially for infrequently used "tail" APIs, by filtering unreliable outputs.
TitanCA: Lessons from Orchestrating LLM Agents to Discover 100+ CVEs
TitanCA orchestrates LLM agents to discover 203 zero-day vulnerabilities and 118 CVEs, significantly improving software security.
Can LLMs Deobfuscate Binary Code? A Systematic Analysis of Large Language Models into Pseudocode Deobfuscation
LLMs can deobfuscate binary code, but performance relies on reasoning and task-specific fine-tuning, not just model size.
Evaluating LLM-Based 0-to-1 Software Generation in End-to-End CLI Tool Scenarios
This paper introduces CLI-Tool-Bench, a new benchmark for evaluating LLM-based 0-to-1 software generation, and shows that current models struggle with end-to-end CLI tool creation.
AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits
AgentSZZ is an LLM agent framework that significantly improves bug-inducing commit identification, especially for complex cases like cross-file and ghost commits.
TestDecision: Sequential Test Suite Generation via Greedy Optimization and Reinforcement Learning
TestDecision uses greedy optimization and RL to enable open-source LLMs to generate high-quality, sequential test suites, boosting coverage and bug detection.