Shengyuan Liu
2 papers · Latest:
Software Engineering
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Claw-Eval-Live is a live benchmark for LLM agents, evaluating their performance on evolving real-world workflows with verifiable execution.
2604.28139
Computer Vision
NeuroClaw Technical Report
NeuroClaw is a multi-agent AI system designed to make neuroimaging research more executable and reproducible by handling diverse data and complex pipelines.
2604.24696