Wei-Lin Chiang

2 papers · Latest: April 20, 2026

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

ClawEnvKit automates diverse environment generation for claw-like agents from natural language, enabling scalable evaluation and adaptive training.

2604.18543 · Apr 20, 2026

Natural Language Processing

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

This paper demonstrates that strong large language models like GPT-4 can effectively serve as judges to evaluate other LLM-based chat assistants, closely matching human preferences on open-ended tasks.

2306.05685 · Jun 9, 2023
