TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
Zerun Ma, Guoqiang Wang, Xinchen Xie, Yicheng Chen, He Du + 5 more
TLDR
TREX is a multi-agent system that automates the entire LLM fine-tuning lifecycle, using tree-based exploration to plan training strategies efficiently.
Key contributions
- TREX automates the entire LLM fine-tuning lifecycle using a multi-agent system.
- Models experiments as a search tree for efficient exploration and insight generation.
- Performs tasks from data research and strategy formulation to model training and evaluation.
- Introduces FT-Bench, a 10-task benchmark for evaluating automated LLM training.
Why it matters
Automating LLM training remains a major challenge. TREX offers a multi-agent system that covers the entire fine-tuning lifecycle, from data research to evaluation, which could make LLM development more efficient and more accessible.
Original Abstract
While Large Language Models (LLMs) have empowered AI research agents to perform isolated scientific tasks, automating complex, real-world workflows, such as LLM training, remains a significant challenge. In this paper, we introduce TREX, a multi-agent system that automates the entire LLM training life-cycle. By orchestrating collaboration between two core modules, the Researcher and the Executor, the system seamlessly performs requirement analysis, open-domain literature and data research, formulation of training strategies, preparation of data recipes, and model training and evaluation. The multi-round experimental process is modeled as a search tree, enabling the system to efficiently plan exploration paths, reuse historical results, and distill high-level insights from iterative trials. To evaluate the capability of automated LLM training, we construct FT-Bench, a benchmark comprising 10 tasks derived from real-world scenarios, ranging from optimizing fundamental model capabilities to enhancing performance on domain-specific tasks. Experimental results demonstrate that the TREX agent consistently optimizes model performance on target tasks.
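To make the search-tree idea concrete, here is a minimal Python sketch of how multi-round experiments could be stored as a tree with cached results. It is an illustration only, not the TREX implementation: the names `ExperimentNode`, `run_cached`, `best_node`, and `fake_evaluate` are hypothetical, the scoring function is a stand-in for real training and evaluation, and the abstract does not specify how TREX actually selects or expands nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ExperimentNode:
    """One fine-tuning trial in the tree (hypothetical structure)."""
    recipe: str                                   # short description of the trial
    score: Optional[float] = None                 # None until the trial has run
    parent: Optional["ExperimentNode"] = field(
        default=None, repr=False, compare=False)
    children: List["ExperimentNode"] = field(
        default_factory=list, repr=False, compare=False)

    def add_child(self, recipe: str) -> "ExperimentNode":
        """Branch a new trial off this one."""
        child = ExperimentNode(recipe=recipe, parent=self)
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        """Recipes from the root down to this node."""
        node, recipes = self, []
        while node is not None:
            recipes.append(node.recipe)
            node = node.parent
        return list(reversed(recipes))


def run_cached(node: ExperimentNode,
               evaluate: Callable[[str], float],
               cache: Dict[str, float]) -> float:
    """Score a node, reusing an earlier result for an identical recipe."""
    if node.recipe not in cache:
        cache[node.recipe] = evaluate(node.recipe)
    node.score = cache[node.recipe]
    return node.score


def best_node(root: ExperimentNode) -> Optional[ExperimentNode]:
    """Return the highest-scoring evaluated node (depth-first traversal)."""
    best, stack = None, [root]
    while stack:
        node = stack.pop()
        if node.score is not None and (best is None or node.score > best.score):
            best = node
        stack.extend(node.children)
    return best


# Demo with a stand-in evaluator; a real system would train and evaluate a model.
calls: List[str] = []

def fake_evaluate(recipe: str) -> float:
    calls.append(recipe)
    return float(len(recipe))   # arbitrary placeholder score

cache: Dict[str, float] = {}
root = ExperimentNode("baseline")
run_cached(root, fake_evaluate, cache)
variant = root.add_child("baseline + math data")
repeat = root.add_child("baseline")          # same recipe as the root
run_cached(variant, fake_evaluate, cache)
run_cached(repeat, fake_evaluate, cache)     # served from the cache
```

In this sketch the repeated recipe never triggers a second evaluation, which is one simple way "reusing historical results" could work. Keeping parent links lets a planner recover the full path of decisions that led to the best trial.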