ArXiv TLDR

Black-Box Optimization of Mixed Binary-Continuous Variables: Challenges and Opportunities in Evolutionary Model Merging

arXiv: 2605.12326

Md. Robiul Islam Niloy

cs.NE

TLDR

This paper surveys evolutionary model merging and formally characterizes data flow space merging as a black-box optimization problem over mixed binary-continuous variables.

Key contributions

  • Surveys evolutionary model merging, categorizing techniques into parameter-space, data flow space, and hybrid approaches.
  • Formally defines data flow space (DFS) merging as a black-box optimization problem with mixed binary-continuous variables.
  • Highlights challenges in DFS merging: high-dimensional search spaces and conditional dependencies between variables.
  • Demonstrates that a structured approach to DFS merging improves accuracy by 6.7% and reduces the effective search space by 51.4%.
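The conditional dependency the paper highlights can be illustrated with a minimal sketch (not the paper's actual code; genome layout, layer count, and function names are assumptions for illustration): each candidate layer gets a binary inclusion gate, and its continuous scaling weight only matters when the gate is on, so a structured encoding only needs to search the weights of active layers.

```python
import random

NUM_LAYERS = 8  # assumed number of candidate layers, for illustration only

def random_genome(num_layers=NUM_LAYERS):
    """Unstructured genome: sample every continuous weight regardless of
    whether its binary gate is on (what a naive mixed encoding does)."""
    gates = [random.randint(0, 1) for _ in range(num_layers)]
    weights = [random.uniform(0.0, 1.0) for _ in range(num_layers)]
    return gates, weights

def effective_genome(gates, weights):
    """Structured view: keep only (layer index, weight) pairs whose gate
    is on. Weights behind closed gates never affect the merged model, so
    they can be dropped from the search, shrinking its effective size."""
    return [(i, w) for i, (g, w) in enumerate(zip(gates, weights)) if g == 1]

gates, weights = random_genome()
active = effective_genome(gates, weights)
# The number of free continuous variables equals the number of open gates.
assert len(active) == sum(gates)
```

Since each gate is closed roughly half the time, on the order of half the continuous dimensions drop out of the search, which is consistent in spirit with the ~51% search-space reduction the paper reports for its structured approach.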

Why it matters

This paper clarifies the complex optimization challenges in data flow space model merging, a cost-effective alternative to training large language models. By formally characterizing the problem and demonstrating a more effective structured approach, it bridges communities and opens new research avenues for more efficient LLM development.

Original Abstract

Model merging has emerged as a cost-effective alternative to training large language models (LLMs) from scratch, enabling researchers to combine pre-trained models into more capable systems without full retraining. Evolutionary approaches to model merging have shown particular promise, automatically searching for optimal merging configurations across both parameter space (PS) and data flow space (DFS). However, the optimization challenges underlying these approaches -- particularly in DFS merging -- remain poorly understood and formally underspecified in the literature. This paper makes two contributions. First, we provide a structured survey of evolutionary model merging techniques, organizing them into three categories: parameter-space merging, data flow space merging, and hybrid approaches. Second, we formally characterize the DFS merging problem as a black-box optimization problem involving mixed binary-continuous variables, high-dimensional search spaces, and conditional dependencies between variable types -- challenges that standard optimization methods such as CMA-ES are not designed to handle. We provide preliminary empirical validation using real pre-trained language models, demonstrating that a structured approach respecting the binary-continuous conditional dependency outperforms an unstructured approach by 6.7% accuracy while reducing the effective search space by 51.4%. By connecting the model merging community with the broader evolutionary computation and black-box optimization literature, we identify concrete open problems and propose research directions to address them.
