On the Generalization Bounds of Symbolic Regression with Genetic Programming
Masahiro Nomura, Ryoki Hamano, Isao Ono
TLDR
This paper derives a generalization bound for symbolic regression with genetic programming, showing how structure-selection and constant-fitting complexities contribute to the generalization gap.
Key contributions
- Derives a generalization bound for GP-based symbolic regression under constraints on tree size, depth, and learnable constants.
- Decomposes the generalization gap into a structure-selection term and a constant-fitting term (an illustrative sketch of this shape follows the list).
- Links practical GP techniques (e.g., parsimony pressure) to explicit complexity reductions.
- Offers a principled explanation for commonly observed empirical behaviors in GP-based SR.
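As a rough, hypothetical illustration of the two-term shape (my notation, not the paper's exact statement): let $\mathcal{T}_{s,d}$ denote the set of expression-tree structures with at most $s$ nodes and depth at most $d$, let $k$ be the number of learnable constants in a fixed structure, and let $n$ be the sample size. A covering-number-style bound with this decomposition might read

$$
\mathbb{E}[\ell(f)] - \hat{\mathbb{E}}_n[\ell(f)] \;\lesssim\; \underbrace{\sqrt{\frac{\log\lvert\mathcal{T}_{s,d}\rvert}{n}}}_{\text{structure selection}} \;+\; \underbrace{\sqrt{\frac{k\,\log(L\,n)}{n}}}_{\text{constant fitting}},
$$

where $L$ bounds the sensitivity of predictions to perturbations of the constants. Shrinking $s$ and $d$ (parsimony pressure, depth limits) shrinks the first term, while stability mechanisms (numerically stable operators, interval arithmetic) keep $L$, and hence the second term, controlled.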
Why it matters
Understanding why GP-based SR generalizes is crucial for its broader application. This paper provides a much-needed theoretical foundation, explaining how common practices improve generalization. It helps practitioners make more informed design choices and develop more robust SR systems.
Original Abstract
Symbolic regression (SR) with genetic programming (GP) aims to discover interpretable mathematical expressions directly from data. Despite its strong empirical success, the theoretical understanding of why GP-based SR generalizes beyond the training data remains limited. In this work, we provide a learning-theoretic analysis of SR models represented as expression trees. We derive a generalization bound for GP-style SR under constraints on tree size, depth, and learnable constants. Our result decomposes the generalization gap into two interpretable components: a structure-selection term, reflecting the combinatorial complexity of choosing an expression-tree structure, and a constant-fitting term, capturing the complexity of optimizing numerical constants within a fixed structure. This decomposition provides a theoretical perspective on several widely used practices in GP, including parsimony pressure, depth limits, numerically stable operators, and interval arithmetic. In particular, our analysis shows how structural restrictions reduce hypothesis-class growth while stability mechanisms control the sensitivity of predictions to parameter perturbations. By linking these practical design choices to explicit complexity terms in the generalization bound, our work offers a principled explanation for commonly observed empirical behaviors in GP-based SR and contributes towards a more rigorous understanding of its generalization properties.
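To make the mapping from GP practices to complexity terms concrete, below is a minimal sketch, assuming a hypothetical expression-tree interface (`tree.size()`, `tree.depth()`, `tree.predict()`), of how parsimony pressure, a depth limit, and a numerically stable operator typically enter a GP-based SR fitness function. This illustrates the general technique, not the paper's implementation:

```python
import numpy as np

MAX_DEPTH = 8          # depth limit: restricts the structure class (structure-selection term)
PARSIMONY_COEF = 1e-3  # parsimony pressure: penalizes tree size (structure-selection term)

def protected_div(a, b, eps=1e-9):
    """Numerically stable division: returns 1.0 wherever |b| is tiny.

    Stable operators limit how sharply predictions react to small
    perturbations of fitted constants (the constant-fitting term).
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    safe_b = np.where(np.abs(b) > eps, b, 1.0)   # avoid the unstable division itself
    return np.where(np.abs(b) > eps, a / safe_b, 1.0)

def fitness(tree, X, y):
    """Size-penalized training loss; lower is better (hypothetical interface)."""
    if tree.depth() > MAX_DEPTH:
        return np.inf                        # hard-reject structures beyond the depth limit
    preds = tree.predict(X)
    if not np.all(np.isfinite(preds)):       # crude stability guard (cf. interval arithmetic)
        return np.inf
    mse = float(np.mean((preds - y) ** 2))
    return mse + PARSIMONY_COEF * tree.size()  # complexity-penalized objective
```

Each guard lines up with a term in the bound: the depth check and size penalty restrict the structure class, while `protected_div` and the finiteness check keep prediction sensitivity in check.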