Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity
Anastasis Kratsios, Gregory Cousins, Haitz Sáez de Ocáriz Borde, Bum Jun Kim, Simone Brugiapaglia
TLDR
Feedforward neural networks definable in o-minimal structures, including MLPs, CNNs, and transformers, possess finite PAC sample complexity.
Key contributions
- Shows that a broad class of feedforward NNs (those definable in an o-minimal structure) has finite PAC sample complexity; see the definition sketched after this list.
- Covers MLPs, CNNs, GNNs, and transformers with common layers and operations.
- Attributes learnability to 'tame feedforward computation' rather than to particular activations or architecture-specific VC arguments.
- Suggests PAC learnability is a baseline, shifting focus to inductive biases and optimization.
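For context, the property established by the paper can be phrased in the standard agnostic PAC formalism. The sketch below is the usual textbook formulation (as in Shalev-Shwartz and Ben-David), not a statement quoted from the paper; the symbols $\mathcal{H}$, $A$, $m_{\mathcal{H}}$, and $\ell$ are the conventional choices, not the authors' notation.

```latex
% Finite sample complexity in the agnostic PAC model (standard textbook
% formulation, not quoted from the paper): a hypothesis class H is
% agnostically PAC learnable if there exist a function
% m_H : (0,1)^2 -> N and a learner A such that, for every distribution D
% over X x Y and every (eps, delta), an i.i.d. sample S ~ D^m with
% m >= m_H(eps, delta) satisfies
\[
  \Pr_{S \sim D^{m}}\!\left[\, L_{D}\bigl(A(S)\bigr)
    \;\le\; \inf_{h \in \mathcal{H}} L_{D}(h) + \varepsilon \,\right]
  \;\ge\; 1 - \delta ,
\]
% where L_D(h) = E_{(x,y) ~ D}[ \ell(h(x), y) ] is the population risk.
% "Finite sample complexity" means m_H(eps, delta) < infinity for all
% eps, delta in (0,1); the paper's claim is that this holds for every
% fixed o-minimally definable feedforward architecture.
```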
Why it matters
This paper demonstrates that many modern feedforward neural networks inherently possess finite sample complexity, a fundamental property for reliable learning. It repositions PAC learnability as a baseline rather than a differentiator, shifting attention toward architectural inductive biases, symmetries and geometric priors, and optimization behaviour instead of learnability itself.
Original Abstract
We show that, in a precise sense, a broad class of feedforward neural networks learn (have finite sample complexity) in the PAC model: every fixed finite feedforward architecture whose layers are definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting, even with unbounded parameters. This covers standard fixed-size MLPs, CNNs, GNNs, and transformers with fixed sequence length, together with the operations and layers typically used in such architectures, including linear projections, residual connections, attention mechanisms, pooling layers, normalization layers, and admissible positional encodings. Hence, distribution-free learnability for modern non-recurrent architectures is not an exceptional property of particular activations or architecture-specific VC arguments, but a consequence of tame feedforward computation. Our results reposition finite-sample PAC learnability as a baseline rather than a differentiator: they shift the focus of architectural comparison toward inductive biases, symmetries and geometric priors, scalability, and optimization behaviour.
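For readers new to the model-theoretic hypothesis in the abstract, the following sketch records the standard definition of o-minimality together with two classical examples; these are textbook facts from model theory, not statements taken from the paper itself.

```latex
% O-minimality (standard definition): an expansion \mathcal{R} of the
% ordered real field (\mathbb{R}, <, +, \cdot) is o-minimal if
\[
  A \subseteq \mathbb{R} \text{ definable in } \mathcal{R}
  \;\Longrightarrow\;
  A \text{ is a finite union of points and open intervals}.
\]
% Classical examples: the real ordered field itself is o-minimal, so every
% semialgebraic map (e.g. ReLU, max-pooling) is definable; and Wilkie's
% theorem gives o-minimality of (\mathbb{R}, <, +, \cdot, \exp), which
% covers tanh and the logistic sigmoid. This is why the result applies
% uniformly across common activations rather than one at a time.
```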