Prism: Symbolic Superoptimization of Tensor Programs
Mengdi Wu, Xiaoyu Jiang, Oded Padon, Zhihao Jia
TLDR
Prism is the first symbolic superoptimizer for tensor programs, achieving up to 2.2x speedup over prior superoptimizers and up to 3.4x faster end-to-end optimization on LLM workloads.
Key contributions
- Introduces Prism, the first symbolic superoptimizer for tensor programs, using a novel sGraph representation.
- Employs a two-level search strategy with symbolic reasoning for efficient pruning of suboptimal regions.
- Achieves up to 2.2x speedup over the best existing superoptimizers and 4.9x over the best compiler-based approaches on LLM workloads.
- Reduces end-to-end optimization time by up to 3.4x, combining rigor with scalability.
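To make the symbolic-reasoning idea concrete, here is a hypothetical, simplified illustration (not Prism's actual code) of the kind of algebraic identity over operator semantics that such a system can exploit: two matrix multiplications sharing an operand can be rewritten into a single one, since matmul(A, B) + matmul(A, C) == matmul(A, B + C).

```python
# Toy sketch of an algebraic-identity rewrite a tensor superoptimizer
# might apply; all names and values here are illustrative assumptions.
def matmul(x, y):
    n, m, p = len(x), len(y), len(y[0])
    return [[sum(x[i][k] * y[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 0.0], [0.0, 1.0]]

original = add(matmul(A, B), matmul(A, C))   # two matmuls + one add
rewritten = matmul(A, add(B, C))             # one add + one matmul
assert original == rewritten                  # the rewrite is semantics-preserving
```

Proving that such rewrites preserve semantics for whole classes of programs, rather than checking one concrete input as above, is where Prism's e-graph-based equivalence verification comes in.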
Why it matters
This paper introduces Prism, a symbolic superoptimizer that substantially improves tensor program optimization. By bridging rigorous, exhaustive-style search with the scalability that modern ML workloads like LLMs demand, it delivers faster programs while also shortening optimization time, an important step toward faster and more efficient AI systems.
Original Abstract
This paper presents Prism, the first symbolic superoptimizer for tensor programs. The key idea is sGraph, a symbolic, hierarchical representation that compactly encodes large classes of tensor programs by symbolically representing some execution parameters. Prism organizes optimization as a two-level search: it constructs symbolic graphs that represent families of programs, and then instantiates them into concrete implementations. This formulation enables structured pruning of provably suboptimal regions of the search space using symbolic reasoning over operator semantics, algebraic identities, and hardware constraints. We develop techniques for efficient symbolic graph generation, equivalence verification via e-graph rewriting, and parameter instantiation through auto-tuning. Together, these components allow Prism to bridge the rigor of exhaustive search with the scalability required for modern ML workloads. Evaluation on five commonly used LLM workloads shows that Prism achieves up to $2.2\times$ speedup over best superoptimizers and $4.9\times$ over best compiler-based approaches, while reducing end-to-end optimization time by up to $3.4\times$.
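The two-level search the abstract describes can be sketched in miniature. The following is a hypothetical toy (not the paper's implementation, and the cost model is invented for illustration): level one fixes a symbolic program family, here a tiled matrix multiply whose tile size is left symbolic, and level two instantiates concrete tile sizes and auto-tunes by scoring each candidate.

```python
# Hypothetical two-level search sketch: a symbolic tiled-matmul family,
# instantiated with concrete tile sizes and ranked by a toy cost model.
def tiled_matmul(a, b, n, tile):
    """One concrete instantiation: n x n matmul with a given tile size."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for k in range(kk, min(kk + tile, n)):
                    aik = a[i][k]
                    for j in range(n):
                        c[i][j] += aik * b[k][j]
    return c

def toy_cost(n, tile):
    """Invented stand-in cost model: penalize tiles that overflow a
    pretend cache, and tiles too small to amortize loop overhead."""
    cache_lines = 64
    footprint = 2 * tile * tile   # working set of an (a, b) tile pair
    overhead = n // tile          # more tiles -> more loop overhead
    return overhead + max(0, footprint - cache_lines) * 10

def autotune(n, candidates):
    """Level-two search: pick the concrete tile size with the lowest cost."""
    return min(candidates, key=lambda t: toy_cost(n, t))

n = 8
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i * j) for j in range(n)] for i in range(n)]
best = autotune(n, [1, 2, 4, 8])
c = tiled_matmul(a, b, n, best)

# Every instantiation in the family must agree with the untiled reference.
ref = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
       for i in range(n)]
assert c == ref
```

Real auto-tuners measure candidates on hardware rather than using a closed-form cost; the point here is only the structure: a symbolic family (the tile size), a pruned candidate set, and a concrete winner.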