InCoder-32B-Thinking: Industrial Code World Model for Thinking

April 3, 2026 | arXiv:2604.03144

Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Tuney Zheng + 20 more

cs.AR, cs.AI, cs.CL

TLDR

InCoder-32B-Thinking generates expert reasoning traces for industrial code by combining error-driven chain-of-thought with a hardware-aware world model.

Key contributions

Introduces InCoder-32B-Thinking, a model for generating expert reasoning traces in industrial code development.
Uses Error-driven Chain-of-Thought (ECoT) to synthesize thinking content from multi-turn dialogue with error feedback.
Employs an Industrial Code World Model (ICWM) trained on execution traces to predict hardware behavior and self-verify.
Achieves top-tier open-source performance on 14 general and 9 industrial coding benchmarks, including 81.3% on LiveCodeBench v5, 84.0% on CAD-Coder, and 38.0% on KernelBench.

Why it matters

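Industrial software development for hardware (chip design, GPU optimization, embedded systems) lacks expert reasoning traces that show how engineers handle hardware constraints and timing semantics. This paper addresses the gap with a model whose training traces are synthesized from error feedback and validated through domain toolchains. The authors report top-tier open-source results on both general and industrial coding benchmarks, which suggests the approach is a promising direction for AI-assisted hardware design.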

Original Abstract

Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason about hardware constraints and timing semantics. In this work, we propose InCoder-32B-Thinking, trained on the data from the Error-driven Chain-of-Thought (ECoT) synthesis framework with an industrial code world model (ICWM) to generate reasoning traces. Specifically, ECoT generates reasoning chains by synthesizing the thinking content from multi-turn dialogue with environmental error feedback, explicitly modeling the error-correction process. ICWM is trained on domain-specific execution traces from Verilog simulation, GPU profiling, etc., learns the causal dynamics of how code affects hardware behavior, and enables self-verification by predicting execution outcomes before actual compilation. All synthesized reasoning traces are validated through domain toolchains, creating training data matching the natural reasoning depth distribution of industrial tasks. Evaluation on 14 general (81.3% on LiveCodeBench v5) and 9 industrial benchmarks (84.0% in CAD-Coder and 38.0% on KernelBench) shows InCoder-32B-Thinking achieves top-tier open-source results across all domains.
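
The error-feedback loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (Turn, Trace, error_driven_trace, propose, run_toolchain, predict_error) is hypothetical, and the model, toolchain, and world model are replaced by toy stand-ins. It only shows the general shape of a multi-turn loop that records each failed attempt with its error message, optionally consults a predictor before real execution, and keeps the trace only if the toolchain finally accepts a candidate.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Turn:
    """One dialogue round: a candidate and the feedback it received."""
    candidate: str
    error: Optional[str]  # None means the toolchain accepted the candidate


@dataclass
class Trace:
    """A recorded multi-turn attempt (hypothetical container)."""
    task: str
    turns: List[Turn] = field(default_factory=list)

    @property
    def validated(self) -> bool:
        # A trace counts as validated only if its last turn passed the toolchain.
        return bool(self.turns) and self.turns[-1].error is None


def error_driven_trace(
    task: str,
    propose: Callable[[str, Optional[str]], str],
    run_toolchain: Callable[[str], Optional[str]],
    predict_error: Optional[Callable[[str], Optional[str]]] = None,
    max_turns: int = 4,
) -> Trace:
    """Retry until the toolchain accepts a candidate or max_turns is reached."""
    trace = Trace(task)
    feedback: Optional[str] = None
    for _ in range(max_turns):
        candidate = propose(task, feedback)
        # Assumed stand-in for a world model: predict the outcome first and
        # skip the real run when a failure is predicted.
        predicted = predict_error(candidate) if predict_error else None
        error = predicted if predicted is not None else run_toolchain(candidate)
        trace.turns.append(Turn(candidate, error))
        if error is None:
            break
        feedback = error
    return trace


def toy_propose(task: str, feedback: Optional[str]) -> str:
    # Stand-in for a model call: adds the missing semicolon once told about it.
    return "assign y = a & b;" if feedback else "assign y = a & b"


def toy_toolchain(candidate: str) -> Optional[str]:
    # Stand-in for a simulator or compiler: returns an error message or None.
    return None if candidate.endswith(";") else "syntax error: expected ';'"


trace = error_driven_trace("AND gate", toy_propose, toy_toolchain)
for turn in trace.turns:
    print(turn.candidate, "->", turn.error or "ok")
```

In this sketch a candidate is accepted only after a real toolchain run returns no error, so a predictor can save failed runs but never replaces validation. That mirrors the abstract's statement that all synthesized traces are validated through domain toolchains.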
