ArXiv TLDR

RIRF: Reasoning Image Restoration Framework

2604.09511

Wending Yan, Rongkai Zhang, Kaihua Tang, Yu Cheng, Qiankun Liu

cs.CV

TLDR

Reason and Restore (R&R) integrates Chain-of-Thought reasoning to diagnose image degradations and guide restoration, achieving state-of-the-art results on universal image restoration benchmarks.

Key contributions

  • Introduces Reason and Restore (R&R), a framework integrating Chain-of-Thought reasoning for universal image restoration.
  • Employs a fine-tuned Qwen3-VL reasoner to diagnose degradation types, severity, and scene semantics.
  • Leverages quantified degradation severity as reinforcement learning signals to guide and strengthen the restorer.
  • Tightly couples semantic diagnostic reasoning with pixel-level restoration for improved interpretability and performance.
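The third contribution — turning the reasoner's quantified severity into an RL signal — can be illustrated with a minimal sketch. The paper does not specify the exact reward; the function names, the [0, 1] severity scale, and the `beta` weighting below are assumptions for illustration only.

```python
def severity_reward(pre: float, post: float) -> float:
    """Hypothetical RL reward: reduction in diagnosed degradation severity.

    Both scores are assumed to come from the reasoner on a [0, 1] scale,
    where 0 means clean; the reward is clipped to be non-negative.
    """
    if not (0.0 <= pre <= 1.0 and 0.0 <= post <= 1.0):
        raise ValueError("severity scores must lie in [0, 1]")
    return max(0.0, pre - post)


def total_objective(pixel_loss: float, pre: float, post: float,
                    beta: float = 0.5) -> float:
    """Illustrative combined objective: pixel reconstruction loss minus a
    weighted severity-reduction reward (lower is better for the restorer)."""
    return pixel_loss - beta * severity_reward(pre, post)
```

For example, if the reasoner rates an input at severity 0.8 and the restored output at 0.2, the reward is 0.6, which would reinforce the restorer's update; a restoration that *worsens* severity yields zero reward rather than a penalty under this clipped formulation.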

Why it matters

This paper explicitly integrates diagnostic reasoning into universal image restoration. By using a fine-tuned vision-language model (Qwen3-VL) to diagnose degradation types and severity before restoring, R&R guides the restoration process more effectively. This yields state-of-the-art performance and offers unique interpretability, advancing robust image recovery across diverse real-world degradations.

Original Abstract

Universal image restoration (UIR) aims to recover clean images from diverse and unknown degradations using a unified model. Existing UIR methods primarily focus on pixel reconstruction and often lack explicit diagnostic reasoning over degradation composition, severity, and scene semantics prior to restoration. We propose Reason and Restore (R&R), a novel framework that integrates structured Chain-of-Thought (CoT) reasoning into the image restoration pipeline. R&R introduces an explicit reasoner, implemented by fine-tuning Qwen3-VL, to diagnose degradation types, quantify degradation severity, infer key degradation-related factors, and describe relevant scene and object semantics. The resulting structured reasoning provides interpretable and fine-grained diagnostic priors for the restorer. To further improve restoration quality, the quantified degradation severity produced by the reasoner is leveraged as reinforcement learning (RL) signals to guide and strengthen the restorer. Unlike existing multimodal LLM-based agentic systems that decouple reasoning from low-level vision tasks, R&R tightly couples semantic diagnostic reasoning with pixel-level restoration in a unified framework. Extensive experiments across diverse UIR benchmarks demonstrate that R&R achieves state-of-the-art performance while offering unique interpretability into the restoration process.
