Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy
Yinghao Chen, Yeying Jin, Xiang Chen, Yanyan Wei, Ziyang Yan, et al.
TLDR
This paper introduces RGSUD, an unsupervised image deraining method that uses a reward-guided self-reinforcement strategy to recycle the high-quality derained outputs that emerge during training as guidance for optimization.
Key contributions
- Introduces RGSUD, an unsupervised deraining method using reward-guided self-reinforcement.
- Proposes an IQA-based dynamic reward recycling mechanism that continuously collects high-quality derained outputs during training (a buffer sketch follows this list).
- Incorporates the collected rewards into the model's optimization via self-reinforcement training, improving alignment between derained outputs and clean images.
- Achieves state-of-the-art performance across diverse paired and unpaired deraining datasets.
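
The snippet below is a minimal PyTorch-style sketch of how an IQA-based dynamic reward recycling buffer could work. The names (`RewardBuffer`, `toy_iqa_score`) are hypothetical, not from the paper's code, and the toy score (negative total variation) merely stands in for whichever no-reference IQA model the method actually uses; it exists only so the sketch runs end to end.

```python
import torch

def toy_iqa_score(img: torch.Tensor) -> float:
    # Stand-in for a real no-reference IQA model (unspecified here);
    # higher means "cleaner". Negative total variation keeps this runnable.
    tv = (img[..., 1:, :] - img[..., :-1, :]).abs().mean() \
       + (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    return -tv.item()

class RewardBuffer:
    """Keeps, per training image, the best-scoring derained output seen so
    far; the stored images serve as dynamically updated rewards."""

    def __init__(self):
        self.best = {}  # image_id -> (iqa_score, derained tensor)

    def update(self, image_id, derained: torch.Tensor) -> None:
        score = toy_iqa_score(derained)
        prev = self.best.get(image_id)
        if prev is None or score > prev[0]:
            # Detach so stored rewards do not keep old autograd graphs alive.
            self.best[image_id] = (score, derained.detach().clone())

    def get(self, image_id):
        entry = self.best.get(image_id)
        return None if entry is None else entry[1]
```

Because the buffer only ever replaces an entry with a higher-scoring one, the collected reward set can only improve as training proceeds, which is what makes the recycled outputs usable as optimization targets.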
Why it matters
Unsupervised deraining struggles with the diversity of real-world rain and the weak constraints of unpaired training. This paper addresses both challenges with a reward-guided self-reinforcement strategy that leverages the model's own high-quality outputs, yielding a robust, adaptable framework that achieves SOTA results and generalizes across other deraining methods.
Original Abstract
Unsupervised deraining has attracted attention for its ability to learn the real-world distribution of rain without paired supervision. However, the lack of strong constraints makes it difficult for the network to converge, especially with the complex diversity of rain degradation. A key motivation is that high-quality deraining results occasionally emerge during training, which can be leveraged to guide the optimization process. To overcome these challenges, we introduce RGSUD (Reward-Guided Self-Reinforcement Unsupervised Image Deraining), comprising two key stages: reward recycling and self-reinforcement (SR) training. In the former stage, we propose an Image Quality Assessment (IQA)-based dynamic reward recycling mechanism that selects optimal derained outputs during training and continuously collects high-quality derained images. In the latter stage, we incorporate these rewards into the model's optimization process, constraining the optimization space and improving alignment between derained outputs and clean images. By leveraging an IQA-based self-reinforced loss and dynamically updated rewards, we enhance the quality of synthesized pseudo-paired data and stabilize the optimization. Extensive experiments demonstrate that our method achieves SOTA performance across multiple datasets, including paired synthetic, paired real, and unpaired real images, outperforming existing unsupervised deraining approaches in both subjective and objective IQA metrics. Additionally, we show that the self-reinforcement strategy is adaptable to other unsupervised deraining methods and that our deraining framework demonstrates strong generalization across existing supervised deraining networks.
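
To make the self-reinforcement stage concrete, here is a minimal sketch of one training step, reusing the hypothetical `RewardBuffer` from the sketch above. The unpaired objective is reduced to a placeholder term so the code runs, and the self-reinforced loss is approximated with a plain L1 pull toward the stored reward; the paper's actual loss is IQA-based.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_step(model, optimizer, rainy, image_id, buffer, lam=1.0):
    """One self-reinforcement step: the usual unsupervised objective plus a
    pull toward the best (highest-IQA) output recycled for this image."""
    derained = model(rainy)

    # Placeholder for the method's unpaired objective (e.g. adversarial or
    # cycle-consistency terms); an identity term keeps the sketch runnable.
    loss = F.l1_loss(derained, rainy)

    reward = buffer.get(image_id)
    if reward is not None:
        # Self-reinforced term: constrain the optimization space by aligning
        # the current output with the recycled high-quality reward.
        loss = loss + lam * F.l1_loss(derained, reward)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Dynamic reward recycling: re-score this output and keep it if it beats
    # the best result collected so far for the same image.
    with torch.no_grad():
        buffer.update(image_id, derained)
    return loss.item()

# Smoke test with a one-layer stand-in "derainer" and a random rainy image.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
buffer = RewardBuffer()  # from the sketch above
rainy = torch.rand(1, 3, 64, 64)
for step in range(3):
    train_step(model, optimizer, rainy, image_id=0, buffer=buffer)
```

Note how the reward term only activates once the buffer holds an entry for the image, so early training is driven purely by the unpaired objective and the self-reinforced constraint tightens as better outputs accumulate.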