Trust-SSL: Additive-Residual Selective Invariance for Robust Aerial Self-Supervised Learning
Wadii Boulila, Adel Ammar, Bilel Benjdira, Maha Driss
TLDR
Trust-SSL makes self-supervised learning on aerial imagery more robust to degraded inputs by adding a per-sample, per-factor trust weight to the alignment objective and combining it with the base contrastive loss as an additive residual.
Key contributions
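- Adds a per-sample, per-factor trust weight to the alignment objective, applied with a stop-gradient and combined with the base contrastive loss as an additive residual rather than as a multiplicative gate.
- Achieves the highest mean linear-probe accuracy among six backbones across EuroSAT, AID, and NWPU-RESISC45: 90.20%, compared to 88.46% for SimCLR and 89.82% for VICReg.
- Gains +19.9 points over SimCLR on EuroSAT under haze at s=5, and +1 to +3 points in Mahalanobis AUROC on a zero-shot cross-domain stress test using BDD100K weather splits.
- Introduces an evidential variant that uses Dempster-Shafer fusion to expose interpretable conflict and ignorance signals.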
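For readers unfamiliar with Dempster-Shafer fusion, the following is a minimal sketch of Dempster's rule of combination on a two-element frame. It is not taken from the paper: how Trust-SSL builds its mass functions is not described in this summary, so the frame labels (`match`, `mismatch`), the function name `dempster_combine`, and the example masses are all hypothetical. The sketch only shows where a "conflict" value (mass assigned to contradictory hypotheses) and an "ignorance" value (mass left on the whole frame) can come from.

```python
from itertools import product
from typing import Dict, FrozenSet, Tuple

Mass = Dict[FrozenSet[str], float]

# Hypothetical frame of discernment; the labels are illustrative only.
THETA = frozenset({"match", "mismatch"})


def dempster_combine(m1: Mass, m2: Mass) -> Tuple[Mass, float]:
    """Fuse two mass functions with Dempster's rule.

    Returns the normalized fused masses and the conflict K, i.e. the
    total product mass whose focal sets do not intersect.
    """
    fused: Mass = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in fused.items()}, conflict


# Two made-up evidence sources that partly disagree.
m_a: Mass = {frozenset({"match"}): 0.6, frozenset({"mismatch"}): 0.1, THETA: 0.3}
m_b: Mass = {frozenset({"match"}): 0.2, frozenset({"mismatch"}): 0.5, THETA: 0.3}

fused, conflict = dempster_combine(m_a, m_b)
ignorance = fused[THETA]  # mass still committed to "either hypothesis"
```

With these made-up inputs the conflict is about 0.32, and the ignorance is the fused mass remaining on the full frame. Both are scalar quantities that can be inspected per sample, which is the sense in which such signals are interpretable.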
Why it matters
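Existing SSL methods enforce invariance between augmented views, which breaks down when haze, motion blur, rain, or occlusion removes the evidence that the views are supposed to share. Trust-SSL addresses this with a trust-weighted additive residual and reports its largest gains under severe information-erasing corruptions. The result is a concrete design principle for uncertainty-aware SSL, relevant wherever aerial image analysis must cope with degraded inputs.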
Original Abstract
Self-supervised learning (SSL) is a standard approach for representation learning in aerial imagery. Existing methods enforce invariance between augmented views, which works well when augmentations preserve semantic content. However, aerial images are frequently degraded by haze, motion blur, rain, and occlusion that remove critical evidence. Enforcing alignment between a clean and a severely degraded view can introduce spurious structure into the latent space. This study proposes a training strategy and architectural modification to enhance SSL robustness to such corruptions. It introduces a per-sample, per-factor trust weight into the alignment objective, combined with the base contrastive loss as an additive residual. A stop-gradient is applied to the trust weight instead of a multiplicative gate. While a multiplicative gate is a natural choice, experiments show it impairs the backbone, whereas our additive-residual approach improves it. Using a 200-epoch protocol on a 210,000-image corpus, the method achieves the highest mean linear-probe accuracy among six backbones on EuroSAT, AID, and NWPU-RESISC45 (90.20% compared to 88.46% for SimCLR and 89.82% for VICReg). It yields the largest improvements under severe information-erasing corruptions on EuroSAT (+19.9 points on haze at s=5 over SimCLR). The method also demonstrates consistent gains of +1 to +3 points in Mahalanobis AUROC on a zero-shot cross-domain stress test using BDD100K weather splits. Two ablations (scalar uncertainty and cosine gate) indicate the additive-residual formulation is the primary source of these improvements. An evidential variant using Dempster-Shafer fusion introduces interpretable signals of conflict and ignorance. These findings offer a concrete design principle for uncertainty-aware SSL. Code is publicly available at https://github.com/WadiiBoulila/trust-ssl.
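The contrast the abstract draws between an additive residual and a multiplicative gate can be illustrated with a small numeric sketch. This is an assumption-laden toy, not the authors' implementation: the exact loss is not given here, so the residual form `base + lam * sum_k(trust * align)`, the choice to average trust over factors for the gate, and every name (`additive_residual_loss`, `multiplicative_gate_loss`, `lam`, `align`, `trust`) are hypothetical. Plain Python has no autodiff, so the stop-gradient is only indicated in a comment.

```python
from typing import List


def additive_residual_loss(
    base: List[float],
    align: List[List[float]],
    trust: List[List[float]],
    lam: float = 1.0,
) -> float:
    """Mean over samples of: base loss + lam * trust-weighted alignment.

    base[i]     : per-sample contrastive loss (assumed given)
    align[i][k] : alignment penalty for sample i, factor k
    trust[i][k] : trust weight in [0, 1]; in an autodiff framework it
                  would be wrapped in a stop-gradient (for example
                  tensor.detach() in PyTorch) so it acts as a constant.
    """
    total = 0.0
    for b, a_row, w_row in zip(base, align, trust):
        residual = sum(w * a for w, a in zip(w_row, a_row))
        total += b + lam * residual  # base term is never rescaled
    return total / len(base)


def multiplicative_gate_loss(base: List[float], trust: List[List[float]]) -> float:
    """Mean over samples of: (mean trust over factors) * base loss."""
    total = 0.0
    for b, w_row in zip(base, trust):
        gate = sum(w_row) / len(w_row)
        total += gate * b  # base term is scaled down when trust is low
    return total / len(base)


# Toy batch: sample 0 is fully trusted, sample 1 has one untrusted factor.
base = [1.0, 2.0]
align = [[0.5, 0.5], [1.0, 3.0]]
trust = [[1.0, 1.0], [1.0, 0.0]]

additive = additive_residual_loss(base, align, trust, lam=0.5)  # 2.0
gated = multiplicative_gate_loss(base, trust)                   # 1.0
```

In this toy setup the additive form keeps each sample's base contrastive term at full strength and only drops the untrusted factor from the extra alignment term, while the gated form halves the base term for the partially untrusted sample. Whether this is the mechanism behind the reported ablation results is not something the sketch can establish.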