TL-RL-FusionNet: An Adaptive and Efficient Reinforcement Learning-Driven Transfer Learning Framework for Detecting Evolving Ransomware Threats

April 22, 2026

arXiv:2604.20260

Jannatul Ferdous, Rafiqul Islam, Arash Mahboubi, Md Zahidul Islam

cs.CR

TLDR

TL-RL-FusionNet uses RL-guided transfer learning to adaptively detect evolving ransomware threats with high accuracy and efficiency.

Key contributions

Proposes TL-RL-FusionNet, a hybrid framework combining frozen TL backbones with an RL-guided MLP classifier.
An RL agent adaptively reweights training samples, prioritizing complex ransomware and down-weighting trivial cases.
Achieves 99.1% accuracy and 99.6% recall, outperforming non-RL baselines by up to 3.1% in recall.
Demonstrates high efficiency with 55% lower training time and 59% reduced RAM usage compared to baselines.
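
The sample-reweighting idea in the contributions above can be illustrated with a small tabular Q-learning sketch (the abstract names Q-learning as the agent type). This summary does not specify the paper's state, action, or reward design, so everything here is an assumption made for illustration: the three difficulty buckets, the `ACTIONS` weight multipliers, the `toy_reward` function, and the hyperparameters are all hypothetical, and in a real pipeline the reward would presumably come from a validation signal rather than a hand-written rule.

```python
import random

# Hypothetical action set: multipliers applied to a sample's loss weight.
ACTIONS = (0.5, 1.0, 2.0)


def difficulty_bucket(loss, edges=(0.1, 0.7)):
    """Map a per-sample loss to a coarse state: 0 = easy, 1 = medium, 2 = hard."""
    return sum(loss > edge for edge in edges)


class ReweightAgent:
    """Tabular Q-learning over (difficulty state, weight multiplier) pairs."""

    def __init__(self, n_states=3, alpha=0.05, gamma=0.5, epsilon=0.3, seed=0):
        self.q = [[0.0] * len(ACTIONS) for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy: explore occasionally, otherwise take the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

    def greedy_weight(self, state):
        # Weight multiplier the agent currently prefers for this state.
        row = self.q[state]
        return ACTIONS[row.index(max(row))]


def toy_reward(state, action):
    """Stand-in reward (an assumption, not the paper's): favor up-weighting
    hard samples, down-weighting easy ones, and leaving medium ones alone."""
    weight = ACTIONS[action]
    if state == 2:
        return weight - 1.0
    if state == 0:
        return 1.0 - weight
    return -abs(weight - 1.0)


def train_toy_agent(steps=6000, seed=0):
    agent = ReweightAgent(seed=seed)
    state = agent.rng.randrange(3)
    for _ in range(steps):
        action = agent.act(state)
        next_state = agent.rng.randrange(3)  # difficulty of the next sample seen
        agent.update(state, action, toy_reward(state, action), next_state)
        state = next_state
    return agent


if __name__ == "__main__":
    agent = train_toy_agent()
    for state, name in enumerate(("easy", "medium", "hard")):
        print(name, "->", agent.greedy_weight(state))
```

In a training loop, the multiplier returned by `greedy_weight(difficulty_bucket(loss))` would scale each sample's contribution to the classifier loss. The toy reward only demonstrates that the update rule learns the intended preference; it says nothing about how the paper's agent behaves.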

Why it matters

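Polymorphic ransomware changes its execution patterns often enough to degrade static or predefined detection models. This paper pairs frozen transfer-learning feature extractors with a reinforcement learning agent that reweights training samples, so that training effort concentrates on stealthy or obfuscated cases. The reported accuracy and resource savings, measured on a balanced 1,000-sample dataset, suggest the approach could be practical for real-world deployment.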

Original Abstract

Modern ransomware exhibits polymorphic and evasive behaviors by frequently modifying execution patterns to evade detection. This dynamic nature disrupts feature spaces and limits the effectiveness of static or predefined models. To address this challenge, we propose TL-RL-FusionNet, a reinforcement learning (RL)-guided hybrid framework that integrates frozen dual transfer learning (TL) backbones as feature extractors with a lightweight residual multilayer perceptron (MLP) classifier. The RL agent supervises training by adaptively reweighting samples in response to variations in observable ransomware behavior. Through reward and penalty signals, the agent prioritizes complex cases such as stealthy or polymorphic ransomware employing obfuscation, while down-weighting trivial samples including benign applications with simple file I/O operations or easily classified ransomware. This adaptive mechanism enables the model to dynamically refine its strategy, improving resilience against evolving threats while maintaining strong classification performance. The framework utilizes dynamic behavioral features such as file system activity, registry changes, network traffic, API calls, and anti-analysis checks, extracted from sandbox-generated JSON reports. These features are transformed into RGB images and processed using frozen EfficientNetB0 and InceptionV3 models to capture rich feature representations efficiently. Final classification is performed by a lightweight residual MLP guided by an RL (Q-learning) agent. Experiments on a balanced dataset of 1,000 samples (500 ransomware, 500 benign) show that TL-RL-FusionNet achieves 99.1% accuracy, 98.6% precision, 99.6% recall, and 99.74% AUC, outperforming non-RL baselines by up to 2.5% in accuracy and 3.1% in recall. Efficiency analysis shows 55% lower training time and 59% reduced RAM usage, demonstrating suitability for real-world deployment.
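
The abstract says behavioral features are transformed into RGB images before being passed to the frozen backbones, but it does not describe the encoding. The following is a minimal sketch of one plausible scheme, assuming the features have already been reduced to a flat list of numbers (for example, event counts taken from a sandbox report). The function name `features_to_rgb`, the min-max scaling, and the three-values-per-pixel packing are hypothetical choices, not the authors' method.

```python
import math


def features_to_rgb(values):
    """Pack a flat list of numeric features into a square grid of (R, G, B) pixels.

    Assumed scheme: min-max scale to 0..255, use three consecutive values per
    pixel, and zero-pad so the pixels fill a square image.
    """
    if not values:
        raise ValueError("need at least one feature value")

    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    scaled = [round(255 * (v - lo) / span) for v in values]

    n_pixels = math.ceil(len(scaled) / 3)
    side = math.ceil(math.sqrt(n_pixels))
    scaled += [0] * (side * side * 3 - len(scaled))  # zero-pad to a full square

    pixels = [tuple(scaled[i:i + 3]) for i in range(0, len(scaled), 3)]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    # Four illustrative feature values (made up for this example).
    image = features_to_rgb([0, 4, 10, 2])
    for row in image:
        print(row)
```

A real pipeline would also need to resize the grid to each backbone's expected input size and keep the scaling statistics fixed between training and inference; both steps are omitted here.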
