Trident: Improving Malware Detection with LLMs and Behavioral Features
Rebecca Saul, Jingzhi Jiang, Elliott Chia, David Wagner
TLDR
Trident uses LLMs and behavioral features to improve malware detection, making it more robust to concept drift than traditional static methods.
Key contributions
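- Reasoning-capable LLMs can efficiently process semi-structured sandbox behavior reports for use in a malware detection pipeline.
- LLMs generate behavior-based detection rules from a small training set of labeled malware; these rules are more robust to concept drift than static-feature methods while keeping false positive rates practical.
- Trident combines a decision tree over static features, the LLM-derived behavior rules, and direct LLM analysis of sandbox reports via majority voting.
- Trident outperforms both static-feature methods and behavior-based rules alone, and matches the concept drift resilience of active learning methods without retraining.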
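The majority-voting step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the data shapes (`StaticFeatures`, `SandboxReport`), the `classify` function, and the three `toy_*` detectors are hypothetical stand-ins, since the summary does not specify how each component is built or what a sandbox report looks like.

```python
from typing import Callable, Dict, List

# Hypothetical data shapes; the summary does not specify real formats.
StaticFeatures = Dict[str, float]
SandboxReport = Dict[str, List[str]]


def majority_vote(votes: List[bool]) -> bool:
    """Return True (malicious) when strictly more than half the votes are True."""
    return sum(votes) * 2 > len(votes)


def classify(
    static_features: StaticFeatures,
    report: SandboxReport,
    static_model: Callable[[StaticFeatures], bool],
    behavior_rules: Callable[[SandboxReport], bool],
    llm_analysis: Callable[[SandboxReport], bool],
) -> bool:
    """Combine three independent verdicts on one sample by majority vote."""
    votes = [
        static_model(static_features),  # decision tree over static features
        behavior_rules(report),         # LLM-generated behavior rules
        llm_analysis(report),           # direct LLM verdict on the report
    ]
    return majority_vote(votes)


# Toy stand-ins (assumptions for illustration only, not real detectors).
def toy_static_model(features: StaticFeatures) -> bool:
    return features.get("section_entropy", 0.0) > 7.0


def toy_behavior_rules(report: SandboxReport) -> bool:
    return "CreateRemoteThread" in report.get("api_calls", [])


def toy_llm_analysis(report: SandboxReport) -> bool:
    return False  # placeholder for a call to an LLM


if __name__ == "__main__":
    sample_report = {"api_calls": ["CreateRemoteThread"]}
    verdict = classify(
        {"section_entropy": 7.5},
        sample_report,
        toy_static_model,
        toy_behavior_rules,
        toy_llm_analysis,
    )
    print("malicious" if verdict else "benign")
```

With three voters, a sample is flagged only when at least two components agree, so a single component that has drifted can be outvoted by the other two.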
Why it matters
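Sandbox behavior reports are semi-structured, which has made dynamic analysis features hard to feed into machine learning detectors that traditionally rely on static features. This paper uses LLMs to interpret those reports, both directly and by generating detection rules from them. The resulting behavior-based detection is more robust to evolving threats (concept drift) than static-feature methods, and the combined system stays resilient without retraining.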
Original Abstract
Traditionally, machine learning methods for PE malware detection have relied on static features like byte histograms, string information, and PE header contents. One barrier to incorporating dynamic analysis features has been the semi-structured nature of sandbox behavior reports. We show that, using the latest generation of large language models with reasoning, it is possible to efficiently process these behavior reports and utilize them as part of a malware detection pipeline. Specifically, we leverage LLMs to generate behavior-based malware detection rules based on a small training set of labeled malware. We find that these detection rules, derived from behavioral features, are much more robust to concept drift than standard static-feature methods, while maintaining practical false positive rates. Finally, we introduce Trident, a system which combines a classic decision tree model over static features, our behavior-based detection rules, and direct LLM analysis of sandbox reports through majority voting. Trident outperforms standard methods using static features, outperforms behavior-based rules alone, and is as resilient to concept drift as active learning methods without requiring retraining.