ArXiv TLDR

DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation

2604.09089

Li Huang, Zhongxin Liu, Yifan Wu, Tao Yin, Dong Li + 4 more

cs.SE · cs.AI · cs.CR

TLDR

DeepGuard enhances secure code generation by aggregating semantic cues from multiple transformer layers, improving the secure-and-correct generation rate by an average of 11.9% while preserving functional correctness.

Key contributions

  • Diagnoses that vulnerability signals are strongest in intermediate-to-upper LLM layers, not just the final layer.
  • Proposes DeepGuard, which aggregates multi-layer representations via attention for a dedicated security analyzer.
  • Uses a multi-objective training approach to balance security and functional correctness.
  • Improves the secure-and-correct generation rate by an average of 11.9% across five code LLMs, and generalizes to held-out vulnerability types.
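The attention-based aggregation in the second bullet can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: `aggregate_layers` and the learned `query` vector are hypothetical names, and a single scaled dot-product attention pool stands in for whatever module DeepGuard actually uses.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_layers(hidden_states, query):
    """Attention-pool per-layer representations into a single vector.

    hidden_states: (num_layers, d) -- one pooled hidden state per upper layer
    query: (d,) -- a learned attention query (hypothetical parameter)
    Returns the aggregated (d,) vector and the (num_layers,) attention weights.
    """
    d = hidden_states.shape[1]
    scores = hidden_states @ query / np.sqrt(d)   # one score per layer
    weights = softmax(scores)                     # sums to 1 across layers
    return weights @ hidden_states, weights

# Toy usage: 8 upper layers, 16-dim pooled states
rng = np.random.default_rng(0)
H = rng.normal(size=(8, 16))
q = rng.normal(size=16)
agg, w = aggregate_layers(H, q)
```

The aggregated vector would then feed the dedicated security analyzer; in the paper the query (or equivalent parameters) is trained jointly under the multi-objective loss balancing security and correctness.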

Why it matters

LLMs frequently generate insecure code, and existing security-hardening methods that supervise only the final transformer layer can miss vulnerability cues distributed across earlier layers. DeepGuard addresses this by aggregating multi-layer semantic signals, boosting code security without sacrificing functional correctness. This matters for deploying safer, more reliable AI-generated code.

Original Abstract

Large Language Models (LLMs) for code generation can replicate insecure patterns from their training data. To mitigate this, a common strategy for security hardening is to fine-tune models using supervision derived from the final transformer layer. However, this design may suffer from a final-layer bottleneck: vulnerability-discriminative cues can be distributed across layers and become less detectable near the output representations optimized for next-token prediction. To diagnose this issue, we perform layer-wise linear probing. We observe that vulnerability-related signals are most detectable in a band of intermediate-to-upper layers yet attenuate toward the final layers. Motivated by this observation, we introduce DeepGuard, a framework that leverages distributed security-relevant cues by aggregating representations from multiple upper layers via an attention-based module. The aggregated signal powers a dedicated security analyzer within a multi-objective training objective that balances security enhancement and functional correctness, and further supports a lightweight inference-time steering strategy. Extensive experiments across five code LLMs demonstrate that DeepGuard improves the secure-and-correct generation rate by an average of 11.9% over strong baselines such as SVEN. It also preserves functional correctness while exhibiting generalization to held-out vulnerability types. Our code is public at https://github.com/unknownhl/DeepGuard.
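The layer-wise linear probing diagnosis described in the abstract can be illustrated with a toy experiment. Everything below is synthetic and hypothetical: `probe_accuracy` is a made-up helper, a ridge-regression probe stands in for whatever probe the authors use, and the per-layer "signal strength" is simulated rather than taken from real hidden states.

```python
import numpy as np

def probe_accuracy(X, y, l2=1e-2):
    """Fit a ridge-regression linear probe and report its training accuracy.

    X: (n, d) hidden states from one layer; y: (n,) labels in {0, 1}
    (e.g. vulnerable vs. safe code). Higher accuracy means the layer's
    representations are more linearly separable for vulnerability detection.
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    w = np.linalg.solve(A, Xb.T @ (2 * y - 1))  # targets in {-1, +1}
    return float(((Xb @ w > 0) == (y == 1)).mean())

# Synthetic demo: vary how strongly a "vulnerability direction" is encoded,
# standing in for how detectable the signal is at different layers.
rng = np.random.default_rng(1)
n, d, num_layers = 200, 32, 6
y = rng.integers(0, 2, n)
direction = rng.normal(size=d)
accs = []
for strength in np.linspace(0.2, 2.0, num_layers):
    X = rng.normal(size=(n, d)) + strength * np.outer(2 * y - 1, direction)
    accs.append(probe_accuracy(X, y))
```

In the paper's actual setup, `X` would come from real per-layer hidden states of a code LLM; the observation is that probe accuracy peaks in intermediate-to-upper layers and drops toward the final layer, motivating the multi-layer aggregation.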
