Conversations Risk Detection LLMs in Financial Agents via Multi-Stage Generative Rollout

2604.09056

Xiaotong Jiang, Jun Wu

cs.CR, cs.CE

TLDR

FinSec is a four-tier framework that significantly improves financial LLM dialogue security detection by handling complex multi-turn risks.

Key contributions

  • FinSec is a four-tier framework for structured, interpretable, end-to-end financial risk detection in LLM dialogues.
  • It incorporates suspicious behavior analysis, delayed risk inference, semantic security, and integrated decision-making.
  • Achieves an F1 score of 90.13%, improving on baseline models by 6-14 percentage points in overall detection.
  • Reduces ASR to 9.09%, lowering the probability of unsafe outputs, while maintaining model utility with a composite score of 0.9098.
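
For readers unfamiliar with the headline metrics, the following is a minimal sketch of how an F1 score and an ASR could be computed from raw counts. It assumes ASR means the fraction of adversarial dialogues that still yield an unsafe output; the summary does not define the term, and the paper's exact evaluation protocol may differ. The function names and all counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the 'risky' class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def attack_success_rate(unsafe_outputs: int, attack_attempts: int) -> float:
    """Assumed definition: share of attack attempts that produced an unsafe output."""
    if attack_attempts == 0:
        return 0.0
    return unsafe_outputs / attack_attempts


# Hypothetical counts, for illustration only (not the paper's data).
example_f1 = f1_score(tp=90, fp=10, fn=10)
example_asr = attack_success_rate(unsafe_outputs=5, attack_attempts=50)
print(f"F1 = {example_f1:.4f}, ASR = {example_asr:.4f}")
```

Higher F1 indicates better detection of risky dialogues, while lower ASR indicates fewer unsafe outputs slipping through, which is why the two are reported together.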

Why it matters

This paper addresses the critical need for robust security detection in financial LLM agents, where existing methods fall short. FinSec provides a specialized, multi-stage framework that significantly enhances safety and compliance without sacrificing utility, crucial for high-stakes financial applications.

Original Abstract

With the rapid adoption of large language models (LLMs) in financial service scenarios, dialogue security detection under high regulatory risk presents significant challenges. Existing methods mainly rely on single-dimensional semantic judgments or fixed rules, making them inadequate for handling multi-turn semantic evolution and complex regulatory clauses; moreover, they lack models specifically designed for financial security detection. To address these issues, this paper proposes FinSec, a four-tier security detection framework for financial agent. FinSec enables structured, interpretable, and end-to-end identification of actual financial risks, incorporating suspicious behavior pattern analysis, delayed risk and adversarial inference, semantic security analysis, and integrated risk-based decision-making. Notably, FinSec significantly enhances the robustness of high-risk dialogue detection while maintaining model utility. Experimental results demonstrate FinSec's leading performance. In terms of overall detection capability, FinSec achieves an F1 score of 90.13%, improving upon baseline models by 6-14 percentage points; its ASR is reduced to 9.09%, markedly lowering the probability of unsafe outputs; and the AUPRC increases to 0.9189, an approximate 9.7% gain over general frameworks. Additionally, in balancing utility and safety, FinSec obtains a composite score of 0.9098, delivering robust and efficient protection for financial agent dialogues.
