ComplianceNLP: Knowledge-Graph-Augmented RAG for Multi-Framework Regulatory Gap Detection

April 26, 2026 · arXiv:2604.23585

cs.CL, cs.IR, cs.LG

TLDR

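ComplianceNLP is a knowledge-graph-augmented RAG system for automated regulatory gap detection. It reaches 87.7 F1 on the authors' benchmark, 3.5 F1 above GPT-4o+RAG, and sustained a 3.1× analyst efficiency gain in a four-month parallel-run deployment.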

Key contributions

Integrates a KG-augmented RAG pipeline grounded in a regulatory knowledge graph (SEC, MiFID II, Basel III).
Performs multi-task obligation extraction using NER, deontic classification, and cross-reference resolution.
Conducts compliance gap analysis by mapping obligations to internal policies with severity-aware scoring.
Achieves 87.7 F1 on gap detection on the authors' benchmark, outperforming GPT-4o+RAG by 3.5 F1.
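
To make the third contribution concrete, here is a minimal sketch of what severity-aware gap scoring could look like. It is an illustration under stated assumptions, not the paper's method: the `Obligation` fields, the token-overlap `coverage` measure, and the formula `severity * (1 - best coverage)` are all hypothetical stand-ins, since the abstract does not specify how obligations are matched to policies or how severity enters the score.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Obligation:
    text: str
    deontic: str     # hypothetical label, e.g. "obligation" or "prohibition"
    severity: float  # hypothetical weight in [0, 1]


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for a semantic encoder."""
    return {w.strip(".,;:()").lower() for w in text.split()} - {""}


def coverage(obligation: Obligation, policy: str) -> float:
    """Fraction of the obligation's tokens that also appear in the policy."""
    ob = tokens(obligation.text)
    return len(ob & tokens(policy)) / len(ob) if ob else 0.0


def gap_score(obligation: Obligation, policies: list[str]) -> float:
    """Severity-weighted gap: highest when no policy covers a severe obligation."""
    best = max((coverage(obligation, p) for p in policies), default=0.0)
    return obligation.severity * (1.0 - best)


if __name__ == "__main__":
    ob = Obligation("Firms must report trades daily", "obligation", 0.8)
    print(gap_score(ob, ["Firms must report trades daily."]))  # fully covered
    print(gap_score(ob, []))                                   # no policy at all
```

With this scoring, a fully covered obligation scores zero regardless of severity, while an uncovered one scores exactly its severity weight, so analysts would see the most severe uncovered obligations first.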

Why it matters

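Financial institutions must track over 60,000 regulatory events annually, more than manual compliance teams can absorb, and the industry has paid over USD 300 billion in fines and settlements since the 2008 financial crisis. ComplianceNLP automates regulation monitoring, obligation extraction, and gap detection against institutional policies. In a four-month parallel-run deployment it sustained a 3.1× analyst efficiency gain at 96.0% estimated recall and 90.7% precision.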

Original Abstract

Financial institutions must track over 60,000 regulatory events annually, overwhelming manual compliance teams; the industry has paid over USD 300 billion in fines and settlements since the 2008 financial crisis. We present ComplianceNLP, an end-to-end system that automatically monitors regulatory changes, extracts structured obligations, and identifies compliance gaps against institutional policies. The system integrates three components: (1) a knowledge-graph-augmented RAG pipeline grounding generations in a regulatory knowledge graph of 12,847 provisions across SEC, MiFID II, and Basel III; (2) multi-task obligation extraction combining NER, deontic classification, and cross-reference resolution over a shared LEGAL-BERT encoder; and (3) compliance gap analysis that maps obligations to internal policies with severity-aware scoring. On our benchmark, ComplianceNLP achieves 87.7 F1 on gap detection, outperforming GPT-4o+RAG by +3.5 F1, with 94.2% grounding accuracy ($r=0.83$ vs. human judgments) and 83.4 F1 under realistic end-to-end error propagation. Ablations show that knowledge-graph re-ranking contributes the largest marginal gain (+4.6 F1), confirming that structural regulatory knowledge is critical for cross-reference-heavy tasks. Domain-specific knowledge distillation (70B $\to$ 8B) combined with Medusa speculative decoding yields $2.8\times$ inference speedup; regulatory text's low entropy ($H=2.31$ bits vs. $3.87$ general text) produces 91.3% draft-token acceptance rates. In four months of parallel-run deployment processing 9,847 updates at a financial institution, the system achieved 96.0% estimated recall and 90.7% precision, with a $3.1\times$ sustained analyst efficiency gain. We report deployment lessons on trust calibration, GRC integration, and distributional shift monitoring for regulated-domain NLP.
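
The abstract credits knowledge-graph re-ranking with the largest ablation gain but does not describe the mechanism. The sketch below shows one plausible form, assuming retrieved provisions carry a dense similarity score and the graph stores cross-reference edges. The function names, the provision IDs, the mixing weight `alpha`, and the one-hop bonus of 0.5 are all hypothetical choices for illustration, not values from the paper.

```python
from __future__ import annotations


def neighbors(cross_refs: dict[str, set[str]], node: str) -> set[str]:
    """Provisions linked to `node` by a cross-reference in either direction."""
    outgoing = cross_refs.get(node, set())
    incoming = {src for src, dsts in cross_refs.items() if node in dsts}
    return outgoing | incoming


def kg_rerank(
    dense_scores: dict[str, float],
    cross_refs: dict[str, set[str]],
    seeds: set[str],
    alpha: float = 0.3,
) -> list[tuple[str, float]]:
    """Blend dense retrieval scores with a graph-proximity bonus.

    `seeds` are provisions explicitly cited by the query (hypothetical input).
    A seed gets bonus 1.0, a one-hop neighbor of a seed gets 0.5, others 0.
    """
    one_hop: set[str] = set()
    for seed in seeds:
        one_hop |= neighbors(cross_refs, seed)

    def bonus(pid: str) -> float:
        if pid in seeds:
            return 1.0
        return 0.5 if pid in one_hop else 0.0

    ranked = [
        (pid, (1 - alpha) * score + alpha * bonus(pid))
        for pid, score in dense_scores.items()
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    dense = {"A": 0.70, "B": 0.65, "C": 0.60}
    refs = {"S": {"C"}}  # seed provision S cross-references C
    print([pid for pid, _ in kg_rerank(dense, refs, seeds={"S"})])
```

In this toy case, provision C has the lowest dense score but is cross-referenced by the seed, so the graph bonus lifts it above A and B. That is the kind of effect one would expect to matter most on cross-reference-heavy queries.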

