ArXiv TLDR

Towards Neuro-symbolic Causal Rule Synthesis, Verification, and Evaluation Grounded in Legal and Safety Principles

arXiv: 2604.28087

Zainab Rehan, Christian Medeiros Adriano, Sona Ghahremani, Holger Giese

cs.LO, cs.AI

TLDR

This paper introduces a neuro-symbolic framework that uses LLMs to synthesize and verify formal causal rules from high-level human goals, targeting safety-critical rule-based systems such as autonomous driving.

Key contributions

  • Introduces a meta-level layer for neuro-symbolic causal rule synthesis and verification.
  • Employs LLMs to decompose natural language goals into formal first-order causal rules.
  • Features a Rule Verification Engine for syntax, consistency, and safety checks of derived rules (sketched below).
  • Demonstrates successful rule derivation and formalization in two autonomous driving scenarios.
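
The following is a minimal sketch, not the authors' implementation, of what the Rule Verification Engine's three checks could look like over candidate first-order rules. The `Rule` dataclass, the predicate vocabulary, and the traffic-rule examples are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical rule representation: antecedent literals imply a head literal.
@dataclass(frozen=True)
class Rule:
    name: str
    body: tuple   # e.g. ("red_light(x)", "approaching(x)")
    head: str     # e.g. "must_stop(x)"

# Assumed predicate vocabulary (schema) for an autonomous-driving example.
KNOWN_PREDICATES = {"red_light", "approaching", "pedestrian_ahead", "must_stop"}

def predicate(literal: str) -> str:
    """Extract the predicate symbol, ignoring a leading negation '~'."""
    return literal.lstrip("~").split("(")[0]

def check_syntax(rule: Rule) -> bool:
    """Syntax/schema validation: every literal must use a known predicate."""
    return all(predicate(l) in KNOWN_PREDICATES for l in (*rule.body, rule.head))

def check_consistency(rules: List[Rule]) -> bool:
    """Naive consistency analysis: reject rule sets that derive a literal and
    its negation from the same antecedent."""
    heads_by_body = {}
    for r in rules:
        key = frozenset(r.body)
        for h in heads_by_body.get(key, set()):
            if h.lstrip("~") == r.head.lstrip("~") and h != r.head:
                return False
        heads_by_body.setdefault(key, set()).add(r.head)
    return True

def check_safety(rules: List[Rule], invariants: List[str]) -> bool:
    """Safety/invariant check: each required conclusion is derivable by some rule."""
    heads = {r.head for r in rules}
    return all(inv in heads for inv in invariants)

if __name__ == "__main__":
    candidates = [
        Rule("stop_at_red", ("red_light(x)", "approaching(x)"), "must_stop(x)"),
        Rule("yield_pedestrian", ("pedestrian_ahead(x)",), "must_stop(x)"),
    ]
    verified = (all(check_syntax(r) for r in candidates)
                and check_consistency(candidates)
                and check_safety(candidates, ["must_stop(x)"]))
    print("verified" if verified else "rejected")
```

Per the abstract, only rules that pass all three checks are integrated into the knowledge base.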

Why it matters

This work addresses critical challenges in safety-critical rule-based systems, such as goal misspecification, brittleness, and poor scalability. By combining LLM-based synthesis with formal verification, it enables robust, explainable rules grounded in established legal and safety principles, reducing the risk of reward hacking.

Original Abstract

Rule-based systems remain central in safety-critical domains but often struggle with scalability, brittleness, and goal misspecification. These limitations can lead to reward hacking and failures in formal verification, as AI systems tend to optimize for narrow objectives. In previous research, we developed a neuro-symbolic causal framework that integrates first-order logic abduction trees, structural causal models, and deep reinforcement learning within a MAPE-K loop to provide explainable adaptations under distribution shifts. In this paper, we extend that framework by introducing a meta-level layer designed to mitigate goal misspecification and support scalable rule maintenance. This layer consists of a Goal/Rule Synthesizer and a Rule Verification Engine, which iteratively refine a formal rule theory from high-level natural-language goals and principles provided by human experts. The synthesis pipeline employs large language models (LLMs) to: (1) decompose goals into candidate causes, (2) consolidate semantics to remove redundancies, (3) translate them into candidate first-order rules, and (4) compose necessary and sufficient causal sets. The verification pipeline then performs (1) syntax and schema validation, (2) logical consistency analysis, and (3) safety and invariant checks before integrating verified rules into the knowledge base. We evaluated our approach with a proof-of-concept implementation in two autonomous driving scenarios. Results indicate that, given human-specified goals and principles, the pipeline can successfully derive minimal necessary and sufficient rule sets and formalize them as logical constraints. These findings suggest that the pipeline supports incremental, modular, and traceable rule synthesis grounded in established legal and safety principles.
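
The abstract describes a four-step LLM synthesis pipeline coupled to the verifier. Below is a hypothetical sketch of that loop, with the LLM abstracted as a plain text-in/text-out callable; the prompts, function names, and the `verify` callback are illustrative assumptions, not the paper's actual prompts or interfaces.

```python
from typing import Callable, List

# The LLM is abstracted as a text-in/text-out callable so the sketch stays
# independent of any specific provider or API.
LLM = Callable[[str], str]

def decompose_goals(llm: LLM, goal: str) -> List[str]:
    """Step 1: decompose a natural-language goal into candidate causes."""
    reply = llm(f"List candidate causes behind this goal, one per line:\n{goal}")
    return [line.strip() for line in reply.splitlines() if line.strip()]

def consolidate(llm: LLM, causes: List[str]) -> List[str]:
    """Step 2: consolidate semantics and remove redundant causes."""
    reply = llm("Merge semantically redundant causes, one per line:\n" + "\n".join(causes))
    return [line.strip() for line in reply.splitlines() if line.strip()]

def translate_to_rules(llm: LLM, causes: List[str]) -> List[str]:
    """Step 3: translate each cause into a candidate first-order rule."""
    return [llm(f"Write a first-order rule (body -> head) for: {c}") for c in causes]

def compose_causal_set(llm: LLM, rules: List[str]) -> List[str]:
    """Step 4: compose a minimal necessary-and-sufficient set of rules."""
    reply = llm("Keep only the necessary and sufficient rules, one per line:\n"
                + "\n".join(rules))
    return [line.strip() for line in reply.splitlines() if line.strip()]

def synthesize(llm: LLM, goal: str, verify: Callable[[List[str]], bool],
               max_iters: int = 3) -> List[str]:
    """Iteratively refine the rule theory until the verifier accepts it."""
    causes = consolidate(llm, decompose_goals(llm, goal))
    for _ in range(max_iters):
        rules = compose_causal_set(llm, translate_to_rules(llm, causes))
        if verify(rules):
            return rules  # verified rules can join the knowledge base
        causes = consolidate(llm, causes + ["refine: verification failed"])
    return []
```

To run this, a concrete `llm` callable (any chat or completion endpoint wrapped as `str -> str`) and a verifier such as the check functions sketched above would need to be supplied.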
