TRACE: A Metrologically-Grounded Engineering Framework for Trustworthy Agentic AI Systems in Operationally Critical Domains

May 5, 2026 | arXiv:2605.03838

cs.CL, cs.AI, cs.HC

TLDR

TRACE is a new engineering framework for trustworthy agentic AI in critical domains, featuring a layered architecture, metrological trust metrics, and a parsimony principle.

Key contributions

Introduces TRACE, a four-layer engineering framework for trustworthy agentic AI in critical domains.
Features an explicit classical-ML vs. LLM-validator split (L2a/L2b) for deliberate LLM integration.
Incorporates a metrologically grounded trust-metric suite aligned with GUM/VIM/ISO 17025.
Presents the Computational Parsimony Ratio (CPR) as a new design principle for model parsimony.

Why it matters

This paper proposes a structured framework for building trustworthy AI agents in operationally critical sectors. It pairs a layered architecture with measurable trust metrics and a model-parsimony principle, so that design decisions, especially whether to use an LLM at all, are made deliberately and can be quantified.

Original Abstract

We introduce TRACE, a cross-domain engineering framework for trustworthy agentic AI in operationally critical domains. TRACE combines a four-layer reference architecture with an explicit classical-ML vs. LLM-validator split (L2a/L2b), a stateful orchestration-and-escalation policy (L3), and bounded human supervision (L4); a metrologically grounded trust-metric suite mapped to GUM/VIM/ISO 17025; and a Model-Parsimony principle quantified by the Computational Parsimony Ratio (CPR). Three instantiations--clinical decision support, industrial multi-domain operations, and a judicial AI assistant--transfer the same architecture and metrics across principally different governance contexts. The L2a/L2b separation makes the use of large language models a deliberate design decision rather than an architectural default, with parsimony quantified through CPR. TRACE introduces CPR as a first-class design principle in trustworthy-AI engineering.
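
To make the layering concrete, here is a minimal Python sketch of how an L3 policy might route a case through L2a, L2b, and L4. It is an illustration only and not taken from the paper: the class name, the callable signatures, and both confidence thresholds are assumptions, and the abstract does not specify L1, the escalation criteria, or the definition of CPR, so none of those are modeled here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# All names and thresholds below are hypothetical; the abstract only names the
# layers (L2a classical ML, L2b LLM validator, L3 orchestration, L4 human).

@dataclass
class Decision:
    label: str
    confidence: Optional[float]  # None when a human decides
    decided_by: str              # "L2a", "L2b", or "L4"

@dataclass
class Orchestrator:
    classical_model: Callable[[dict], Tuple[str, float]]      # L2a
    llm_validator: Callable[[dict, str], Tuple[str, float]]   # L2b
    human_review: Callable[[dict], str]                       # L4
    l2a_threshold: float = 0.9   # assumed value
    l2b_threshold: float = 0.8   # assumed value
    history: List[Decision] = field(default_factory=list)     # L3 state

    def decide(self, case: dict) -> Decision:
        # Try the classical model first; the LLM is not the default path.
        label, conf = self.classical_model(case)
        if conf >= self.l2a_threshold:
            decision = Decision(label, conf, "L2a")
        else:
            # Escalate to the LLM validator, passing the L2a proposal along.
            label, conf = self.llm_validator(case, label)
            if conf >= self.l2b_threshold:
                decision = Decision(label, conf, "L2b")
            else:
                # Final escalation: a human supervisor makes the call.
                decision = Decision(self.human_review(case), None, "L4")
        self.history.append(decision)
        return decision

    def llm_usage_fraction(self) -> float:
        # Share of recorded decisions that needed L2b or L4.
        # This is a simple bookkeeping figure, NOT the paper's CPR.
        if not self.history:
            return 0.0
        escalated = sum(d.decided_by != "L2a" for d in self.history)
        return escalated / len(self.history)
```

In this sketch the LLM validator runs only when the classical model is not confident enough, which mirrors the abstract's point that LLM use is a deliberate decision rather than an architectural default.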
