ArXiv TLDR

Property-Level Reconstructability of Agent Decisions: An Anchor-Level Pilot Across Vendor SDK Adapter Regimes

arXiv:2605.12078

Oleg Solozobov

cs.SE, cs.AI

TLDR

This paper pilots a property-level method for assessing how well AI agent decisions can be reconstructed after the fact across six vendor SDK regimes, and finds that strict-governance completeness ranges from 42.9% to 85.7%.

Key contributions

  • Evaluates reconstructability of AI agent decisions across six public vendor SDK regimes (plus two comparator columns) by applying the Decision Trace Reconstructor, unmodified, to pinned worked-example anchors.
  • Classifies each Decision Event Schema (DES) property as fully fillable, partially fillable, structurally unfillable, or opaque per regime.
  • Shows that per-property reconstructability already varies between regimes at this scale, with strict-governance completeness falling into three tiers from 42.9% to 85.7%.
  • Identifies one regime-independent gap (the reasoning trace), four regime-dependent gaps, and one Mixed property.
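
The classification and the completeness percentage can be illustrated with a minimal sketch. It is not the paper's Decision Trace Reconstructor: the property names, the seven-property set, and the rule that only fully fillable properties count toward strict completeness are all assumptions made here for illustration. The reported endpoints happen to be numerically consistent with 3 of 7 and 6 of 7, but the summary above does not state the denominator.

```python
from enum import Enum


class Fill(Enum):
    """The four classification outcomes named in the abstract."""
    FULL = "fully fillable"
    PARTIAL = "partially fillable"
    UNFILLABLE = "structurally unfillable"
    OPAQUE = "opaque"


def strict_completeness(cells: dict[str, Fill]) -> float:
    """Share of properties classified as fully fillable.

    Assumption: "strict" means partial, unfillable, and opaque
    properties all count as not reconstructed.
    """
    if not cells:
        raise ValueError("no properties classified")
    full = sum(1 for status in cells.values() if status is Fill.FULL)
    return full / len(cells)


# Hypothetical regime; property names and values are invented for this sketch.
example_regime = {
    "action": Fill.FULL,
    "actor": Fill.FULL,
    "timestamp": Fill.FULL,
    "authority": Fill.PARTIAL,
    "policy": Fill.UNFILLABLE,
    "inputs": Fill.OPAQUE,
    "reasoning_trace": Fill.UNFILLABLE,
}

print(f"{strict_completeness(example_regime):.1%}")  # prints 42.9%
```

Counting only fully fillable properties is the most conservative reading; a looser score could give partial credit, which would raise every regime's figure without necessarily changing the tiering.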

Why it matters

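Reconstructing what an AI agent did after a failure (its action, its authority, the governing policy, and its reasoning) is critical for accountability. This pilot shows that how much of a decision can be reconstructed differs across vendor SDK regimes, and that the reasoning trace is a gap in every regime examined, highlighting the need for standardized logging to support governance. The results are descriptive, single-annotator, and based on one anchor per cell, so they indicate where gaps lie rather than measure them definitively.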

Original Abstract

Agentic AI failures need post-hoc reconstruction: what the agent did, on whose authority, against which policy, and from what reasoning. Cross-regime feasibility remains unmeasured under one property-level schema. We apply the Decision Trace Reconstructor unmodified to pinned worked-example anchors from six public vendor SDK regimes spanning cloud-agent, observability, tool-use, telemetry, and protocol traces, plus two comparator columns. Each Decision Event Schema (DES) property is classified as fully fillable, partially fillable, structurally unfillable, or opaque. Per-property reconstructability of an agent decision already varies between regimes at this anchor scale. Strict-governance-completeness separates into three tiers ranging from 42.9% to 85.7%, yielding one regime-independent gap (reasoning trace), four regime-dependent gaps, and one Mixed property; the pilot is single-annotator, one anchor per cell, descriptive, with outputs checksum-verifiable from a deposited reproducibility package.
