Auditing and Controlling AI Agent Actions in Spreadsheets
Sadra Sabouri, Zeinabsadat Saghi, Run Huang, Sujay Maladi, Esmeralda Eufracio, et al.
TLDR
Pista is a spreadsheet AI agent that provides users with auditable, controllable actions, enabling real-time oversight and intervention during task execution.
Key contributions
- Introduces Pista, an AI agent designed for spreadsheet environments.
- Decomposes AI agent execution into auditable and controllable actions.
- Enables users to intervene and redirect the agent's decision-making at each step.
- Improves user task comprehension, perception of the agent, and sense of co-ownership.
Why it matters
Current AI agents lack transparency, hindering users' ability to oversee complex, multi-step tasks. Pista addresses this by enabling active participation in agent execution, allowing users to detect errors and align agent actions with their intent. This shifts AI oversight from passive post-hoc review to meaningful, real-time collaboration.
Original Abstract
Advances in AI agent capabilities have outpaced users' ability to meaningfully oversee their execution. AI agents can perform sophisticated, multi-step knowledge work autonomously from start to finish, yet this process remains effectively inaccessible during execution, often buried within large volumes of intermediate reasoning and outputs: by the time users receive the output, all underlying decisions have already been made without their involvement. This lack of transparency leaves users unable to examine the agent's assumptions, identify errors before they propagate, or redirect execution when it deviates from their intent. The stakes are particularly high in spreadsheet environments, where process and artifact are inseparable. Each decision the agent makes is recorded directly in cells that belong to and reflect on the user. We introduce Pista, a spreadsheet AI agent that decomposes execution into auditable, controllable actions, providing users with visibility into the agent's decision-making process and the capacity to intervene at each step. A formative study (N = 8) and a within-subjects summative evaluation (N = 16) comparing Pista to a baseline agent demonstrated that active participation in execution influenced not only task outcomes but also users' comprehension of the task, their perception of the agent, and their sense of role within the workflow. Users identified their own intent reflected in the agent's actions, detected errors that post-hoc review would have failed to surface, and reported a sense of co-ownership over the resulting output. These findings indicate that meaningful human oversight of AI agents in knowledge work requires not improved post-hoc review mechanisms, but active participation in decisions as they are made.