ArXiv TLDR

Making AI-Assisted Grant Evaluation Auditable without Exposing the Model

arXiv:2604.25200

Kemal Bicakci

cs.CR · cs.AI · cs.CY · cs.LG

TLDR

A trusted execution environment (TEE) based architecture uses remote attestation to make AI-assisted grant evaluation externally auditable without exposing proprietary models or scoring rubrics, keeping the process accountable.

Key contributions

  • Proposes a TEE-based architecture that makes AI-assisted grant evaluation auditable without exposing proprietary details.
  • Lets an external verifier check which model, rubric, and prompt template were used, without revealing model weights or scoring logic.
  • Produces an "attested evaluation bundle": a signed, timestamped record linking the submission, canonical input, model-and-rubric measurement, and evaluation output (see the first sketch after this list).
  • Adds a canonicalization and sanitization layer that mitigates prompt injection risks from applicant-controlled documents (see the second sketch after this list).
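
The attested evaluation bundle is essentially a signed, timestamped record of hashes. Below is a minimal sketch of what building one could look like, assuming a JSON record signed inside the enclave; the field names are illustrative, and the HMAC key is only a stand-in for the enclave's hardware-backed attestation key (a real deployment would use asymmetric attestation signatures):

```python
import hashlib
import hmac
import json
import time

# Hypothetical stand-in for the TEE's hardware-backed signing key;
# real remote attestation binds signatures to the enclave itself.
ENCLAVE_SIGNING_KEY = b"enclave-held-secret"

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def make_evaluation_bundle(original_submission: bytes,
                           canonical_input: bytes,
                           model_and_rubric_blob: bytes,
                           evaluation_output: str) -> dict:
    """Build a signed, timestamped record linking the submission hash,
    canonical input hash, model-and-rubric measurement, and output."""
    record = {
        "submission_hash": sha256_hex(original_submission),
        "canonical_input_hash": sha256_hex(canonical_input),
        "model_rubric_measurement": sha256_hex(model_and_rubric_blob),
        "output_hash": sha256_hex(evaluation_output.encode()),
        "timestamp": int(time.time()),
    }
    # Sign a canonical serialization so verification is deterministic.
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(ENCLAVE_SIGNING_KEY, payload,
                                   hashlib.sha256).hexdigest()
    return record
```

The point of recording hashes rather than the artifacts themselves is that a verifier can confirm which model, rubric, and input were used without ever seeing the weights or the scoring logic.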
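
A companion sketch of the canonicalization and sanitization idea: normalize the document representation and record suspicious transformations before inference. The specific red flags here (zero-width characters, instruction-like phrasing) are illustrative assumptions, not the paper's concrete rule set:

```python
import re
import unicodedata

# Illustrative patterns an evaluator might flag; the paper does not
# prescribe a specific rule set.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")
INSTRUCTION_LIKE = re.compile(
    r"ignore (all |any )?(previous|prior) instructions|"
    r"you must (score|rate) this", re.IGNORECASE)

def canonicalize(text: str) -> tuple[str, list[str]]:
    """Normalize the document representation and record suspicious
    transformations, to be logged alongside the evaluation bundle."""
    log: list[str] = []
    normalized = unicodedata.normalize("NFKC", text)
    if normalized != text:
        log.append("unicode normalized (NFKC)")
    if ZERO_WIDTH.search(normalized):
        normalized = ZERO_WIDTH.sub("", normalized)
        log.append("zero-width characters removed")
    if INSTRUCTION_LIKE.search(normalized):
        log.append("instruction-like phrasing flagged for review")
    # Collapse whitespace runs so hidden layout tricks hash identically.
    normalized = re.sub(r"\s+", " ", normalized).strip()
    return normalized, log
```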

Why it matters

This paper addresses a critical governance challenge in deploying LLMs for sensitive public-sector tasks such as grant evaluation: the process must remain auditable and contestable, yet publishing the model and rubric would invite gaming. It offers a practical design that balances transparency against the need to protect proprietary models, a balance that matters for building trust and accountability in AI-assisted decision-making.

Original Abstract

Public agencies are beginning to consider large language models (LLMs) as decision-support tools for grant evaluation. This creates a practical governance problem: the model and scoring rubric should not be exposed in a way that allows applicants to optimize against them, yet the evaluation process must remain auditable, contestable, and accountable. We propose a TEE-based architecture that helps reconcile these requirements through remote attestation. The architecture allows an external verifier to check which model, rubric, prompt template, and input representation were used, without exposing model weights, proprietary scoring logic, or intermediate reasoning to applicants or infrastructure operators. The main artifact is an attested evaluation bundle: a signed, timestamped record linking the original submission hash, the canonical input hash, the model-and-rubric measurement, and the evaluation output. The paper also considers a scenario-specific prompt injection risk: applicant-controlled documents may contain hidden or indirect instructions intended to influence the LLM evaluator. We therefore include a canonicalization and sanitization layer that normalizes document representations and records suspicious transformations before inference. We position the design relative to confidential AI inference, attestable AI audits, zero-knowledge machine learning, algorithmic accountability, and AI-assisted peer review. The resulting claim is deliberately narrow: remote attestation does not prove that an evaluation is fair or scientifically correct, but it can make part of the evaluation process externally verifiable.
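
To make the narrow claim concrete: an external verifier holding the published model-and-rubric measurement can check an attested evaluation bundle without learning anything about the weights or rubric. A minimal sketch, continuing the hypothetical bundle format above (HMAC keeps the sketch stdlib-only; real remote attestation would use asymmetric signatures so the verifier never holds a signing key):

```python
import hashlib
import hmac
import json

def verify_bundle(bundle: dict, expected_measurement: str,
                  enclave_key: bytes) -> bool:
    """Check the enclave signature and confirm the attested
    model-and-rubric measurement matches the published one."""
    record = {k: v for k, v in bundle.items() if k != "signature"}
    payload = json.dumps(record, sort_keys=True).encode()
    expected_sig = hmac.new(enclave_key, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected_sig, bundle["signature"])
            and bundle["model_rubric_measurement"] == expected_measurement)
```

A passing check establishes provenance of the process, not fairness or scientific correctness of the score, exactly the paper's deliberately narrow claim.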
