ArXiv TLDR

Inducing Artificial Uncertainty in Language Models

arXiv: 2605.13595

Sophia Hager, Simon Zeng, Nicholas Andrews

cs.CL

TLDR

A new method induces artificial uncertainty in language models on trivially easy data; uncertainty probes trained on this signal are better calibrated on challenging tasks, with minimal loss of performance on easy data.

Key contributions

  • Introduces the problem of inducing artificial uncertainty in language models as a source of supervision for uncertainty quantification.
  • Investigates methods of inducing artificial uncertainty on trivially easy data when challenging data is unavailable at training time (a minimal sketch follows this list).
  • Shows that probes trained on artificial uncertainty outperform probes trained without it at recognizing real uncertainty.
  • Achieves notably higher calibration on hard data with minimal loss of performance on easy data.
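
To make "inducing artificial uncertainty" concrete, one simple construction is to mix a model's confident output distribution with the uniform distribution on easy examples. The function name and mixing scheme below are illustrative assumptions based on this summary, not necessarily the authors' actual method.

```python
import numpy as np

def induce_artificial_uncertainty(probs: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Flatten a confident next-token distribution toward uniform.

    alpha controls how much artificial uncertainty is injected:
    alpha=0 returns the original distribution, alpha=1 the uniform one.
    This mixing scheme is an illustrative assumption, not the paper's
    stated construction.
    """
    uniform = np.full_like(probs, 1.0 / probs.size)
    return (1.0 - alpha) * probs + alpha * uniform

# A model that is (correctly) near-certain on an easy example ...
confident = np.array([0.96, 0.02, 0.01, 0.01])
# ... becomes artificially uncertain, yielding labeled "uncertain"
# training examples for a supervised probe.
print(induce_artificial_uncertainty(confident, alpha=0.6))
```

Any scheme that flattens confident distributions in a controllable way could play the same role of generating labeled "uncertain" examples from easy data.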

Why it matters

Language models need to reliably quantify uncertainty for safety-critical applications. This paper addresses a growing challenge: as LLMs saturate existing datasets, finding unseen challenging data to train supervised uncertainty quantification methods becomes harder. By inducing artificial uncertainty on easy data instead, it offers a way to improve calibration and trustworthiness without relying on scarce challenging data.

Original Abstract

In safety-critical applications, language models should be able to characterize their uncertainty with meaningful probabilities. Many uncertainty quantification approaches require supervised data; however, finding suitable unseen challenging data is increasingly difficult for large language models trained on vast amounts of scraped data. If the model is consistently (and correctly) confident in its predictions, the uncertainty quantification method may consistently overestimate confidence on new and unfamiliar data. Finding data which exhibits enough uncertainty to train supervised uncertainty quantification methods for high-performance models may therefore be challenging, and will increase in difficulty as LLMs saturate datasets. To address this issue, we first introduce the problem of inducing artificial uncertainty in language models, then investigate methods of inducing artificial uncertainty on trivially easy data in the absence of challenging data at training time. We use probes trained to recognize artificial uncertainty on the original model, and find that these probes trained on artificial uncertainty outperform probes trained without artificial uncertainty in recognizing real uncertainty, achieving notably higher calibration on hard data with minimal loss of performance on easy data.
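
As a rough illustration of the probe-and-calibration setup the abstract describes, the sketch below fits a hypothetical linear probe on synthetic "hidden states" labeled with induced uncertainty, then scores it with expected calibration error (ECE), a standard calibration metric. The data, shapes, and the choice of a logistic-regression probe are all assumptions for illustration; the paper's actual probe architecture and training data are not given here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: synthetic stand-ins for model hidden states on easy
# data, with binary labels marking whether artificial uncertainty was induced.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 64))      # (examples, hidden dim)
uncertain_labels = rng.integers(0, 2, size=1000)  # 1 = uncertainty induced

# A linear probe over hidden states, in the spirit of the abstract's
# "probes trained to recognize artificial uncertainty".
probe = LogisticRegression(max_iter=1000).fit(hidden_states, uncertain_labels)
confidences = probe.predict_proba(hidden_states)[:, 1]

def expected_calibration_error(conf, labels, n_bins=10):
    """Standard ECE: weighted average of |empirical frequency - mean
    confidence| over equal-width confidence bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(labels[mask].mean() - conf[mask].mean())
    return ece

print(f"ECE: {expected_calibration_error(confidences, uncertain_labels):.3f}")
```

On real data, the probe would be trained on the model's actual hidden states and evaluated on held-out hard examples, where the paper reports notably higher calibration.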
