Crashing Waves vs. Rising Tides: Preliminary Findings on AI Automation from Thousands of Worker Evaluations of Labor Market Tasks

April 1, 2026 · arXiv:2604.01363

Matthias Mertens, Adam Kuzee, Brittany S. Harris, Harry Lyu, Wensu Li + 4 more

cs.AI, econ.GN

TLDR

This study finds AI automation is a "rising tide," showing continuous, broad-based improvement across thousands of labor tasks, not abrupt surges.

Key contributions

AI automation is primarily a "rising tide" of continuous, broad-based capability growth, not abrupt "crashing waves."
Evaluates AI performance on 3,000+ text-based labor tasks derived from O*NET, using more than 17,000 worker evaluations.
Estimates that in 2024-Q2 AI completed tasks taking humans about 3-4 hours with ~50% success, rising to ~65% by 2025-Q3.
Projects average success rates of 80-95% on most text-related tasks by 2029, at a minimally sufficient quality level, if recent AI capability trends persist.

Why it matters

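This paper provides preliminary empirical evidence that AI automation of labor-market tasks behaves like a gradual, broad-based "rising tide" rather than a series of abrupt surges. Its projections point to 80-95% success rates on most text-related tasks by 2029 if recent trends persist, which can inform policy and economic planning. The authors note that the economic impact depends on organizational adoption, which could take substantially longer.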

Original Abstract

We propose that AI automation is a continuum between: (i) crashing waves where AI capabilities surge abruptly over small sets of tasks, and (ii) rising tides where the increase in AI capabilities is more continuous and broad-based. We test for these effects in preliminary evidence from an ongoing evaluation of AI capabilities across over 3,000 broad-based tasks derived from the U.S. Department of Labor O*NET categorization that are text-based and thus LLM-addressable. Based on more than 17,000 evaluations by workers from these jobs, we find little evidence of crashing waves (in contrast to recent work by METR), but substantial evidence that rising tides are the primary form of AI automation. AI performance is high and improving rapidly across a wide range of tasks. We estimate that, in 2024-Q2, AI models successfully complete tasks that take humans approximately 3-4 hours with about a 50% success rate, increasing to about 65% by 2025-Q3. If recent trends in AI capability growth persist, this pace of AI improvement implies that LLMs will be able to complete most text-related tasks with success rates of, on average, 80%-95% by 2029 at a minimally sufficient quality level. Achieving near-perfect success rates at this quality level or comparable success rates at superior quality would require several additional years. These AI capability improvements would impact the economy and labor market as organizations adopt AI, which could have a substantially longer timeline.
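To get a feel for the arithmetic behind "if recent trends persist", the sketch below extrapolates the two success rates quoted in the abstract (about 50% in 2024-Q2 and about 65% in 2025-Q3) along a straight line in log-odds. The logistic form, the helper names, and the choice to use only these two points are illustrative assumptions; the abstract does not say how the authors produce their projection, so this is not their method.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def quarter_index(year: int, quarter: int) -> int:
    """Count quarters on a single integer axis (hypothetical helper)."""
    return year * 4 + (quarter - 1)

# The two estimates quoted in the abstract for ~3-4 hour tasks.
T0, P0 = quarter_index(2024, 2), 0.50
T1, P1 = quarter_index(2025, 3), 0.65

def extrapolate(year: int, quarter: int) -> float:
    """Assumed model: success rate is linear in log-odds over time."""
    slope = (logit(P1) - logit(P0)) / (T1 - T0)  # log-odds gained per quarter
    steps = quarter_index(year, quarter) - T0
    return inv_logit(logit(P0) + slope * steps)

if __name__ == "__main__":
    for year in (2026, 2027, 2028, 2029):
        print(year, "Q1", round(extrapolate(year, 1), 3))
```

Under these assumptions the curve for 2029 falls inside the 80%-95% band mentioned in the abstract. That band, however, describes average success across most text-related tasks and may rest on a different model, so the agreement should be read as a plausibility check only.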

