Engagement Phenotypes for a Sample of 102,684 AI Mental Health Chatbot Users and Dose-Response Associations with Clinical Outcomes

April 30, 2026

arXiv:2605.00275

Emma C. Wolfe, Ting Su, Olivier Tieleman, Thomas D. Hull, Matteo Malgaroli + 1 more

cs.HC

TLDR

This study identifies five engagement phenotypes for an AI mental health chatbot and links them to clinical outcomes, showing dose-response effects.

Key contributions

Five distinct engagement phenotypes (e.g., Power Users, Early Dropouts) were identified among 102,684 chatbot users.
Significant improvements in depression, anxiety, and social support were observed after 3 weeks of use.
A dose-response relationship was found, with higher engagement linked to greater depression improvement.
Working alliance predicted depression improvement and moderated the engagement-social support relationship.
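The improvements and dose-response gradient listed above are reported in the abstract as standardized effect sizes (Cohen's d). As a rough illustration of what a pre-post d measures, the sketch below divides the mean change by the standard deviation of the change scores. This is only one of several common conventions; the abstract does not say which variant the authors used, and the function name and the scores are made up for illustration.

```python
from statistics import mean, stdev


def paired_cohens_d(pre, post):
    """Pre-post effect size: mean change (post - pre) / SD of the change scores.

    Assumption: this is one common convention, not necessarily the one
    used in the paper. Negative values mean scores went down.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    if len(pre) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [after - before for before, after in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the change scores
    if sd == 0:
        raise ValueError("change scores have zero variance")
    return mean(diffs) / sd


# Made-up PHQ-9 totals for illustration only; not data from the study.
pre_scores = [14, 10, 18, 9, 12, 16]
post_scores = [11, 9, 12, 10, 8, 13]
d = paired_cohens_d(pre_scores, post_scores)
print(f"d = {d:.2f}")
```

On symptom scales such as the PHQ-9 and GAD-7, lower scores are better, so a negative d indicates improvement; on the MSPSS, higher scores are better, so improvement shows up as a positive d.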

Why it matters

This paper offers naturalistic, large-scale evidence on how people actually use an AI mental health chatbot and how that use relates to clinical change. It shows that engagement is multidimensional, that session counts alone are a poor primary engagement metric, and that different outcomes respond to different dimensions of use.

Original Abstract

Background: Conversational AI chatbots are emerging as scalable mental health tools, but little is known about real world engagement or its relationship to clinical outcomes.

Objective: To characterize engagement phenotypes among users of Ash, a purpose-built AI mental health chatbot, and examine associations with clinical change and working alliance.

Methods: K-means clustering across eight behavioral features identified engagement phenotypes among 102,684 users. Subsamples completed the PHQ-9 (n=298), GAD-7 (n=298), and MSPSS (social support; n=194) baseline and 3 weeks; 11,437 users completed baseline Working Alliance Inventory (WAI).

Results: Five engagement phenotypes emerged: Early Dropouts (52.2%), Power Users (1.6%), Intensive Users (4.1%), Weekly Users (25.3%), and a novel Concentrated User pattern (16.8%); across users, 66.9% had at least one overnight session (9pm-5am). Significant pre-post improvements occurred in depression (d = -0.51), anxiety (d = -0.57), and social support (d = 0.22). An observed dose-response gradient in self-reported depression improvement was replicated in a larger sample with model-predicted PHQ-9 (n = 23,813; Power Users d = -0.54; Early Dropouts d = -0.13). Higher working alliance predicted depression improvement and moderated the engagement-social support relationship.

Conclusions: Engagement with AI mental health tools is multidimensional, and different clinical outcomes respond to different dimensions of use. Findings caution against treating session counts as a primary engagement metric and offer naturalistic evidence for the clinical value of purpose-built conversational AI.
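The Methods describe K-means clustering over eight behavioral features. The sketch below shows the general technique (standardize the features, then run Lloyd's algorithm) on a toy dataset. It is not the authors' pipeline: the abstract does not list the eight features, so the three columns used here (sessions, active days, messages per session) are hypothetical, as are the function names, the random initialization, and the choice of k = 2 for the toy data.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each feature column so no single feature dominates distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm with random initial centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the fit has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids


# Toy users: [sessions, active_days, messages_per_session].
# Hypothetical features and made-up values, not data from the study.
users = [
    [1, 1, 4], [2, 1, 6], [1, 1, 3], [2, 2, 5],
    [40, 20, 30], [38, 18, 28], [45, 21, 35], [42, 19, 31],
]
labels, centroids = kmeans(standardize(users), k=2)
print(labels)
```

The study reports a five-cluster solution; the abstract does not describe how that number was selected, so any model-selection step (for example, comparing fits across several values of k) is left out here.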
