Stable Behavior, Limited Variation: Persona Validity in LLM Agents for Urban Sentiment Perception
Neemias B da Silva, Rodrigo Minetto, Daniel Silver, Thiago H Silva
TLDR
LLM personas show stable behavior but limited diversity in urban sentiment perception, often performing similarly to models without personas.
Key contributions
- LLM agents with distinct personas show stable, reproducible urban sentiment judgments.
- Persona-induced variation is limited: gender shows no measurable effect and political orientation only a negligible one.
- Agents exhibit an extremity bias, struggling with fine-grained sentiment categories.
- Simple label-based persona prompting offers limited value, often matched by no-persona models.
Why it matters
This paper challenges the assumption that simple persona prompting significantly enhances LLM diversity for urban sentiment analysis. It highlights that while LLMs are consistent within personas, their ability to capture fine-grained human perception and differentiate across personas is limited. This suggests a need for more sophisticated persona integration methods.
Original Abstract
Large Language Models (LLMs) are increasingly used as proxies for human perception in urban analysis, yet it remains unclear whether persona prompting produces meaningful and reproducible behavioral diversity. We investigate whether distinct personas influence urban sentiment judgments generated by multimodal LLMs. Using a factorial set of personas spanning gender, economic status, political orientation, and personality, we instantiate multiple agents per persona to evaluate urban scene images from the PerceptSent dataset and assess both within-persona consistency and cross-persona variation. Results show strong convergence among agents sharing a persona, indicating stable and reproducible behavior. However, cross-persona differentiation is limited: economic status and personality induce statistically detectable but practically modest variation, while gender shows no measurable effect and political orientation only negligible impact. Agents also exhibit an extremity bias, collapsing intermediate sentiment categories common in human annotations. As a result, performance remains strong on coarse-grained polarity tasks but degrades as sentiment resolution increases, suggesting that simple label-based persona prompting does not capture fine-grained perceptual judgments. To isolate the contribution of persona conditioning, we additionally evaluate the same model without personas. Surprisingly, the no-persona model sometimes matches or exceeds persona-conditioned agreement with human labels across all task variants, suggesting that simple label-based persona prompting may add limited annotation value in this setting.
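The factorial persona design described above can be sketched in a few lines. The attribute levels, prompt wording, and function names below are illustrative assumptions, not the paper's actual implementation:

```python
from itertools import product

# Hypothetical attribute levels mirroring the paper's four persona
# dimensions (gender, economic status, political orientation, personality);
# the exact levels used in the study are assumptions here.
GENDERS = ["woman", "man"]
ECONOMIC = ["low-income", "high-income"]
POLITICS = ["left-leaning", "right-leaning"]
PERSONALITY = ["introverted", "extroverted"]

def build_personas():
    """Enumerate the full factorial grid of persona cells."""
    return [
        {"gender": g, "economic": e, "politics": p, "personality": t}
        for g, e, p, t in product(GENDERS, ECONOMIC, POLITICS, PERSONALITY)
    ]

def persona_prompt(persona, task="Rate the sentiment this urban scene evokes."):
    """Render a simple label-based persona system prompt (illustrative only)."""
    return (
        f"You are a {persona['gender']} who is {persona['economic']}, "
        f"{persona['politics']}, and {persona['personality']}. {task}"
    )

personas = build_personas()
print(len(personas))            # 2x2x2x2 design -> 16 persona cells
print(persona_prompt(personas[0]))
```

Instantiating several agents per cell and comparing their judgments on the same images is what lets the study separate within-persona consistency from cross-persona variation.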