Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior
Ali Aghazadeh Ardebili, Massimo Stella
TLDR
This paper introduces Cognitive Digital Shadows (CDS), a large synthetic corpus for analyzing how LLMs debate societal issues when shadowing human traits.
Key contributions
- Introduces Cognitive Digital Shadows (CDS), a 190,000-record synthetic corpus for LLM discourse analysis.
- Collects responses from 19 LLMs, each prompted to shadow either a human persona or an AI-assistant role, on 4 controversial societal topics.
- Encodes 17 sociodemographic and psychological attributes in persona-conditioned records, linking LLMs' prompts, language, stances, and reasoning.
- Provides a pooling platform with user-friendly dashboards for interactive, group-level comparisons of emotional and semantic framing across personas, topics, and models.
Why it matters
LLMs can strongly shape social discourse, yet datasets that examine how their outputs vary under controlled social and contextual prompting remain sparse. CDS offers both a corpus and a prompting framework for studying that variation, and it supports future audits of LLM bias, social sensitivity, and alignment.
Original Abstract
Large Language Models (LLMs) can strongly shape social discourse, yet datasets investigating how LLM outputs vary across controlled social and contextual prompting remain sparse. Cognitive Digital Shadows (CDS) is a 190,000-record synthetic corpus supporting analyses of LLM-generated discourse. Each CDS record is generated by one of 19 LLMs, prompted to shadow either a human persona or an AI-assistant role. CDS contains LLM responses on 4 controversial societal topics: vaccines/healthcare, social media disinformation, the gender gap in science, and STEM stereotypes. Persona-conditioned records encode 17 sociodemographic and psychological attributes, providing data linking LLMs' prompts, language, stances and reasoning. Texts are validated for topic anchoring and can support emotional analyses via interpretable NLP (e.g. textual forma mentis networks). CDS is enriched by a pooling platform with user-friendly dashboards, enabling easy, interactive group-level comparisons of emotional and semantic framing across personas, topics and models. The CDS prompting framework supports future audits of LLMs' bias, social sensitivity and alignment.
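To make the record structure described in the abstract more concrete, here is a minimal Python sketch of how a CDS-style record could be represented and pooled for group-level comparison. The class name `ShadowRecord`, its field names, the helper `group_texts`, the example attribute keys, and the toy records are all assumptions made for illustration; the actual CDS schema and platform may differ.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ShadowRecord:
    """Hypothetical shape of one record; not the official CDS schema."""
    model: str   # which LLM produced the text
    role: str    # "persona" or "assistant"
    topic: str   # one of the 4 societal topics
    text: str    # the generated response
    # Persona attributes (the paper reports 17); keys here are invented.
    attributes: dict = field(default_factory=dict)


def group_texts(records, keys=("topic", "role")):
    """Pool response texts by the chosen record fields."""
    groups = defaultdict(list)
    for record in records:
        group_key = tuple(getattr(record, k) for k in keys)
        groups[group_key].append(record.text)
    return dict(groups)


# Toy records, invented purely to exercise the grouping logic.
records = [
    ShadowRecord("model-a", "persona", "vaccines/healthcare",
                 "Toy response 1", {"age": 34}),
    ShadowRecord("model-b", "persona", "vaccines/healthcare",
                 "Toy response 2", {"age": 61}),
    ShadowRecord("model-a", "assistant", "STEM stereotypes",
                 "Toy response 3"),
]

groups = group_texts(records)
by_model = group_texts(records, keys=("model",))
```

Each pooled group could then be passed to whatever emotional or semantic analysis is of interest, with the grouping keys swapped to compare across personas, topics, or models.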