ArXiv TLDR

From Beats to Breaches: How Offensive AI Infers Sensitive User Information from Playlists

2605.04724

Stefano Cecconello, Mauro Conti, Luca Pajola, Luca Pasa, Pier Paolo Tricomi

cs.CR, cs.AI

TLDR

This paper introduces musicPIIrate, an Offensive AI tool that infers sensitive user PII from music playlists, and JamShield, a defense against it.

Key contributions

  • Developed musicPIIrate, an Offensive AI tool that infers sensitive PII from public music playlists.
  • Leverages deep learning (Deep Sets, GNNs) to model playlist data for accurate PII prediction.
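  • Outperforms baselines in 9 of 15 attribute inference tasks covering demographics (age, country, gender), habits (alcohol, smoking, sport), and personality traits (OCEAN scores).
  • Introduces JamShield, a lightweight defense that injects dummy playlists into an account and lowers inference F1-scores by an average of 10%.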
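
The set-based idea mentioned above (Deep Sets) can be illustrated with a minimal sketch. This is not the musicPIIrate implementation: the feature vectors, layer sizes, random weights, and the helper names `make_linear`, `linear_tanh`, and `deep_sets_forward` are all assumptions made for illustration. It only shows the general pattern of encoding each playlist independently, pooling with a sum, and mapping the pooled vector to an output, which makes the result independent of playlist order and usable with any number of playlists.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix with n_out rows of n_in weights (illustrative, untrained)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def linear_tanh(weights, x):
    """Bias-free linear layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def deep_sets_forward(playlists, phi_w, rho_w):
    """Encode each playlist with phi, sum-pool over the set, then apply rho."""
    encoded = [linear_tanh(phi_w, p) for p in playlists]
    pooled = [sum(col) for col in zip(*encoded)]  # order-independent pooling
    return linear_tanh(rho_w, pooled)


# Hypothetical user with three playlists, each summarized by 3 made-up features.
rng = random.Random(0)
phi_w = make_linear(3, 4, rng)
rho_w = make_linear(4, 2, rng)
user = [[0.1, 0.2, 0.3], [0.5, 0.1, 0.0], [0.9, 0.4, 0.7]]

scores = deep_sets_forward(user, phi_w, rho_w)
shuffled_scores = deep_sets_forward(list(reversed(user)), phi_w, rho_w)
# Up to floating-point rounding, both calls give the same scores.
```

In a trained model the weights would be learned and the output would feed a classifier for one attribute; the sketch only demonstrates the permutation-invariant structure.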

Why it matters

This work exposes a privacy risk in music streaming: AI models can infer sensitive PII from the public playlists that users routinely share. It provides an initial Offensive AI benchmark for playlist-based PII inference and proposes a lightweight defense with encouraging early results, both of which help in understanding and mitigating AI-driven privacy risks.

Original Abstract

The pervasive integration of AI has enabled Offensive AI: the exploitation of AI for malicious ends across the cyber-kill chain. A critical manifestation is the user attribute inference attack, where AI infers sensitive Personally Identifiable Information (PII) from innocuous public data. We explore how music streaming ecosystems, where users routinely release public playlists, can be exploited for Offensive AI. To quantify this threat, we developed musicPIIrate. This novel tool leverages deep learning architectures that utilize both standalone data representations and the structural information embedded in a user's playlist collection. Our design explores set-based approaches (e.g., Deep Sets) and methodologies modeling relationships between playlists (e.g., Graph Neural Networks), which we also combine to leverage both perspectives. Our approach addresses feature extraction from unordered, variable-length set data, enabling accurate PII prediction. Empirical evaluation demonstrates that musicPIIrate achieves state-of-the-art inference accuracy. The tool successfully infers a wide array of attributes, including: Demographics (Age, Country, Gender), Habits (Alcohol, Smoke, Sport), and Personality Traits (OCEAN scores). musicPIIrate outperforms existing methods, beating baselines in 9 out of 15 attribute inference tasks. To counter this vulnerability, we propose JamShield, a lightweight defensive framework. JamShield strategically injects dummy playlists into an account to dilute the PII-carrying signal. Our analysis indicates that JamShield represents a promising defense, lowering inference F1-scores by an average of 10%. This work provides an initial Offensive-AI benchmark for playlist-based PII inference using architectures that leverage set- and graph-structured data and introduces a defense showing encouraging mitigation effects.
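
As an illustration of the dilution idea behind JamShield described in the abstract, the following is a rough sketch, not the authors' method. The abstract says dummy playlists are injected "strategically" but does not specify how, so uniform random sampling from a dummy pool is used here as a stand-in. The two-dimensional feature vectors and the helper names `inject_dummies` and `mean_vector` are hypothetical.

```python
import random


def inject_dummies(user_playlists, dummy_pool, k, seed=0):
    """Return a new account view: the original playlists plus k sampled dummies.

    Random sampling is an assumption; the real selection strategy is not
    described in the abstract.
    """
    rng = random.Random(seed)
    return list(user_playlists) + rng.sample(dummy_pool, k)


def mean_vector(playlists):
    """Average feature vector of an account, a crude proxy for its 'signal'."""
    n = len(playlists)
    return [sum(col) / n for col in zip(*playlists)]


# Hypothetical account whose playlists all point toward feature 0.
account = [[1.0, 0.0], [1.0, 0.0]]
# Hypothetical dummy pool pointing toward feature 1.
pool = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

protected = inject_dummies(account, pool, k=2)
before = mean_vector(account)    # signal concentrated on feature 0
after = mean_vector(protected)   # signal spread across both features
```

The sketch shows only why added playlists shift what an attacker sees at the account level; how much a real attack degrades depends on the model and on which dummies are chosen.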
