SoK: Analysis of Privacy Risks and Mitigation in Online Propaganda Detection through the PROMPT Framework
Dhiman Goswami, Al Nahian Bin Emran, Md Hasan Ullah Sadi, Sanchari Das
TLDR
This paper introduces PROMPT, a framework for analyzing privacy risks and mitigation strategies in online propaganda detection, revealing compliance gaps and quantifying privacy-utility trade-offs.
Key contributions
- Formalizes privacy risks and mitigation in propaganda detection using the PROMPT framework.
- Introduces a compliance score to audit existing methods against regulations like GDPR/CCPA.
- Shows many current propaganda detection pipelines are non-compliant, especially in metadata handling and user-level aggregation.
- Quantifies privacy-utility trade-offs, showing performance drops with increased privacy.
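The compliance score in the second contribution can be pictured as a checklist audit. The sketch below is a minimal illustration, not the paper's actual metric: the requirement names and the unweighted-fraction scoring are assumptions made for demonstration.

```python
# Hypothetical regulatory checklist items; the paper's real audit criteria
# are not reproduced here.
GDPR_CHECKS = ["lawful_basis", "data_minimization", "right_to_erasure"]
CCPA_CHECKS = ["opt_out_of_sale", "disclosure_of_collection"]

def compliance_score(satisfied: set) -> float:
    """Fraction of checklist requirements a detection pipeline satisfies (0..1)."""
    checks = GDPR_CHECKS + CCPA_CHECKS
    return sum(c in satisfied for c in checks) / len(checks)

# A pipeline that only documents a lawful basis and minimizes data
# collection satisfies 2 of 5 illustrative requirements.
score = compliance_score({"lawful_basis", "data_minimization"})
```

A weighted variant (some requirements mattering more than others) would be a natural extension; the paper's metric may differ in both the checklist and the aggregation.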
Why it matters
Online propaganda detection is crucial but often overlooks user privacy. This paper provides a much-needed framework and tools to build systems that are both effective and compliant with privacy regulations. It quantifies the privacy-performance trade-off, guiding future development.
Original Abstract
Online propaganda detection pipelines expose measurable privacy risks at multiple stages, including data collection, feature extraction, and model inference. We conduct a structured analysis of $162$ peer-reviewed studies and formalize the problem using the Propaganda Risk Online Mitigation and Privacy-preserving Tactics (PROMPT) framework. PROMPT models risks $R$ and mitigation strategies $S$ through a mapping $M: R \to S$ guided by a utility function $\alpha \cdot \mathrm{PrivacyGain}(s_j) - \beta \cdot \mathrm{PerfLoss}(s_j) - \gamma \cdot \mathrm{Cost}(s_j)$, with tunable $(\alpha, \beta, \gamma)$ enabling stakeholders to balance privacy, accuracy, and deployment costs. To assess practical adoption, we introduce a compliance score that quantifies the alignment of existing methods with the requirements of GDPR, CCPA, and related regulations. Our evaluation shows that many widely used pipelines remain non-compliant, particularly in metadata handling and user-level aggregation. We further present empirical fine-tuning experiments on transformer-based encoders and decoders under synthetic perturbation, demonstrating a monotonic privacy-utility trade-off: with $q = 0.05$, performance decreased by 1-2% F$_1$, while at $q = 0.20$ the reduction reached 13-14%. These results establish quantitative baselines for privacy costs in propaganda detection. Our contributions include a formal risk-to-defense mapping, a compliance-oriented auditing metric, and experimental evidence of privacy-performance trade-offs, providing a technical foundation for building regulation-compliant and privacy-aware detection systems.
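The utility-guided mapping $M: R \to S$ from the abstract can be sketched as selecting, for a given risk, the strategy maximizing $\alpha \cdot \mathrm{PrivacyGain} - \beta \cdot \mathrm{PerfLoss} - \gamma \cdot \mathrm{Cost}$. Everything concrete below is an assumption for illustration: the strategy names, their scores in $[0, 1]$, and the default weights are hypothetical, not values from the paper.

```python
def utility(s: dict, alpha: float = 1.0, beta: float = 1.0, gamma: float = 0.5) -> float:
    """PROMPT-style utility: alpha*PrivacyGain - beta*PerfLoss - gamma*Cost."""
    return alpha * s["privacy_gain"] - beta * s["perf_loss"] - gamma * s["cost"]

# Hypothetical mitigation strategies with illustrative scores in [0, 1].
STRATEGIES = {
    "differential_privacy": {"privacy_gain": 0.9, "perf_loss": 0.30, "cost": 0.4},
    "metadata_stripping":   {"privacy_gain": 0.5, "perf_loss": 0.02, "cost": 0.1},
    "federated_learning":   {"privacy_gain": 0.7, "perf_loss": 0.20, "cost": 0.8},
}

def map_risk_to_strategy(strategies: dict, **weights) -> str:
    """M: R -> S — for one risk, pick the strategy maximizing the utility."""
    return max(strategies, key=lambda name: utility(strategies[name], **weights))

best = map_risk_to_strategy(STRATEGIES)
```

Raising $\alpha$ shifts the selection toward stronger privacy mechanisms, while raising $\gamma$ penalizes deployment-heavy options such as federated training; this is the stakeholder tuning the abstract describes.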