When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI
Alfredo Madrid-García, Miguel Rujas
TLDR
Patient-facing medical RAG chatbots can expose sensitive backend configurations and patient conversations, posing significant privacy and security risks.
Key contributions
- Sensitive RAG system configurations and backend endpoints were exposed via client-server communication.
- Browser tools allowed collection of model/embedding config, retrieval parameters, and knowledge-base content.
- The 1,000 most recent patient-chatbot conversations, including health queries, were retrievable without authentication (see the sketch after this list).
- Critical privacy and security failures were identified using standard browser inspection, requiring no specialist skills.
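To make the central finding concrete, here is a minimal, non-destructive sketch of the kind of check described above: asking whether a backend endpoint returns conversation records without any credentials. The base URL, endpoint path, and response shape are hypothetical placeholders, not details of the system actually assessed.

```python
import requests

# Hypothetical base URL and endpoint path -- illustrative placeholders,
# not the deployment assessed in the paper.
BASE_URL = "https://chatbot.example.com"
CONVERSATIONS_URL = f"{BASE_URL}/api/conversations?limit=1000"

# Deliberately send no Authorization header or session cookie: if the
# request still succeeds, conversation records are publicly readable.
response = requests.get(CONVERSATIONS_URL, timeout=10)

if response.ok:
    # Assumes the endpoint returns a JSON array of conversation records
    # (hypothetical response shape).
    records = response.json()
    print(f"Unauthenticated request succeeded: {len(records)} records exposed")
else:
    print(f"Endpoint refused unauthenticated access: HTTP {response.status_code}")
```

The same check needs nothing beyond a browser: the Network tab of standard developer tools shows these requests and responses directly, which is all the paper's manual verification relied on.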
Why it matters
This paper highlights critical privacy and security vulnerabilities in patient-facing medical RAG chatbots, demonstrating how easily sensitive data and system configurations can be exposed. It underscores the urgent need for rigorous, independent security review before deploying generative AI in healthcare: the deployment studied directly contradicted its own privacy assurances.
Original Abstract
Background: Patient-facing medical chatbots based on retrieval-augmented generation (RAG) are increasingly promoted to deliver accessible, grounded health information. AI-assisted development lowers the barrier to building them, but they still demand rigorous security, privacy, and governance controls.

Objective: To report an anonymized, non-destructive security assessment of a publicly accessible patient-facing medical RAG chatbot and identify governance lessons for safe deployment of generative AI in health.

Methods: We used a two-stage strategy. First, Claude Opus 4.6 supported exploratory prompt-based testing and structured vulnerability hypotheses. Second, candidate findings were manually verified using Chrome Developer Tools, inspecting browser-visible network traffic, payloads, API schemas, configuration objects, and stored interaction data.

Results: The LLM-assisted phase identified a critical vulnerability: sensitive system and RAG configuration appeared exposed through client-server communication rather than restricted server-side. Manual verification confirmed that ordinary browser inspection allowed collection of the system prompt, model and embedding configuration, retrieval parameters, backend endpoints, API schema, document and chunk metadata, knowledge-base content, and the 1,000 most recent patient-chatbot conversations. The deployment also contradicted its privacy assurances: full conversation records, including health-related queries, were retrievable without authentication.

Conclusions: Serious privacy and security failures in patient-facing RAG chatbots can be identified with standard browser tools, without specialist skills or authentication; independent review should be a prerequisite for deployment. Commercial LLMs accelerated this assessment, including under a false developer persona; assistance available to auditors is equally available to adversaries.
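To illustrate the class of exposure reported in the Results, the sketch below shows what a browser-visible RAG configuration object of this kind might contain, together with a trivial reviewer-side check. Every field name and value is a hypothetical stand-in for the categories the abstract lists (system prompt, model and embedding configuration, retrieval parameters, backend endpoints); none of it reproduces the assessed system.

```python
# Hypothetical example of a RAG configuration object delivered to the
# browser. Field names and values are illustrative stand-ins for the
# categories reported in the paper, not captured data.
client_visible_config = {
    "system_prompt": "You are a medical assistant. Answer only from...",
    "model": "example-chat-model-v1",            # hypothetical identifier
    "embedding_model": "example-embedder-v2",    # hypothetical identifier
    "retrieval": {"top_k": 5, "similarity_threshold": 0.75},
    "backend_endpoint": "https://chatbot.example.com/api/query",
}

# None of these fields is needed client-side for the chat UI to work;
# their presence in browser-visible traffic reveals how the backend
# is built and prompted.
SENSITIVE_KEYS = {"system_prompt", "model", "embedding_model",
                  "retrieval", "backend_endpoint"}

leaked = SENSITIVE_KEYS & client_visible_config.keys()
if leaked:
    print(f"Server-side configuration exposed to the client: {sorted(leaked)}")
```

The underlying point is architectural: anything delivered to the client is readable by any user with standard browser tools, so prompts, model settings, and retrieval parameters belong behind the server boundary.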