RAG Medical Chatbot Leaks Backend and Patient Chats
Agents
Patient-facing chatbots built on retrieval-augmented generation (RAG) promise grounded answers. This one also handed out its keys. In a live deployment aimed at patients, ordinary browser inspection exposed the system prompt, Large Language Model (LLM) and embedding choices, retrieval parameters, backend endpoints, knowledge-base contents and a dump of the latest 1,000 conversations. No login, no specialist tools.
How the exposure worked
The assessment ran in two passes. First, a commercial LLM (Claude Opus 4.6) helped shape probes, including a false developer persona to nudge “debug” output. Then came manual verification using Chrome Developer Tools. Watching the network traffic and cracking open JSON responses was enough. The frontend received configuration objects that belonged on the server: operative prompt, active LLM backends, embedder identifiers, similarity thresholds, chunking settings and more.
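To make the failure concrete, a client-delivered configuration payload of this kind might look roughly like the sketch below. Every field name and value here is hypothetical, chosen only to illustrate the class of data that should never leave the server, not copied from the assessed system.

```typescript
// Hypothetical shape of a RAG configuration object delivered to the browser.
// None of these field names or values come from the assessed deployment; they
// only illustrate the kind of data that belongs server-side.
interface LeakedRagConfig {
  systemPrompt: string;        // the operative system prompt
  llmBackend: string;          // active chat-model identifier
  embeddingModel: string;      // embedder used for retrieval
  similarityThreshold: number; // minimum score for a chunk to be returned
  topK: number;                // chunks retrieved per query
  chunkSize: number;           // knowledge-base chunking parameters
  chunkOverlap: number;
  apiBaseUrl: string;          // backend endpoint root, visible to any visitor
}

// Anything shaped like this in a Network-tab response is a red flag: the
// client only needs the answer text, not the pipeline that produced it.
const leaked: LeakedRagConfig = {
  systemPrompt: "You are a medical assistant. Answer only from the provided context...",
  llmBackend: "example-chat-model-v1",
  embeddingModel: "example-embedder-v2",
  similarityThreshold: 0.75,
  topK: 5,
  chunkSize: 512,
  chunkOverlap: 64,
  apiBaseUrl: "https://chatbot.example.org/api",
};
```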
From those payloads you could pull the API schema and concrete endpoints. Because the browser already had the right origin, you could call those endpoints from the console as any visitor. The knowledge base was enumerable through unauthenticated surfaces: filenames, UUIDs, chunk IDs, full chunk text, similarity scores and embedder model identifiers. Reconstructing original documents was a matter of concatenating chunks.
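A rough sketch of how little effort that reconstruction takes from the browser console, assuming hypothetical endpoint paths and response fields standing in for whatever the leaked API schema actually describes:

```typescript
// Run from the browser console on the chatbot's own origin: no credentials
// beyond "being a visitor" are needed. The endpoint path and response fields
// are invented stand-ins for the exposed schema.
async function reconstructDocument(documentId: string): Promise<string> {
  // Enumerate all chunks of one knowledge-base document.
  const res = await fetch(`/api/documents/${documentId}/chunks`);
  const chunks: { position: number; text: string }[] = await res.json();

  // Concatenating chunks in order recovers the original text, give or take
  // some duplication where chunk boundaries overlap.
  return chunks
    .sort((a, b) => a.position - b.position)
    .map((chunk) => chunk.text)
    .join("\n");
}
```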
Worse, a public interface returned the most recent 1,000 patient–chatbot conversation records. That included user questions, model responses, timestamps and keyword tags. The public description said chat histories were not retained; the live system kept full transcripts for months and served them up to anyone who asked in the right, very simple way. After a confidential report in April 2026, the deployment was taken offline.
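The same pattern covers the stored chats. A hedged illustration, with invented path, parameter and field names, of what an unauthenticated history endpoint amounts to:

```typescript
// Hypothetical sketch of the conversation-history exposure: an unauthenticated
// endpoint returning the most recent stored transcripts. The path, query
// parameter and field names are invented for illustration.
interface StoredConversation {
  question: string;   // what the patient asked
  response: string;   // what the model answered
  timestamp: string;  // when the exchange happened
  tags: string[];     // keyword classification applied by the system
}

async function dumpRecentConversations(limit = 1000): Promise<StoredConversation[]> {
  const res = await fetch(`/api/conversations/recent?limit=${limit}`);
  return res.json();
}
```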
Why it matters
This is not a niche side channel. It is what happens when teams ship RAG by pushing secrets and state to the client for convenience, then hope no one opens DevTools. Once endpoints and schemas are visible, scraping and replay are trivial. With chat transcripts in play, you are not just leaking prompts; you are exposing people asking about diagnoses, medications and fears.
The skill bar here is low: a browser and curiosity. A general-purpose LLM sped up hypothesis generation and even played along with a fake developer persona. That help is not exclusive to auditors. As long as configuration and data ride the client, attackers will ride along.
Additional analysis of the original arXiv paper
📋 Original Paper Title and Abstract
When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI
🔍 ShortSpan Analysis of the Paper
Problem
This paper reports a non‑destructive, anonymised security assessment of a publicly accessible patient‑facing medical chatbot built with a retrieval‑augmented generation architecture. It examines whether sensitive backend configuration, knowledge‑base content and stored patient conversations were exposed through ordinary client‑side interactions and why such exposures matter for privacy, trust and regulatory compliance in health AI.
Approach
The authors used a two‑stage read‑only assessment. First, a commercial LLM (Claude Opus 4.6) supported exploratory prompt‑based testing and hypothesis generation, including prompt‑injection style probes framed under a false developer persona. Second, candidate findings were manually verified using standard Chrome Developer Tools by inspecting browser‑visible network traffic, request and response payloads, configuration objects, API schemas and stored interaction endpoints. Only information observable via normal browser use and without additional authentication was documented. Vulnerabilities were reported confidentially to the developers in April 2026 and the deployment was taken offline.
Key Findings
- A critical architectural vulnerability exposed sensitive RAG configuration to the client: ordinary browser inspection revealed the operative system prompt, active LLM backends, embedding model identifier, retrieval strategy, similarity thresholds, chunking parameters and other pipeline configuration fields.
- Backend endpoints and API schema details were discoverable from client‑side payloads, and those endpoints could be queried from the browser console using only the same origin and credentials available to any visitor.
- The knowledge base was fully enumerable via unauthenticated administrative surfaces: original filenames, UUIDs, per‑chunk identifiers, full chunk text, similarity scores and embedder model identifiers were retrievable and full document text could be reconstructed by concatenating chunks.
- Stored conversations were exposed: a public interface returned the most recent 1,000 patient–chatbot conversation records including user questions, model responses, timestamps and keyword classifications, all retrievable without authentication and persistent across months of operation.
- The live deployment contradicted published privacy descriptions: although the public description claimed no chat history retention, full transcripts and metadata were stored and accessible.
- No specialist tooling or authentication was required; a general‑purpose LLM accelerated hypothesis generation and testing, and in this case accepted prompts framed as developer debugging rather than refusing them.
Limitations
The study is a single anonymised case study and its findings are not presented as automatically generalisable to all RAG systems. The assessment was non‑destructive and read‑only; potentially sensitive raw contents, endpoint names and reproduction materials were intentionally omitted. The authors did not attempt to deanonymise users or access server logs.
Implications
Offensive security implications are direct: an attacker with only a browser and access to a consumer LLM could extract system prompts, pipeline configuration, backend endpoints, knowledge‑base content and recent patient queries, then exfiltrate sensitive health information or decontextualise clinical material. Exposed conversation records could be used to profile, target or manipulate vulnerable individuals and undermine patient trust and regulatory compliance. The case emphasises that LLM assistance is dual use: the same capabilities that speed auditing can enable adversaries, and increasingly capable automated tools will broaden the attack surface unless such deployments enforce server‑side controls and restrict client‑visible information.
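A minimal sketch of the server-side control the paper points towards, assuming an Express-style backend: the browser submits only the question, while the system prompt, retrieval parameters, knowledge-base access and stored history stay behind an authenticated boundary. retrieveChunks, generateAnswer and requireSession are hypothetical stand-ins, not the deployment's real components.

```typescript
import express, { Request, Response, NextFunction } from "express";

// Server-side only: none of this is ever serialised into a client response.
const SYSTEM_PROMPT = "You are a careful medical information assistant...";
const RETRIEVAL = { topK: 5, similarityThreshold: 0.75 };

// Stubs standing in for a real vector-store query and LLM call, both of which
// run on the server and never expose their configuration to the browser.
async function retrieveChunks(question: string, opts: typeof RETRIEVAL): Promise<string[]> {
  return []; // query the knowledge base here
}
async function generateAnswer(systemPrompt: string, context: string[], question: string): Promise<string> {
  return "..."; // call the LLM here
}

function requireSession(req: Request, res: Response, next: NextFunction) {
  // Placeholder check: refuse anonymous callers instead of serving everyone.
  if (!req.headers.authorization) {
    res.status(401).end();
    return;
  }
  next();
}

const app = express();
app.use(express.json());

app.post("/chat", requireSession, async (req: Request, res: Response) => {
  const question: string = req.body?.question ?? "";
  const context = await retrieveChunks(question, RETRIEVAL);
  const answer = await generateAnswer(SYSTEM_PROMPT, context, question);
  // The client receives the answer text and nothing about the pipeline.
  res.json({ answer });
});

app.listen(3000);
```

The design choice is the point rather than the framework: whatever the stack, the client-visible surface should carry questions in and answers out, and nothing else.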