Agent LLMs Easily Re-identify Interview Participants
Agents
On 4 December 2025 Anthropic published Interviewer, a tool and an accompanying dataset of 1,250 qualitative interviews with professionals, including 125 scientists. A new analysis focuses on the scientist subset and tests whether modern web‑enabled agentic large language models (LLMs) can deanonymise participants by linking interview content to their published work. The short answer is yes, and more easily than many researchers would like.
The researcher applied a two‑step approach. First, a model flagged transcripts that discussed published work. Then, for the 24 scientist transcripts that mentioned at least one project, a second model acting as a web‑capable agent searched the internet, cross‑checked details and returned ranked candidate matches with confidence ratings. Seven transcripts received a very high confidence match; manual checks confirmed six genuine re‑identifications. Reported costs were under $0.50 per transcript, with runtimes of about four minutes per attempt. The experiments used off‑the‑shelf LLMs with web access and a no‑code consumer variant, showing the attack is low‑effort and accessible.
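To make the shape of the pipeline concrete, here is a minimal sketch of how such a two‑step check could be wired together; data custodians could run essentially the same check on their own transcripts before release to flag at‑risk material. Everything here is illustrative rather than the paper's actual implementation: ask_model is a hypothetical stand‑in for any web‑enabled LLM API, and the prompts, confidence labels and function names are assumptions.

```python
import json
from dataclasses import dataclass

# Hypothetical stand-in for any web-enabled LLM API; returns the model's text reply.
def ask_model(prompt: str, web_search: bool = False) -> str:
    raise NotImplementedError("wire this to the LLM provider of your choice")

CONFIDENCE_LEVELS = ["very low", "low", "medium", "high", "very high"]

@dataclass
class Candidate:
    title: str
    confidence: str  # one of CONFIDENCE_LEVELS
    rationale: str

def mentions_published_work(transcript: str) -> bool:
    """Step 1: cheap classification pass - does the interviewee discuss specific published work?"""
    reply = ask_model(
        "Does this interview transcript mention a specific published research project? "
        "Answer yes or no.\n\n" + transcript
    )
    return reply.strip().lower().startswith("yes")

def rank_candidate_publications(transcript: str) -> list[Candidate]:
    """Step 2: a web-enabled agent proposes ranked candidate publications with confidence ratings."""
    prompt = (
        "Search the web for publications matching the project described in this transcript. "
        "Return a JSON list of objects with keys title, confidence "
        "(one of: " + ", ".join(CONFIDENCE_LEVELS) + ") and rationale.\n\n" + transcript
    )
    return [Candidate(**c) for c in json.loads(ask_model(prompt, web_search=True))]

def flag_likely_reidentifications(transcripts: list[str]) -> dict[int, list[Candidate]]:
    """Run both steps over a corpus; very-high-confidence candidates go to a human for verification."""
    flagged: dict[int, list[Candidate]] = {}
    for i, transcript in enumerate(transcripts):
        if not mentions_published_work(transcript):
            continue
        hits = [c for c in rank_candidate_publications(transcript) if c.confidence == "very high"]
        if hits:
            flagged[i] = hits
    return flagged
```

The point of the sketch is how little orchestration is involved: the capability lives in the model and the search index, not in the surrounding code.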
Why this matters is straightforward. Organisations that publish qualitative datasets generally rely on redaction and anonymisation to protect participants. Those protections assumed adversaries would need domain expertise and time to piece together clues. The arrival of agentic LLMs that can decompose a deanonymisation problem into innocuous web searches and summarisation steps changes that assumption. Seemingly harmless details in a transcript, such as a methodology note, a timeline or a unique project description, can be enough for an agent to propose a likely match. That opens the door to practical harms: reputational damage, doxxing, or a chilling effect where experts decline to take part in interviews because they fear exposure.
Practical trade-offs
This is not a doom‑laden claim that all anonymised qualitative releases are instantly useless. The attack succeeded on a minority of transcripts: six confirmed cases out of the 24 that mentioned publications. But the low cost, speed and accessibility matter. As agents improve and search indexes expand, success rates will rise. Equally, removing too much context to prevent re‑identification destroys dataset utility. Data custodians face a classic privacy versus usefulness trade‑off, now with smarter attackers in the loop.
What to do next
There are practical steps organisations should take. First, treat the presence of public project details in transcripts as a high risk and reassess consent and disclosure language. Second, tighten release controls: consider gated access, stricter vetting, or embargoing transcripts until related publications are public. Third, apply targeted sanitisation that removes or generalises unique identifiers rather than blunt redaction. Finally, monitor the threat landscape: limit automated web access for internal agent tools and require logging and human oversight for searches that could aggregate identifying signals.
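As a concrete illustration of the last point, an internal agent framework could wrap its web‑search tool so that every query is logged and queries that touch interview material require explicit human sign‑off. This is a minimal sketch under stated assumptions: search_web, the marker list and the approval mechanism are hypothetical placeholders to adapt to your own stack, not any particular framework's API.

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.web_search.audit")

# Hypothetical underlying search tool exposed to the agent.
def search_web(query: str) -> list[str]:
    raise NotImplementedError("wire this to your search backend")

# Crude illustrative markers; a real deployment would use richer policy rules.
SENSITIVE_MARKERS = ("interview", "transcript", "participant")

def guarded_search(query: str, approved_by: str | None = None) -> list[str]:
    """Log every agent-issued search; block sensitive queries until a named human approves them."""
    audit_log.info("%s query=%r approved_by=%s",
                   datetime.now(timezone.utc).isoformat(), query, approved_by)
    if any(marker in query.lower() for marker in SENSITIVE_MARKERS) and approved_by is None:
        raise PermissionError("query touches interview material; human approval required")
    return search_web(query)
```

The design choice is to make aggregation of identifying signals visible and interruptible, rather than trying to guess user intent from any single query.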
In short, this research reminds custodians that anonymisation assumptions have to be reworked for an era of agentic LLMs. The problem is manageable, but it needs deliberate policy changes, better tooling and an acceptance that the old rules for releasing qualitative data no longer hold without extra precautions.
Additional analysis of the original arXiv paper
📋 Original Paper Title and Abstract
Agentic LLMs as Powerful Deanonymizers: Re-identification of Participants in the Anthropic Interviewer Dataset
🔍 ShortSpan Analysis of the Paper
Problem
The study investigates privacy risks in releasing rich qualitative data for research when modern large language models with web access and agentic capabilities can deanonymise participants. It focuses on the Anthropic Interviewer dataset released in December 2025, which contains 1,250 interviews with professionals, including 125 scientists. The analysis concentrates on the scientist subset to assess whether off-the-shelf LLM-based agents can link interviews to specific scientific works, thereby recovering authors or uniquely identifying interviewees even when data are anonymised. The work argues that such re-identification is feasible with widely available tools and a small number of natural language prompts, potentially bypassing safeguards and enabling doxxing or other harms. The author outlines the attack at a high level, discusses implications for releasing qualitative data in the era of LLM agents, and offers mitigation ideas and open problems, having notified Anthropic of the findings.
Approach
The method relies on two main steps applied to the scientist transcripts. First, a non-thinking model labels whether interviewees discuss specific published work and counts how many distinct published projects are mentioned, yielding a subset of transcripts that reference at least one publication. Second, a thinking-model agent with web search capabilities searches for candidate publications that match the described project and returns a ranked list with a discrete confidence rating (very low to very high) plus brief rationales describing alignment or misalignment. This process is repeated until no new very high confidence matches are found, and a very high rating is treated as a potential re-identification. The attack is demonstrated using both API-based LLMs with web access and a no-code variant using consumer web-enabled LLM services. Operational details intended to scale the attack or circumvent safeguards are intentionally omitted. The experiments rely on publicly available internet information and the use of web-augmented LLMs to search, retrieve and summarise relevant publications. Reported costs are under USD 0.50 per transcript, with runtimes of about four minutes per attempt. The author also notes that the method lowers the technical barrier and can be executed by individuals with access to LLM agents and basic prompts.
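The stopping rule described above can be expressed compactly. The sketch below is illustrative only; it assumes a rank_candidate_publications helper like the one sketched earlier in this article, returning candidates with a discrete confidence field, and is not the paper's code.

```python
def search_until_stable(transcript: str, rank_candidate_publications, max_rounds: int = 5) -> set[str]:
    """Repeat the agentic search until a round yields no new 'very high' confidence matches."""
    seen_titles: set[str] = set()
    for _ in range(max_rounds):
        new_hits = [
            c for c in rank_candidate_publications(transcript)
            if c.confidence == "very high" and c.title not in seen_titles
        ]
        if not new_hits:
            break  # no new very-high-confidence candidates: stop searching
        seen_titles.update(c.title for c in new_hits)
    return seen_titles
```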
Key Findings
- Six of twenty-four scientist interviews that mention at least one publication could be linked to the corresponding publication, enabling recovery of authors and, in some cases, unique identification of the interviewee.
- Seven transcripts yielded very high confidence matches from the LLM, and manual verification confirmed six re-identifications as valid, with the matches aligning on methodology, contributions, timelines and team structures beyond superficial wording.
- In one case where the LLM assigned very high confidence but was not counted as a success, the described project appeared likely to be under review, though the publication landscape still showed a highly overlapping set of authors, indicating probable re-identification once the relevant paper is published.
- The re-identification can be achieved with low resource use, with costs under half a US dollar per transcript and a runtime of around four minutes, and the no-code variant using consumer LLM services proved similarly successful.
- The study underscores that safeguards can be bypassed by decomposing the re-identification into benign tasks, leveraging the dual use of information retrieval tools and the unverifiability of user intent, and highlights that redacted information in transcripts can still be recovered in some cases.
Limitations
The analysis is conducted at a high level and does not provide operational prompts or steps that would enable replication or scaling of the attack. It concentrates on the scientist subset of the dataset, with 24 transcripts mentioning a publication and six successful re-identifications, so results may differ with broader sampling or different domains. The method relies on public information and current-generation LLMs with web access; improvements in models or access to paywalled or internal data could increase risk. Some re-identifications were not counted as successes when the project appeared to be at the review stage, though substantial overlap with author sets suggests ongoing risk. The author also withholds procedural specifics to avoid facilitating misuse.
Why It Matters
The work raises privacy and surveillance concerns about releasing qualitative data in the age of powerful AI agents. It highlights potential harms including unexpected exposure, emotional distress, reputational damage, relationship harms, and possible policy violations, along with the possibility of chilling effects on sharing qualitative data. The findings point to the need for stronger data anonymisation, more robust access controls, safer agent design (for example restricted web access and auditing), and clearer consent practices. The paper offers mitigation ideas and open problems, including how to balance privacy and data utility in qualitative releases, how to inform participants of residual re-identification risks, how to harden safeguards against ambiguous user intent, and how to guide responsible data sharing in AI-assisted workflows. The author advocates responsible disclosure and outlines steps such as dataset takedown or temporary hiding and re-consent, while noting ongoing data collection and the potential for future datasets to pose similar risks as AI capabilities advance. The discussion emphasises practical implications for researchers, data custodians and policy makers seeking to protect participant privacy without unduly hampering valuable qualitative research.