LLMs Mislead XR Devices in New Study

Attacks

Published: Fri, Sep 19, 2025 • By Theo Solander

New research demonstrates that integrating Large Language Models (LLMs) into extended reality (XR) systems opens a novel attack surface. Attackers can alter the public context around legitimate model queries to produce misleading visuals or sounds, risking user safety and privacy. The work shows real proof‑of‑concept attacks and suggests practical mitigations for developers and platforms.

A recent analysis finds that coupling Large Language Models (LLMs) with extended reality (XR) systems creates practical and repeatable security risks. Researchers show that third party code or environmental inputs can tamper with the context an LLM uses, leading the device to display or speak incorrect information that can confuse users, leak data or create safety hazards.

The finding matters because XR devices increasingly rely on LLMs for scene understanding, instruction following and on‑the‑fly content generation. For security teams and decision makers the result is a new, compositional attack surface: compromises do not need to break the model or the operating system, they only need to influence what the model sees as context.

How the attacks work

In the experiments researchers map common system patterns and show end‑to‑end exploits across multiple commercial XR stacks and models. The common thread is a manipulative actor who injects or modifies public context around an otherwise legitimate prompt. That can be a poisoned metadata field, a changed environmental label, timing tricks, or external code that alters prompts before the model runs.

The effects include denial of service for AR overlays, misleading visual or auditory cues that impair situational awareness, user interface manipulation, and avenues for covert data exfiltration. The attacks exploit system design choices: large shared contexts, mutable public inputs, and tight coupling between semantic model output and rendering code.

The researchers offer an initial defence prototype and a set of practical countermeasures. Recommended approaches include integrity checks for contextual inputs, prompt hardening and keeping system prompts private, isolating modelling prompts from rendering prompts, robust input/output validation, and platform‑level protections that restrict what third party libraries can alter.

There are limits. The study exercises representative stacks and single‑user scenarios; it does not cover every vendor mitigation or long‑term human factors. The demonstrations are proof of concept rather than wide‑scale field studies.

For practitioners the takeaway is straightforward: treat LLM context as part of your trusted computing base, and minimise the attack surface by hardening context, validating outputs, and applying platform controls. Vendors must follow: this is an architectural problem that needs platform‑level mitigations, not just model tweaks.

Expect more research and tooling to follow. Until then, developers should assume that anything public facing that feeds an LLM can be adversarial and design accordingly.

Additional analysis of the original ArXiv paper

📋 Original Paper Title and Abstract

Evil Vizier: Vulnerabilities of LLM-Integrated XR Systems

Authors: Yicheng Zhang, Zijian Huang, Sophie Chen, Erfan Shayegani, Jiasi Chen, and Nael Abu-Ghazaleh

Extended reality (XR) applications increasingly integrate Large Language Models (LLMs) to enhance user experience, scene understanding, and even generate executable XR content, and are often called "AI glasses". Despite these potential benefits, the integrated XR-LLM pipeline makes XR applications vulnerable to new forms of attacks. In this paper, we analyze LLM-Integated XR systems in the literature and in practice and categorize them along different dimensions from a systems perspective. Building on this categorization, we identify a common threat model and demonstrate a series of proof-of-concept attacks on multiple XR platforms that employ various LLM models (Meta Quest 3, Meta Ray-Ban, Android, and Microsoft HoloLens 2 running Llama and GPT models). Although these platforms each implement LLM integration differently, they share vulnerabilities where an attacker can modify the public context surrounding a legitimate LLM query, resulting in erroneous visual or auditory feedback to users, thus compromising their safety or privacy, sowing confusion, or other harmful effects. To defend against these threats, we discuss mitigation strategies and best practices for developers, including an initial defense prototype, and call on the community to develop new protection mechanisms to mitigate these risks.

🔍 ShortSpan Analysis of the Paper

Problem

Extended reality XR applications increasingly integrate large language models LLMs to enhance user experience, scene understanding and even generate executable XR content. This convergence creates new security risks as the XR LLM pipeline can be manipulated by attackers who alter the public context surrounding legitimate LLM queries, leading to erroneous visual or auditory feedback that jeopardises user safety or privacy. The work defines an evil vizier threat model and demonstrates real world proof of concept attacks across multiple devices and models, while proposing mitigation strategies and best practices for developers.

Approach

The authors survey LLM integrated XR systems and categorise them by purpose and system attributes, then build a unified threat model. They perform end to end proof of concept attacks on four XR systems using different LLMs, including Meta Quest 3, Meta Ray Bans, Android based XR and Microsoft HoloLens 2 with Llama and GPT models. The analysis concentrates on client side threats where a third party library with seemingly legitimate functionality can influence LLM responses by manipulating public context or environmental inputs. An initial defence prototype and a set of mitigation strategies are discussed, emphasising context integrity checks, prompt hardening, isolation between modelling and rendering prompts, robust input output validation and platform level protections.

Key Findings

Demonstrated end to end proof of concept attacks on four XR platforms showing that attackers can manipulate public context around legitimate LLM queries to cause denial of service, user confusion and misleading outputs that affect safety and perception.
Attack categories include adversarial prompt injection via environmental changes, timing and prompt manipulation in real time, manipulation of metadata in XR objects and pro external code generation attacks that lead to user interface attacks and data exfiltration.
Despite different platform implementations the attacks share a common vulnerability pattern under a unified threat model in which third party libraries can influence LLM inputs or outputs by tampering with surrounding context or environment.

Limitations

Measurements cover four representative stacks and single user task centric scenarios; the study does not evaluate the full XR ecosystem, all hardware or operating systems, or vendor based mitigations. Defence prototypes are system level patterns rather than vendor integrated protections. Human factors and longitudinal field risks are not explored.

Why It Matters

The work exposes a new attack surface arising from LLM integrated XR devices, including manipulation of audiovisual feedback and generated content that can mislead, cause safety issues or invade privacy. It provides practical mitigation guidance for developers including avoiding public event triggers, keeping system prompts private and defensive, separating semantic from pixel based content, and employing the latest capable LLMs with safeguards. The authors call on the community to develop additional protections to secure next generation LLM integrated XR systems and mitigate real world risks to users and society.

Attribution Original paper on arXiv