Block Rogue AI Agents with Context-Aware Policies
Agents
A large language model (LLM) is a neural network trained to predict text; computer-use agents harness that capability to drive tools that operate a computer. That convenience creates a real risk: agents acting with user-level permissions can execute irreversible or unwanted actions when their intent inference goes wrong or their prompts are manipulated. The paper proposes CSAgent, a system-level, static policy framework that ties policies to both declared intent and runtime context rather than relying on repeated user confirmations or ad hoc runtime checks.
How CSAgent works
CSAgent defines intent- and context-aware policies in a readable JSON format and enforces them via an operating system service. A development toolchain helps generate initial policies by analysing applications: for graphical user interfaces it extracts GUI event handlers and builds call graphs, while for command-line and API usage it relies on existing specifications. At runtime the service maintains a context vector and validates each agent function call against the relevant policy. If a check fails, the system can fall back to user interaction or contextual guidance so work is not lost.
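The validation step can be sketched in Python. This is a minimal illustration, not the paper's actual schema: the policy keys, operator names, and the `validate_call` helper are all hypothetical, standing in for policies indexed by (function, declared intent) with constraints checked against the runtime context vector.

```python
# Hypothetical policy table: rules are indexed by (function, declared intent)
# and list constraints over context values, plus human-readable guidance.
POLICIES = {
    ("files.delete", "clean_downloads"): {
        "constraints": [
            {"context": "cwd", "op": "startswith", "value": "/home/user/Downloads"},
            {"context": "confirmed_by_user", "op": "eq", "value": True},
        ],
        "guidance": "Deletion is only allowed inside the Downloads folder.",
    }
}

def check_constraint(ctx_vector, c):
    """Evaluate one constraint against the current context vector."""
    actual = ctx_vector.get(c["context"])
    if c["op"] == "eq":
        return actual == c["value"]
    if c["op"] == "startswith":
        return isinstance(actual, str) and actual.startswith(c["value"])
    return False  # unknown operator: fail closed

def validate_call(function, intent, ctx_vector):
    """Return (allowed, guidance). Unknown (function, intent) pairs fail closed."""
    policy = POLICIES.get((function, intent))
    if policy is None:
        return False, "No policy for this function under the declared intent."
    if all(check_constraint(ctx_vector, c) for c in policy["constraints"]):
        return True, ""
    return False, policy["guidance"]
```

Failing closed on unknown pairs mirrors the fallback behaviour the paper describes: a denied call is routed back to the user with guidance rather than silently executed.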
The approach blends static policy with dynamic context by indexing rules by function and declared intent. The authors report that an LLM-based automated context analyser finds substantially more GUI elements than prior methods, yielding richer policy inputs. Context spaces and policies are stored in JSON, and the implementation uses caching and categorises contexts by update frequency to keep enforcement fast.
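The frequency-based caching idea can be sketched as follows. The tier names ("hot" vs "cold") come from the paper, but the `ContextManager` class, its API, and the eviction policy details here are illustrative assumptions: frequently changing contexts bypass the cache, while stable ones are served from an LRU cache.

```python
from collections import OrderedDict

class ContextManager:
    """Sketch of an update-frequency-aware context cache (assumed design)."""

    def __init__(self, capacity=128):
        self.cache = OrderedDict()  # LRU cache for cold/warm contexts
        self.capacity = capacity
        self.tiers = {}             # context name -> "hot" | "warm" | "cold"

    def register(self, name, tier):
        self.tiers[name] = tier

    def get(self, name, fetch):
        """fetch() reads the live value; hot contexts bypass the cache."""
        if self.tiers.get(name) == "hot":
            return fetch()
        if name in self.cache:
            self.cache.move_to_end(name)    # mark as recently used
            return self.cache[name]
        value = fetch()
        self.cache[name] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return value
```

The point of the split is that re-reading every context on every function call would dominate enforcement latency; only the contexts that actually change often pay that cost.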
On performance and effectiveness the paper reports CSAgent defends against more than 99.36% of attacks in the AgentDojo benchmark while introducing roughly 6.83% average latency overhead and a 9.33% decline in task utility. About 80% of context spaces load within five seconds in their measurements. These figures suggest the approach is practical for many real deployments, trading small latency and utility costs for a large reduction in agent misbehaviour.
Limitations are clear and acknowledged. CSAgent depends on LLMs for intent extraction and policy generation, so omissions or mistakes in the generated policies reduce coverage and require human review and iterative refinement. The evaluation focuses on single-agent scenarios and assumes a trusted OS and applications, so it does not address threats in model training pipelines or hostile infrastructure. Some GUI and web situations remain hard to automate fully and may need manual policy work.
Practical controls and trade-offs
If you run or plan to deploy computer-use agents, CSAgent points to a pragmatic path: move enforcement into the OS with static, auditable policies that reference runtime context. Start small, automate policy generation where it helps, and require expert review for high-risk functions. Expect a modest performance cost but a large reduction in risky actions.
Minimal viable checklist: draft an intent taxonomy for sensitive functions; generate initial policies automatically; require human sign-off for destructive operations; deploy the enforcement service in a controlled test environment. Good-better-best rollout options:
- Good: block dangerous calls by default and prompt for safe ones;
- Better: use context vectors to allow common workflows without prompts;
- Best: combine automated policy generation with periodic expert review and policy evolution feedback.
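The "Good" tier above amounts to a static deny-list with a prompt fallback, which can be sketched in a few lines. The function names and the `gate` helper are hypothetical, not part of CSAgent:

```python
# Hypothetical "Good"-tier gate: block known-dangerous calls outright,
# prompt the user before allowing anything else.
DANGEROUS = {"files.delete", "shell.exec", "payments.send"}

def gate(function, prompt_user):
    """prompt_user(function) -> bool; called only for non-dangerous functions."""
    if function in DANGEROUS:
        return False              # block by default, no prompt offered
    return prompt_user(function)  # ask before allowing the call
```

The "Better" and "Best" tiers replace the blanket block and the per-call prompt with context-vector checks, which is exactly the usability gain CSAgent is after.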
CSAgent is not a silver bullet, but it provides a clear, auditable layer that reduces dependence on brittle user prompting and runtime LLM validation. For teams wrestling with agents that must control real systems, its mix of static policy, context awareness and tool-assisted policy generation is worth evaluating. Keep manual review in the loop and treat policies as living artefacts.
Additional analysis of the original ArXiv paper
📋 Original Paper Title and Abstract
Secure and Efficient Access Control for Computer-Use Agents via Context Space
🔍 ShortSpan Analysis of the Paper
Problem
Large language model based computer use agents enable natural language control of system and application level functions, but their inherent uncertainty and vulnerabilities such as prompt injection and hallucinations create security risks when agents gain control of a computer. Since these agents operate with permissions close to those of users, actions may diverge from user intent, including irreversible operations. Existing mitigations such as user confirmations and LLM based dynamic action validation improve safety but degrade usability and performance. The paper addresses these challenges with CSAgent, a system level static policy based access control framework that constrains computer use agents using intent and context aware policies and provides an automated toolchain to help developers construct and refine them. CSAgent enforces policies through an optimised OS service, ensuring agent actions execute only under specific user intents and contexts, and it supports guarding agents across API, CLI and GUI interfaces.
Approach
CSAgent delivers a static policy based security framework that bridges static policy with dynamic user intent via intent aware context spaces, a per application hierarchical policy structure that indexes rules by function and intended use. It specifies a formal policy format and employs a CSAgent OS service to enforce policies at runtime. A development phase toolchain, including a context analyser, aids policy construction by generating context spaces and policies, while a runtime phase RPC based service loads the relevant context spaces, maintains a context vector and validates each function call before execution. If validation fails, the system can fall back to user prompts or contextual guidance to preserve task progress. For GUI apps, CSAgent extracts GUI event handlers and builds call graphs to create a semantic knowledge base for policy generation; for API and CLI apps, well defined specifications support policy generation. A policy evolution framework updates context spaces in response to app changes and runtime feedback. CSAgent is implemented in Python as a two component solution: a policy generation toolchain and an OS protection service, with context spaces stored in JSON. Performance optimisations include parallel context extraction, an optimised context manager that categorises contexts by update frequency into cold, warm and hot, and an LRU caching mechanism to reduce context space loading overhead. Policy retrieval maps user intents and functions to policies, and constraints are expressed as logical relations over context values with human readable guidance provided when constraints fail.
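The "constraints expressed as logical relations over context values" can be illustrated with a small recursive evaluator. The operator set and the nested-dictionary structure here are assumptions for the sketch; the paper's formal policy format is not reproduced.

```python
# Sketch of constraint evaluation as logical relations over context values.
# Operators and structure are illustrative, not the paper's formal format.
def eval_expr(expr, ctx):
    op = expr["op"]
    if op == "and":
        return all(eval_expr(e, ctx) for e in expr["args"])
    if op == "or":
        return any(eval_expr(e, ctx) for e in expr["args"])
    if op == "not":
        return not eval_expr(expr["arg"], ctx)
    if op == "eq":
        return ctx.get(expr["context"]) == expr["value"]
    if op == "in":
        return ctx.get(expr["context"]) in expr["value"]
    raise ValueError(f"unknown operator: {op}")

# Example rule: a send is allowed only under the "send_report" intent
# and only to an approved recipient domain (values are hypothetical).
rule = {"op": "and", "args": [
    {"op": "eq", "context": "intent", "value": "send_report"},
    {"op": "in", "context": "recipient_domain", "value": ["corp.example"]},
]}
```

Because rules like this are plain data, they can be stored in JSON alongside the context spaces and audited or diffed like any other configuration, which is the auditability argument the paper makes for static policies.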
Key Findings
- CSAgent defends against more than 99.36 per cent of attacks in the AgentDojo benchmark, while incurring 6.83 per cent average latency overhead and a 9.33 per cent decline in task utility.
- The LLM based context analyser identifies between 1.93 and 4.12 times more GUI elements than existing approaches, yielding a richer semantic knowledge base for automated policy generation; CSAgent also identifies 0.93 times as many GUI elements as AutoDroid and 3.12 times more than UI CTX in relevant comparisons.
- Context loading and policy enforcement are efficient: about 80 per cent of context spaces load within five seconds, with caching making retrieval overhead negligible; average policies per application and context counts vary across benchmarks, reflecting app complexity.
- Policy generation in development yields substantial automation, with policies stored in JSON and retrieved by intent and function, enabling consistent enforcement across different agent modalities and models.
Limitations
The framework relies on LLMs for intent extraction and policy generation, so inaccuracies or omissions in policy content can affect effectiveness; continuous policy refinement is necessary and manual expert review remains important. While static policies provide consistency and auditability, some real world GUI and web scenarios pose challenges for automated GUI policy generation, and the evaluation focuses on single agent use cases, leaving multi agent interactions for future work. Some benchmarks did not achieve full defence in a single analysis, requiring policy evolution feedback to close gaps. The approach also assumes trusted OS and apps and does not address LLM training or deployment infrastructure threats.
Why It Matters
CSAgent offers a practical OS level defence that preserves agent autonomy and efficiency while substantially reducing the risk of misbehaviour by AI powered computer use agents. By enforcing intent and context based access control through static policies and an automated policy toolchain, it reduces reliance on user prompts and dynamic runtime policy generation, improving usability and performance. The approach supports protecting agents across API, CLI and GUI interfaces, providing a unified, auditable and extensible framework for safer AI enabled automation with tangible security benefits for real systems.