
EM side-channel flags hijacked LLM agent workflows

Agents
Published: Fri, May 08, 2026 • By Clara Nyx
ClawGuard monitors electromagnetic noise from a host to spot hijacked Large Language Model (LLM) agent workflows, even if the OS lies. Using software-defined radios and a drift-aware pipeline, it reports AUC 0.9945 with 100% true positives and 1.16% false positives on 11,800 records. Clever, practical in niche setups, not a panacea.

LLM agents don’t just chat; they pull data, run tools, and act. That workflow can be hijacked: attackers insert, reorder, or swap skills while keeping the final answer plausible. If the host is compromised, your logs happily corroborate the lie. This paper asks a rude but useful question: can we check the workflow from outside the box?

ClawGuard does exactly that by watching the machine’s electromagnetic (EM) noise. Two nearby software-defined radios (SDRs) listen to the chassis while the agent runs. Each agent skill has a different mix of CPU, memory, I/O, and idle time, which produces a measurable, seconds-scale EM “envelope.” After a per-rig calibration to pick CPU- and memory-correlated frequency bands, the system treats those envelopes as physical evidence of what actually ran.

How it works

The pipeline is drift-aware. It slices each skill into overlapping fine windows inside a coarse envelope, then computes 320 features spanning spectral, temporal, and cross-receiver signals. It normalises per cycle, removes temperature drift, and does feature selection within the training fold. Ensemble classifiers score each window for both skill and attack state; those decisions roll up to a record-level verdict. The evaluation used HackRF One SDRs with temperature logging.
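The coarse-to-fine slicing, temperature detrending, and record-level roll-up can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the window sizes, the linear detrending model, and the majority-vote threshold are all assumptions made for the example.

```python
import numpy as np

def fine_windows(envelope, win=256, hop=64):
    """Slice a coarse skill envelope into overlapping fine windows
    (window and hop sizes here are placeholders, not the paper's values)."""
    return np.stack([envelope[i:i + win]
                     for i in range(0, len(envelope) - win + 1, hop)])

def detrend_temperature(features, temps):
    """Remove a linear temperature trend from each feature column,
    a stand-in for the paper's temperature-detrending step."""
    coeffs = np.polyfit(temps, features, deg=1)        # per-column slope, intercept
    trend = np.outer(temps, coeffs[0]) + coeffs[1]
    return features - trend

def record_verdict(window_scores, threshold=0.5):
    """Roll per-window attack scores up to one record-level verdict
    via a simple majority vote over flagged windows."""
    return float(np.mean(window_scores > threshold) > 0.5)
```

The point of the majority vote is that a short malicious burst only needs to dominate a few fine windows to flip the record, instead of being averaged away in whole-record features.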

On a 7.82 TB RF corpus with 12,232 records, the production split (11,800 records) hit ROC AUC 0.9945 and PR AUC 0.9305. The chosen operating point gives a 100% true-positive rate at a 1.16% false-positive rate. A replication run after measured carrier re-selection to an 80 MHz and 800 MHz pair yielded 83.6% sub-window accuracy, 88.3% record-vote accuracy, and 90.3% record-level attack recall on the surviving classes. Median post-feature inference latency was 18 ms (p99 29 ms), with batched inference around 0.15 ms per record. They include cost and data-rate notes for planning.
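Picking an operating point at 100% true-positive rate is mechanically simple: slide the threshold down until the lowest-scoring true attack is still flagged, then read off the false-positive rate you pay for it. A minimal sketch, assuming per-record attack scores and binary labels (the function name is ours, not the paper's):

```python
import numpy as np

def operating_point_full_recall(scores, labels):
    """Highest threshold that still flags every attack (TPR = 100%),
    plus the false-positive rate at that threshold.
    scores: attack score per record; labels: 1 = attack, 0 = benign."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    thr = scores[labels == 1].min()       # lowest-scoring true attack
    preds = scores >= thr
    fpr = float(preds[labels == 0].mean())
    return thr, fpr
```

Any lower threshold only adds false positives; any higher one misses an attack, which is why the paper reports FPR (1.16%) at the fixed 100%-TPR point rather than accuracy.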

Where it breaks

This is not a magic rootkit detector. It needs physical RF access and per-deployment calibration because useful bands are hardware- and environment-specific. Thermal and cycle drift are loud enough to drown class signals, which makes broad, open-set recognition brittle. The threat model excludes physical tampering and active jamming. An adaptive attacker could try to mimic benign envelopes, chop work into tiny bursts, or fiddle dynamic voltage and frequency scaling (DVFS) to move the harmonics.

So does it matter? Yes, in the narrow but important case where you fear host compromise and can control the physical environment. In a shared rack with RF noise and no proximity guarantee, this will creak. The team also sketches sequence-level edit-distance checks but stops short of scale testing. Still, as a forge-resistant, out-of-band integrity signal for LLM agent workflows, this is one of the few ideas that puts physical reality back in the loop—and the numbers say it’s more than a lab toy.

Additional analysis of the original ArXiv paper

📋 Original Paper Title and Abstract

ClawGuard: Out-of-Band Detection of LLM Agent Workflow Hijacking via EM Side Channel

Authors: Leo Linqian Gan, Jeffery Wu, Longyuan Ge, Lanqing Yang, Yonghao Song, Jingkai Zhang, Haojia Jin, Weiyi Wang, and Guangtao Xue
Autonomous LLM agents face a critical security risk known as workflow hijacking, where attackers subtly alter tool and skill invocations. Existing defenses rely on host-internal telemetry (such as audit logs), which can be forged if the host OS is compromised. To solve this, we introduce ClawGuard, a passive, out-of-band monitor that audits LLM-agent workflows using electromagnetic (EM) emanations. Because distinct agent skills create unique hardware usage patterns (computation, DRAM, network blocking), they emit measurable, macroscopic EM envelopes. External software-defined radios (SDRs) capture these physical signals. Using a drift-aware pipeline with 320-dimensional features, ClawGuard converts RF streams into physical evidence. Evaluated on a 7.82TB RF corpus, ClawGuard achieved an AUC of 0.9945, detecting attacks with a 100% true-positive rate and a 1.16% false-positive rate. This proves passive EM sensing is a practical, forge-resistant physical check against compromised host software.

🔍 ShortSpan Analysis of the Paper

Problem

The paper studies workflow hijacking in autonomous LLM agents, where an attacker inserts, omits, reorders or substitutes tool and skill invocations while preserving plausible high-level semantics. This is critical because conventional defences rely on host-internal telemetry such as audit logs or provenance graphs, which can be forged if the host OS or agent runtime is fully compromised. The work asks whether a passive, out-of-band physical channel can provide a forge-resistant integrity check of agent workflows.

Approach

ClawGuard is an out-of-band monitor that passively captures electromagnetic emanations from a target host using two external software-defined radios positioned close to the chassis. The system treats agent skills as seconds-scale compositional workloads whose mixtures of CPU, DRAM, I/O and idle intervals produce macroscopic EM envelopes. A measured carrier-selection calibration is performed per deployment to pick complementary CPU- and memory-correlated bands. A drift-aware coarse–fine pipeline extracts overlapping fine windows inside coarse skill envelopes and computes 320-dimensional spectral, temporal and cross-receiver features. Preprocessing includes cycle-local normalisation, temperature detrending and training-fold-only ANOVA feature selection. Inference uses ensemble classifiers to produce fine-window skill and attack-state evidence, which is aggregated into a record-level verdict. Evaluations use a large RF corpus collected with HackRF One SDRs and temperature logging.
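The "training-fold-only" qualifier on the ANOVA feature selection matters: scoring features on the full dataset before cross-validation leaks test information into the model. A numpy-only sketch of the leak-free pattern, with hypothetical function names (the paper's exact selection code is not public here):

```python
import numpy as np

def anova_f(X, y):
    """Per-feature one-way ANOVA F statistic (between-group vs
    within-group variance) for class labels y."""
    groups = [X[y == c] for c in np.unique(y)]
    grand = X.mean(axis=0)
    ssb = sum(len(g) * (g.mean(axis=0) - grand) ** 2 for g in groups)
    ssw = sum(((g - g.mean(axis=0)) ** 2).sum(axis=0) for g in groups)
    dfb, dfw = len(groups) - 1, len(X) - len(groups)
    return (ssb / dfb) / (ssw / dfw)

def select_top_k(X_train, y_train, X_test, k=64):
    """Rank features on the TRAINING fold only, then apply the same
    column subset to both folds, avoiding selection leakage."""
    idx = np.argsort(anova_f(X_train, y_train))[::-1][:k]
    return X_train[:, idx], X_test[:, idx]
```

Applied to the paper's setting, `X_train` would hold the 320-dimensional window features and `k` the retained subset size; the held-out fold never influences which bands or statistics survive.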

Key Findings

  • Large RF corpus: ClawGuard was evaluated on a 7.82 TB RF corpus with 12,232 records covering 16 benign skills and 22 attack skills, plus a separate replication corpus collected after carrier re-selection.
  • High detection performance: On the production split (11,800 records) ClawGuard achieved ROC AUC 0.9945 and PR AUC 0.9305, with an operating point giving 100% true-positive rate and 1.16% false-positive rate.
  • Cross-band replication: After measured carrier re-selection to an (80 MHz, 800 MHz) pair, the same pipeline reached 83.6% sub-window accuracy, 88.3% record-vote accuracy and 90.3% record-level attack recall on the surviving class subset.
  • Coarse–fine benefit: Decomposing records into fine windows substantially improved detection of short malicious payloads that would be diluted by whole-record features; one coarse–fine configuration reported record accuracies from 0.9252 to 0.9398 and attack recall around 0.83–0.86.
  • Operational practicality: Median post-feature inference latency was 18 ms (p99 29 ms); batched inference amortises to ~0.15 ms per record. System cost and data-rate estimates were provided for deployment planning.

Limitations

ClawGuard requires physical RF access and per-rig carrier calibration because informative bands are deployment-specific. Thermal and cycle drift are substantial and can dominate class information, limiting open-set multi-class recognition; large flat classifiers across many skills are fragile under cross-run shifts and small per-class sample sizes. The system assumes the defender controls the SDRs and policy channel; physical-layer attacks such as sensor tampering, active jamming or host-controlled emitters are out of scope. Sequence-level edit-distance verification was designed but not exhaustively evaluated at scale.
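The sequence-level check the authors sketch amounts to an edit distance between the policy-expected skill sequence and the sequence recovered from EM classification: insertions, omissions and substitutions each register as edits. A minimal Levenshtein sketch under that assumption (skill names below are invented for illustration):

```python
def edit_distance(expected, observed):
    """Levenshtein distance between the expected skill sequence and the
    one recovered from EM evidence; each insert/omit/substitute costs 1."""
    m, n = len(expected), len(observed)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                       # omissions only
    for j in range(n + 1):
        d[0][j] = j                       # insertions only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if expected[i - 1] == observed[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # omission
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[m][n]
```

A nonzero distance flags a hijacked workflow even when the final answer looks plausible; scaling this check to long, noisy recovered sequences is exactly the part the paper leaves unevaluated.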

Implications

As an offensive implication, adversaries who can compromise the host software can still be detected by an out-of-band EM monitor, but a motivated attacker could adapt to evade ClawGuard by crafting payloads that mimic benign EM envelopes, fragmenting operations into very short bursts, manipulating DVFS or governor settings to alter harmonics, or attempting physical jamming or sensor tampering. These adaptive strategies indicate that attackers might raise the bar but not trivially defeat a properly calibrated out-of-band monitor; ClawGuard therefore provides a hard-to-forge physical integrity signal that complements host telemetry.

