Attacks

June 2026

Fri, Jun 05, 2026 • By Rowan Vale

Feature-vocoder adversarial attack breaks black-box ASR

New research targets self-supervised features, not raw waveforms, to craft adversarial audio that transfers across automatic speech recognition systems and evades waveform-focused defences. Built on a public Whisper-small surrogate, the attack boosts black-box WER by 26.6 points on average, stays effective under adversarial training and input purification, and survives over-the-air playback.

June 2026

Feature-vocoder adversarial attack breaks black-box ASR

Posterior Attack Turns LLM Safety Checks Against Them

SlotGCG finds jailbreak sweet spots inside LLM prompts

AI pushes attacks deeper; ATT&CK misses agentic risk

ImageAuditor breaks image RAG with membership inference

Latent Geometric Chords slip past robust vision models

May 2026

Bandit-guided style tweaks game LLM judge scores

CodecAttack slips past audio LLM compression defences

Edge AI accelerators misused as confused deputies

Cross-Modal Backdoors Exploit Multimodal LLM Connectors

Finite-width attack cleanly reconstructs training data

Misrouter steers MoE LLMs into unsafe outputs

Adversarial images hijack VLMs and launder authority

April 2026

RLVR jailbreaks fool safety checks while SFT drifts

Jailbreak Seeds Harmful Reasoning While Answers Stay Clean

Verifiable Gradient Inversion Breaks Federated Tabular Privacy

Mosaic breaks closed VLMs with multi-view ensembles

Hidden visual prompts steer multimodal LLMs

March 2026

Survey maps multimodal LLM attacks to operational reality

Autoresearch agent finds stronger LLM jailbreak attacks

Comics jailbreak multimodal LLMs at high success rates

EvoJail automates long-tail LLM jailbreaks at scale

Structured visuals quietly jailbreak LVLMs with slot filling

Scores Drive Voice Impersonation on Speaker Recognition

February 2026

Adversarial images hijack LVLMs after long chats

Prefill attacks bypass safeguards in open-weight LLMs

Adversarial tweaks mislead binary code similarity detectors

Contrastive Continual Learning Enables Persistent IoT Backdoors

Training rewards teach models to exploit flaws

MoE models vulnerable to expert silencing attack

Confundo Crafts Robust Poisons for RAG Systems

Single prompt strips safety from LLMs with GRPO

Chat templates enable training-free backdoor attacks

Researchers expose inference-time backdoors in chat templates

Narrative Speech Evades Audio-Language Model Safeguards

January 2026

Persuasive LLM Rewrites Break Automated Fact-Checkers

November 2025

Researchers Expose KV-Cache Trojan Flipping Single Bit

Game-theory jailbreaks expose LLM safety gaps

Poetry Jails Most LLMs in Single Prompt

VEIL Exploits Text-to-Video Models' Hidden Cues

Linguistic Styles Expose New AI Jailbreak Vector

Subtle Word Changes Break LLM Math Reasoning

Reverse-engineering LLM guardrails at low cost

Attackers Break Malware Analysis by Flooding Telemetry

Prompt Injections Hijack AI Paper Reviews

October 2025

Fine-Grained Compute Boosts Adversarial Attack Power

Enhanced Attacks Expose Multimodal LLM Safety Gaps

Benign Reasoning Training Enables Models to Bypass Safety

Study Exposes Multimodal AI Jailbreaks with Simple Tricks

On-device LLMs enable stealthy living-off-the-land attacks

Researchers Expose Simple Ways to Bypass LRM Guardrails

Adaptive Attacks Routinely Bypass Modern LLM Defences

Small poisoned sets can hijack large LLMs

Pruning Unmasks Malicious LLMs in Deployment

Invisible Unicode Steers LLMs into Jailbreaks

Untargeted Jailbreak Attacks Expose LLM Safety Gaps

Attackers Bypass Prompt Guards in Production AI

Single-Bit Flips Break LLM Behaviour in Seconds

Researchers Bypass LLM Fingerprints While Preserving Utility

September 2025

Adversarial Noise Hijacks Speech Enhancement Outputs

New RL method injects stealthy jailbreaks into LLMs

Researchers expose stealthy AI-IDE configuration attacks

LLMs Mislead XR Devices in New Study

Humanoid robots leak data and enable cyber attacks

Lightweight pipeline clones voices and syncs lips

Iterative LLM jailbreaks produce executable attack code

Intelligent adversary outsmarts robot patrols in tests

NeuroStrike exposes neuron-level alignment failures in LLMs

Researchers Expose How Embedded Prompts Manipulate Reviews

Simple Prompt Injections Hijack LLM Scientific Reviews