ShortSpan.ai logo

Mon, Sep 15, 2025

Browse our complete archive of AI security news and analysis.

Filter by category:

Archive

Articles older than one month, grouped by month.

August 2025

New Defense Exposes Flaws in LLM Tool Chains Defenses
Thu, Aug 14, 2025 • By Clara Nyx

New Defense Exposes Flaws in LLM Tool Chains

A new defense framework, MCP-Guard, defends LLMs that call external tools from prompt injection and data leaks. The paper introduces a three-stage pipeline and a 70,448-sample benchmark. It reports a 96.01% detector accuracy and an overall 89.63% pipeline accuracy, promising practical protection for real deployments.

AI Fingerprinting Advances Force Practical Defenses Defenses
Tue, Aug 12, 2025 • By James Armitage

AI Fingerprinting Advances Force Practical Defenses

New research shows automated methods can identify which LLM produced text with high accuracy using only a handful of targeted queries. The study also demonstrates a practical semantic-preserving filter that drastically reduces fingerprinting success while keeping meaning. This raises immediate privacy risks and offers a usable mitigation for deployed systems.

Researchers Expose Few-Query Attacks on Multi-Task AI Attacks
Mon, Aug 11, 2025 • By Elise Veyron

Researchers Expose Few-Query Attacks on Multi-Task AI

New research shows practical black-box attacks that use only a few dozen to a few hundred queries to fool multi-task AI services. The method transfers adversarial text across tasks like translation, summarization, and image generation, affecting commercial APIs and large models. This raises urgent operational risks for public-facing AI systems and content pipelines.

Thinking Mode Raises Jailbreak Risk, Fixable Fast Attacks
Mon, Aug 11, 2025 • By Lydia Stratus

Thinking Mode Raises Jailbreak Risk, Fixable Fast

New research finds that enabling chain-of-thought "thinking mode" in LLMs increases jailbreak success, letting attackers coax harmful outputs. The paper shows longer internal reasoning and educational-style justifications make models vulnerable, and introduces a lightweight "safe thinking intervention" that meaningfully reduces risk in real deployments.

Reinforcement Learning Improves Autonomous Pentest Success Pentesting
Mon, Aug 11, 2025 • By Rowan Vale

Reinforcement Learning Improves Autonomous Pentest Success

New Pentest-R1 shows that combining offline expert walkthroughs with online interactive training helps smaller AI agents perform real multi-step penetration tests. The system raises success rates and cuts token use, but absolute performance stays modest. This matters for defenders who want automated, repeatable tests and for risk managers worried about misuse.

Secure Your Code, Fast: Introducing Automated Security Reviews with Claude Code Enterprise
Thu, Aug 07, 2025 • By Dave Jones

Secure Your Code, Fast: Introducing Automated Security Reviews with Claude Code

This article explores Anthropic’s Claude Code, an AI-driven tool designed to automate security code reviews. Authored by Anthropic researchers, Claude Code highlights the potential for AI to augment security workflows by identifying vulnerabilities quickly and consistently. The discussion balances its practical benefits against inherent risks such as over-reliance and false positives, providing security pros with actionable insights for safe AI integration.

Program Analysis Stops Prompt Injection in AI Agents Defenses
Mon, Aug 04, 2025 • By Dr. Marcus Halden

Program Analysis Stops Prompt Injection in AI Agents

AgentArmor treats an AI agent's runtime trace like a small program, analyzing data and tool calls to spot prompt injection. Tests show strong detection with high true positives and low false alarms, cutting attack success dramatically. Practical limits include dependency errors and extra runtime cost before enterprise deployment.

Researchers Outsmart LLM Guards with Word Puzzles Attacks
Mon, Aug 04, 2025 • By Adrian Calder

Researchers Outsmart LLM Guards with Word Puzzles

New research shows a simple trick, turning harmful prompts into familiar word puzzles, lets attackers bypass modern LLM safety filters. The method, PUZZLED, masks keywords as anagrams, crosswords or word searches and achieves high success across top models, exposing a practical weakness in reasoning-based defenses that organizations must address.

New Cybersecurity LLM Promises Power, Raises Risks Enterprise
Fri, Aug 01, 2025 • By James Armitage

New Cybersecurity LLM Promises Power, Raises Risks

A new instruction-tuned cybersecurity LLM, Foundation-Sec-8B-Instruct, is publicly released and claims to outperform Llama 3.1 and rival GPT-4o-mini on threat tasks. It promises faster incident triage and smarter analyst assistance, but limited transparency on training data and safeguards raises real-world safety and misuse concerns for defenders.

LLMs Automate Penetration Tasks, Exposing Infra Weaknesses Pentesting
Fri, Aug 01, 2025 • By Lydia Stratus

LLMs Automate Penetration Tasks, Exposing Infra Weaknesses

New research shows a modern LLM can autonomously solve most beginner capture-the-flag tasks, finding files, decoding data, and issuing network commands with human-speed accuracy. That success lowers the skills barrier for attackers and exposes specific infrastructure gaps. Operators must apply practical hardening to endpoints, GPUs, vector stores, secrets and data paths now.

July 2025

June 2025

May 2025

April 2025

March 2025

February 2025

January 2025

November 2024