
Open LLM RedSage Bolsters Local Cybersecurity Assistants

Published: Fri, Jan 30, 2026 • By Theo Solander
RedSage is an open, locally deployable Large Language Model (LLM) trained on cybersecurity data and simulated expert workflows. At the 8B scale it measurably improves benchmark performance. The release promises practical defensive assistance but highlights dual-use, data leakage and poisoning risks and calls for strict safety, provenance and access controls.

RedSage is an open, locally deployable Large Language Model (LLM) purpose-built for cybersecurity tasks. The project combines large-scale domain filtering with targeted, agentic augmentation: roughly 11.8 billion tokens of security-focused pretraining data, a curated seed of about 28.6 thousand documents, and 266 thousand multi-turn simulated conversations for supervised fine-tuning. The team also ships a dedicated evaluation suite, RedSage-Bench, that pairs 30 thousand multiple-choice items with 240 open questions so people can measure knowledge, practical skills and tool familiarity in one place.

What the work shows

At the 8B parameter scale the RedSage variants outperform baseline open models by measurable margins, improving cybersecurity benchmarks by up to 5.59 points and general open LLM tasks by about 5.05 points. A variant aligned with Direct Preference Optimisation (DPO) shows stronger free-form performance than its unaligned sibling. The pipeline starts from a Qwen3 8B base, continues pretraining on CyberFineWeb, and then applies supervised fine-tuning with the augmented conversations. The artefact set, including models and data, is public and designed to run on consumer-grade GPUs for local, privacy-preserving deployment.
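The DPO alignment step mentioned above optimises a preference objective rather than plain next-token loss. A minimal sketch of that objective for a single preference pair (illustrative only; function and argument names are our own, not the paper's implementation):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimisation loss for one preference pair.

    logp_* are summed token log-probabilities of the chosen and
    rejected responses under the policy and a frozen reference model.
    beta scales the implicit reward; 0.1 is a common default.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response than the reference model does.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # Negative log-sigmoid of the scaled margin; minimised during training.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

A positive margin (the policy favours the chosen answer more than the reference does) drives the loss towards zero, which is why DPO-aligned variants tend to follow preferred response styles more closely in free-form evaluation.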

Where the gains matter in practice

Domain-aware pretraining plus workflow simulations delivers what many teams actually want: better answers about frameworks, tool usage and common offensive techniques, and text that follows security playbooks more reliably than a generalist model. That helps with triage, report drafting and scripted simulations. It is not a magic wand — RedSage improves accuracy and instruction following, but does not remove the need for human verification, especially when outputs drive automated actions.

Risks and a very practical to-do list

The same specificity that helps defenders also helps attackers. The paper calls out dual-use concerns, possible data leakage from training sources, susceptibility to prompt injection, and the risk of poisoned or biased public data seeping in through open pipelines. Those are not abstract worries. They translate to real operational hazards if an organisation allows an assistant to execute commands, expose sensitive logs, or accept unvetted prompts.

Teams thinking of deploying RedSage or similar agents should treat them like any other security service: restrict capabilities, instrument heavily, and assume compromise is possible. Practical measures include sandboxed tool execution with strict whitelists, runtime monitoring and safe-fail behaviours that stop rather than act on ambiguous requests, provenance controls and content filtering for training and inference data, thorough red-teaming focused on jailbreaks and adversarial prompts, and tight access controls for local deployments. Log everything relevant and ensure you can revoke network or tool privileges quickly.
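The whitelist-plus-safe-fail pattern above can be sketched in a few lines. This is a hypothetical gate, not part of RedSage; the tool list and metacharacter check are illustrative assumptions, and a production gate would sandbox execution as well:

```python
import shlex

# Hypothetical whitelist of tools an assistant may invoke; anything
# else fails safe by refusing rather than acting.
ALLOWED_TOOLS = {"nslookup", "whois", "ping"}

def vet_command(command: str) -> bool:
    """Return True only if the command's binary is whitelisted and the
    arguments contain no shell metacharacters (safe-fail on doubt)."""
    try:
        parts = shlex.split(command)
    except ValueError:
        return False          # unparseable input: refuse
    if not parts or parts[0] not in ALLOWED_TOOLS:
        return False          # unknown tool: refuse
    # Reject argument-level injection attempts outright.
    return not any(ch in command for ch in ";|&`$><")
```

The key design choice is the default: an ambiguous or unrecognised request is refused, never "probably fine", which is exactly the safe-fail behaviour the deployment advice calls for.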

In short, RedSage is a useful step toward practical local cybersecurity assistants, but it also crystallises a recurring pattern from past technology cycles: domain tuning accelerates usefulness and risk in equal measure. The sensible response is not to avoid these tools, but to deploy them with engineering discipline, layered controls and continuous adversarial testing.

Additional analysis of the original arXiv paper

📋 Original Paper Title and Abstract

RedSage: A Cybersecurity Generalist LLM

Authors: Naufal Suryanto, Muzammal Naseer, Pengfei Li, Syed Talal Wasim, Jinhui Yi, Juergen Gall, Paolo Ceravolo, and Ernesto Damiani
Cybersecurity operations demand assistant LLMs that support diverse workflows without exposing sensitive data. Existing solutions either rely on proprietary APIs with privacy risks or on open models lacking domain adaptation. To bridge this gap, we curate 11.8B tokens of cybersecurity-focused continual pretraining data via large-scale web filtering and manual collection of high-quality resources, spanning 28.6K documents across frameworks, offensive techniques, and security tools. Building on this, we design an agentic augmentation pipeline that simulates expert workflows to generate 266K multi-turn cybersecurity samples for supervised fine-tuning. Combined with general open-source LLM data, these resources enable the training of RedSage, an open-source, locally deployable cybersecurity assistant with domain-aware pretraining and post-training. To rigorously evaluate the models, we introduce RedSage-Bench, a benchmark with 30K multiple-choice and 240 open-ended Q&A items covering cybersecurity knowledge, skills, and tool expertise. RedSage is further evaluated on established cybersecurity benchmarks (e.g., CTI-Bench, CyberMetric, SECURE) and general LLM benchmarks to assess broader generalization. At the 8B scale, RedSage achieves consistently better results, surpassing the baseline models by up to +5.59 points on cybersecurity benchmarks and +5.05 points on Open LLM Leaderboard tasks. These findings demonstrate that domain-aware agentic augmentation and pre/post-training can not only enhance cybersecurity-specific expertise but also help to improve general reasoning and instruction-following. All models, datasets, and code are publicly available.

🔍 ShortSpan Analysis of the Paper

Problem

Cybersecurity operations require assistant LLMs that can support diverse workflows while protecting sensitive data. Existing options rely on proprietary cloud APIs with privacy risks or open models lacking domain adaptation. There is a need for an open, locally deployable cybersecurity assistant trained on security workflows and tools, complemented by a dedicated benchmark to evaluate domain knowledge, practical skills and tool proficiency.

Approach

RedSage addresses this with a data-centred pipeline that begins with large-scale cybersecurity filtering of FineWeb to create CyberFineWeb, a globally deduplicated corpus of about 11.7 billion tokens across 13 million documents. It combines CyberFineWeb with a curated seed set, RedSage Seed, containing 28,637 samples across knowledge, skills and tools (around 0.15 billion tokens), plus RedSage Dump, some 459,000 documents and about 0.7 billion tokens that broaden coverage. Agentic augmentation then converts the Seed into 266,000 multi-turn conversations totalling around 352 million tokens, forming RedSage Conv for supervised fine-tuning. General instruction data from SmolLM3 and other sources supplements the domain data to support broad instruction following.

Training uses the Axolotl framework, starting from the Qwen3 8B base with continued pretraining on CyberFineWeb followed by adaptation on RedSage Seed and Dump, then supervised fine-tuning on RedSage Conv together with SmolTalk2 data, and finally Direct Preference Optimisation alignment using the Tulu3 8B mixture. The final release includes RedSage 8B variants with and without alignment. A dedicated benchmark, RedSage-Bench, combines 30,000 multiple-choice items and 240 open-ended questions spanning knowledge, skills and tool use, and the models are evaluated against established cybersecurity benchmarks and general open LLM benchmarks to assess broad generalisation.
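The filter-then-deduplicate shape of the corpus construction can be sketched as below. This is a deliberately crude stand-in: the keyword heuristic and exact-hash deduplication are our own illustrative assumptions, whereas the actual CyberFineWeb pipeline uses trained filters and large-scale global deduplication:

```python
import hashlib

# Hypothetical stand-in for the web-filtering stage: a keyword
# heuristic plus exact-hash global deduplication. Shows the shape
# of the pipeline only, not the paper's actual classifiers.
SECURITY_TERMS = {"cve", "exploit", "malware", "firewall", "mitre"}

def filter_and_dedup(documents):
    seen, kept = set(), []
    for doc in documents:
        text = doc.lower()
        if not any(term in text for term in SECURITY_TERMS):
            continue                      # drop non-security documents
        digest = hashlib.sha256(text.encode()).hexdigest()
        if digest in seen:
            continue                      # drop exact duplicates globally
        seen.add(digest)
        kept.append(doc)
    return kept
```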

Key Findings

  • At the 8B scale RedSage consistently outperforms baseline models, surpassing them by up to 5.59 points on cybersecurity benchmarks and 5.05 points on Open LLM Leaderboard tasks. The variant aligned with Direct Preference Optimisation, RedSage 8B DPO, outperforms its non-aligned counterpart by notable margins in free-form evaluation.
  • RedSage-Bench extends the evaluation landscape by jointly assessing knowledge, skills and tool proficiency with 30,000 MCQs and 240 open-ended questions; results indicate high accuracy on the MCQs and strong quality in open-ended responses across cybersecurity knowledge, practical skills and tool usage.
  • The combination of domain-specific pretraining on CyberFineWeb and seed-based agentic augmentation yields complementary strengths, enabling both robust cybersecurity reasoning and general instruction following. RedSage transfers to larger models and supports local, privacy-preserving deployment on consumer-grade GPUs while maintaining competitive performance on general benchmarks.
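Scoring the multiple-choice portion of a benchmark like RedSage-Bench reduces to exact-match accuracy over the gold answer letters. A minimal sketch, assuming a hypothetical item schema of our own devising (the paper does not specify its data format):

```python
def mcq_accuracy(items, predict):
    """Exact-match accuracy over multiple-choice items.

    Each item is a dict with 'question', 'choices', and the gold
    'answer' letter; predict() is the model under evaluation.
    (Hypothetical item schema, for illustration only.)
    """
    if not items:
        return 0.0
    correct = sum(
        1 for item in items
        if predict(item["question"], item["choices"]) == item["answer"]
    )
    return correct / len(items)
```

The open-ended portion of such a benchmark cannot be scored this way and typically needs human or LLM-based judging, which is why pairing both item types in one suite gives a fuller picture.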

Limitations

Potential biases and inaccuracies may propagate through the data curation and augmentation pipeline despite screening. The dual-use nature of cybersecurity knowledge raises risks of misuse, including prompt injection, data leakage from training sources, and data or model poisoning given the reliance on open-source materials and pipelines. The authors acknowledge copyright considerations for some seed and dump materials and note that public releases will exclude copyrighted content unless permissions are obtained. Safe deployment requires robust safety rails, red-teaming, provenance controls, access restrictions for local deployments and runtime monitoring with fail-safes.

Why It Matters

The project delivers an open, locally deployable cybersecurity assistant tailored to security workflows and tools, accompanied by a comprehensive benchmark that jointly assesses knowledge, skills and tool proficiency. Practically, this supports defensive analysis, incident response and automated security reasoning, while also permitting attacker simulations in controlled research settings. The dual-use nature of the technology necessitates careful safety engineering and governance in real-world deployment, particularly in critical-infrastructure contexts. The work demonstrates that domain-aware pretraining and agentic augmentation can improve cybersecurity expertise while extending general reasoning, with open data and models designed to foster reproducibility and community benefit.

