When Adding Skills Made Our DAST Agent Worse


Harish Kolla
12 min read

We added 8 structured skills to SecureVibes' DAST agent expecting better, more consistent results. Flag extraction dropped from 75% to 12.5%. This is the story of what went wrong, what we learned, and why we're moving away from skills for well-documented vulnerability classes.

6x Performance Drop · 8 DAST Skills Tested · Haiku: 0% → 100%

Context

How SecureVibes' DAST agent fits into the pipeline

SecureVibes runs a 5-agent pipeline where each phase builds on the previous one's output. The optional DAST agent (Phase 4) takes vulnerabilities discovered during static analysis and validates them by crafting real HTTP requests against a live application instance.

When the DAST agent was first built without any skills or constraints, it worked well but was dangerously aggressive: creating files, attempting direct database access, modifying application state. Without a sandbox, the team turned to prompt-level constraints to control this behavior. That decision is central to this case study.

[Figure: SecureVibes 5-phase pipeline — Assessment, Threat Modeling, Code Review, DAST, and Report phases]

What We Added

8 structured DAST skills covering common vulnerability classes

Skills are structured markdown files that provide Claude with domain-specific methodology at runtime. We've already seen skills work well for threat modeling because they help the agent detect agentic application patterns and apply OWASP ASI categories that it might not prioritize on its own. Based on that success, we built 8 DAST-specific skills expecting similar results.

The key difference: threat modeling skills inject knowledge the model doesn't have (organization-specific threat categories, agentic detection patterns). The DAST skills codified knowledge the model already has (SQL injection testing, XSS payloads, OWASP Top 10 methodology). That distinction turned out to matter enormously.

The 8 DAST skills covered SQL injection, XSS, authorization testing, command injection, XXE, SSRF, NoSQL injection, and general injection patterns (SSTI, LDAP, XPath, CRLF). Each was 250–460 lines of structured methodology with real payloads, database-specific notes, CWE mappings, and evidence capture formats.
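For context, each skill followed the standard Agent Skills layout: a SKILL.md file with YAML frontmatter followed by structured methodology sections. A heavily compressed, illustrative sketch (the real SecureVibes files ran 250–460 lines; the section names and payloads below are examples, not the actual content):

```markdown
---
name: sql-injection-testing
description: Validate SQL injection findings against a live target via HTTP
---

# SQL Injection Validation

## Payloads
- `' OR '1'='1` (boolean-based probe)
- `'; SELECT pg_sleep(5)--` (PostgreSQL time-based probe)

## Evidence capture
Record the full HTTP request/response pair; map the finding to CWE-89.
```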

Benchmarking Setup

Automated benchmark runner with planted flags

An automated benchmark runner tests the DAST agent against known-vulnerable applications with planted flags. Each benchmark runs three configurations:

  • Config 1 (Base SAST scan): no DAST; generates baseline findings
  • Config 2 (DAST with skills): validates findings with the 8 skills loaded
  • Config 3 (DAST without skills): skills removed; native capability only
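In outline, the runner boils down to scoring each configuration's report against the planted flags. A minimal sketch (the `PLANTED_FLAGS` mapping, flag strings, and report contents here are hypothetical, not the actual SecureVibes runner):

```python
# Sketch of a planted-flag scoring loop for the benchmark runner.
# All names and values below are illustrative, not the real code.

PLANTED_FLAGS = {"XBEN-078-24": "FLAG{sqli-78}", "XBEN-087-24": "FLAG{xss-87}"}

def flag_extracted(report: str, flag: str) -> bool:
    """A run counts as a success only if the exact flag appears in the report."""
    return flag in report

def score(reports: dict[str, str]) -> float:
    """Flag extraction rate across all benchmark targets."""
    hits = sum(flag_extracted(reports[t], f) for t, f in PLANTED_FLAGS.items())
    return hits / len(PLANTED_FLAGS)

# Example: one configuration extracted both flags, the other extracted neither.
with_skills = {"XBEN-078-24": "UNVALIDATED: no applicable skill",
               "XBEN-087-24": "UNVALIDATED: no applicable skill"}
without_skills = {"XBEN-078-24": "Validated. Evidence: ... FLAG{sqli-78}",
                  "XBEN-087-24": "Validated. Evidence: ... FLAG{xss-87}"}

print(score(with_skills), score(without_skills))  # 0.0 1.0
```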

Test Targets

  • XBEN-078-24 (Medium): SQL Injection
  • XBEN-087-24 (Hard): Cross-Site Scripting

Models tested: Sonnet and Haiku

Results: A 6x Performance Drop

Adding skills made the agent dramatically worse

| Metric | With Skills | Without Skills |
| --- | --- | --- |
| Flag extraction rate | 12.5% (1/8) | 75% (6/8) |
| Vulnerabilities tested | 75% | 100% |
| Evidence quality | Summary-level | Full HTTP request/response |

The most striking result was Haiku: 0% flag extraction with skills, 100% without.

The smallest model tested went from complete failure to perfect success simply by removing skills.

[Figure: Bar chart comparing flag extraction rates with and without skills across Sonnet and Haiku models]

Observed Behavioral Differences

With Skills Loaded

  • Agent loads the matching SKILL.md (~400 lines of methodology)
  • Follows the prescribed testing recipe step by step
  • Produces structured but summary-level evidence
  • Some vulnerabilities marked UNVALIDATED because "no matching skill"

Without Skills

  • Agent goes straight to crafting curl commands
  • Improvises payload selection based on target responses, noticing verbose error messages and immediately pivoting to exploit them
  • Creates ephemeral Python scripts in /tmp to compute expected values for blind injection timing
  • Chains multiple requests to build application state before triggering the actual vulnerability
  • Captures and includes full HTTP request/response pairs
  • Tests every vulnerability, adapting approach on the fly

The unconstrained agent was doing things no skill author would have thought to include in a methodology document. The skills became a ceiling on capability, not a floor.
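As one concrete example of that improvisation, a blind time-based SQL injection check only needs latency medians. A sketch of the kind of throwaway helper the agent wrote to /tmp (the threshold, payload, and latencies are illustrative):

```python
# Throwaway-style timing check for blind SQL injection validation.
# The sleep duration, tolerance, and sample latencies are illustrative.
import statistics

def looks_time_based(baseline_ms: list[float], payload_ms: list[float],
                     sleep_ms: float = 5000.0, tolerance: float = 0.5) -> bool:
    """Flag injection if the payload's median latency exceeds the baseline
    median by at least `tolerance` of the injected SLEEP duration."""
    delta = statistics.median(payload_ms) - statistics.median(baseline_ms)
    return delta >= sleep_ms * tolerance

baseline = [120.0, 135.0, 128.0]       # normal request latencies (ms)
with_sleep = [5140.0, 5188.0, 5160.0]  # latencies with a pg_sleep(5) payload

print(looks_time_based(baseline, with_sleep))  # True
```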

Root Cause Analysis

Three prompt-level patterns that reinforced each other

1. The Hard Gate (Primary Cause)

This was the gating trap, and in hindsight, it was almost inevitable. The moment you have skills, the natural instinct is to tie them to execution: “use the matching skill for each vulnerability type.” From there, it's a short slide to “only test when a skill exists.” It feels like good engineering (structured, predictable, auditable), but it turns the agent from a security tester into a skill execution engine.

dast.txt
### 2. Validation Eligibility (Hard Gate)
- You MUST only attempt validation when a matching skill exists
  and you can load its SKILL.md.
- If you cannot activate a relevant skill for a given vulnerability
  type/CWE, do NOT attempt ad-hoc validation.
- In that case, mark the item as UNVALIDATED with reason:
  "No applicable validation skill".

This turned skills from helpful reference material into a mandatory gateway. The agent's decision tree collapsed to:

Does a skill exist for this CWE?
→ Yes → Load skill, follow its recipe
→ No  → Mark UNVALIDATED, skip entirely

There was no path for “use your own security knowledge.” The model's native expertise was walled off behind a skill-existence check.
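In code terms, the gate collapses the agent's decision-making to a skill lookup. A schematic contrast of the two architectures (the `SKILLS` index and function names are illustrative, not the actual agent code):

```python
# Schematic contrast between the hard-gated and additive decision paths.
SKILLS = {"CWE-89": "sql-injection SKILL.md"}  # loaded skill index (illustrative)

def gated_validate(cwe: str) -> str:
    # The failure mode: no matching skill means no testing at all.
    if cwe not in SKILLS:
        return "UNVALIDATED: No applicable validation skill"
    return f"validated using {SKILLS[cwe]}"

def additive_validate(cwe: str) -> str:
    # Skills are optional reference; native expertise is always available.
    reference = SKILLS.get(cwe, "native security knowledge")
    return f"validated using {reference}"

print(gated_validate("CWE-918"))     # SSRF has no skill here: skipped entirely
print(additive_validate("CWE-918"))  # still tested with native knowledge
```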

[Figure: Decision tree comparing the gating approach (skills required) vs the additive approach (skills as optional reference)]

2. The Agent Self-Description

definitions.py
"dast": AgentDefinition(
    description="Validates vulnerabilities via HTTP testing ONLY when a
    matching Agent Skill is available; otherwise reports UNVALIDATED",
)

The agent's self-concept was defined as “skill executor” rather than “security tester.” This shaped every downstream decision. Without skills loaded, the description became incoherent, and the model reverted to what it actually is: a capable security testing agent.

3. Skill-Gated File Creation

Do NOT create arbitrary code files (e.g., *.py, *.sh) in the project unless explicitly instructed by a loaded SKILL.md.

This prevented the agent from creating ephemeral helper scripts unless a skill explicitly called for them. Without this constraint, the agent freely wrote temporary scripts to /tmp when needed, making it more effective at complex validations.

Why These Constraints Existed

These weren't arbitrary. Without a sandbox, the unconstrained DAST agent would:

  • Create and delete files across the project
  • Attempt direct database access
  • Modify application configuration
  • Discover and test vulnerabilities beyond the original findings list

Prompt constraints were the safety mechanism available at the time. They achieved safety at the direct cost of capability.

Think of it like hiring a pentester: giving them a checklist and saying “only test what's on this list, don't use tools that aren't approved” produces surface-level findings. Putting them in an isolated test environment and saying “find everything you can” produces a comprehensive report. The first pentester is constrained by rules they have to keep in mind while working. The second is constrained by the environment but mentally free. Same with agents.

Why Haiku Showed the Largest Swing

Haiku has a smaller effective working context and less capacity for nuanced instruction interpretation. With skills loaded, the hard gate rules plus ~400 lines of methodology consumed a significant fraction of that context. Without skills, the context was entirely available for actual testing. For smaller models, every token of constraint overhead is a token not available for real work.
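A rough back-of-envelope makes the overhead concrete; the tokens-per-line figure and the effective working budget below are assumptions for illustration, not measured values:

```python
# Back-of-envelope context arithmetic for the constraint overhead.
# TOKENS_PER_LINE and EFFECTIVE_BUDGET are assumptions, not measurements.
SKILL_LINES = 400             # one loaded SKILL.md
TOKENS_PER_LINE = 12          # rough average for dense markdown methodology
GATE_RULE_TOKENS = 300        # hard-gate instructions in the system prompt
EFFECTIVE_BUDGET = 20_000     # tokens the model keeps "in working focus"

overhead = SKILL_LINES * TOKENS_PER_LINE + GATE_RULE_TOKENS
print(f"{overhead} tokens of constraint overhead "
      f"({overhead / EFFECTIVE_BUDGET:.0%} of the working budget)")
```

Under these assumptions roughly a quarter of the model's working budget goes to constraints before any testing begins.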

The Decision: Moving Away from Skills

For well-documented vulnerability classes, modern LLMs don't need skills

State-of-the-art models like Claude Sonnet and even Haiku have extensive security testing knowledge from training data. The OWASP Top 10, common injection patterns, XSS payloads, authorization bypass techniques—all of this is thoroughly covered in the model's training corpus. Skills that codify this same knowledge add marginal value at best, and as the benchmarks showed, can actively degrade performance when the surrounding prompt architecture turns them into constraints.

The question to ask before writing any skill: “If the agent had no skill file, would it know how to do this?” For SQL injection testing, the answer was obviously yes. For detecting agentic application patterns and applying OWASP ASI categories, the answer is no—and that's where skills earn their keep.

What Changed

4 files, 16 insertions, 19 deletions.

DAST prompt change
# Before
### 2. Validation Eligibility (Hard Gate)
- You MUST only attempt validation when a matching skill exists

# After
### 2. Validate Each Vulnerability
- For each vulnerability, craft and execute HTTP-based tests
- Use your security testing expertise to select appropriate payloads
- If skills are available in .claude/skills/dast/, you may consult them
  as reference methodologies, but you are not limited to them

Agent description change
# Before
description="Validates vulnerabilities via HTTP testing ONLY when a matching
Agent Skill is available; otherwise reports UNVALIDATED"

# After
description="Validates vulnerabilities via HTTP-based dynamic testing against
the live application, using security expertise and available skills as reference"

File writing rules change
# Before
- Do NOT create arbitrary code files unless explicitly instructed
  by a loaded SKILL.md

# After
- Do NOT create code files in the project directory. If you need helper
  scripts, write them to /tmp and delete them after use.
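The new rule's pattern (write a helper to the system temp directory, run it, delete it after use) looks like this in outline; `run_ephemeral` is a hypothetical helper for illustration, not SecureVibes code:

```python
# Sketch of the ephemeral-helper pattern the new file-writing rule allows:
# write a helper script to the system temp directory (/tmp on Linux),
# execute it, and remove it afterwards.
import os
import subprocess
import sys
import tempfile

def run_ephemeral(script_body: str) -> str:
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(script_body)
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True)
        return result.stdout
    finally:
        os.remove(path)  # delete after use, per the file-writing rule

print(run_ephemeral("print(41 + 1)"))  # prints 42
```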

All existing safety mechanisms (HTTP-only testing, database tool blocking via hooks, output format validation, file write restrictions, production URL detection) remained intact. The safety boundary stayed; only the capability constraints were removed.
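The spirit of hook-based blocking can be shown with a plain pre-execution predicate. This mirrors the idea behind those hooks, not the Claude Agent SDK's actual hook API, and the blocked-binary list is illustrative:

```python
# Environment-layer blocking as a pre-execution check: reject database
# client invocations while leaving HTTP tooling untouched. Illustrative
# of the hook idea, not the Claude Agent SDK's real hook interface.
import shlex

BLOCKED_BINARIES = {"psql", "mysql", "sqlite3", "mongosh", "redis-cli"}

def allow_command(command: str) -> bool:
    argv = shlex.split(command)
    return bool(argv) and argv[0] not in BLOCKED_BINARIES

print(allow_command("curl -s http://target:8080/login"))  # True
print(allow_command("psql -h target -U app appdb"))       # False
```

Because the check lives outside the prompt, it consumes no context and closes no legitimate decision path.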

When Skills Still Make Sense

This isn't a blanket argument against skills

| Skill type | Value | Why |
| --- | --- | --- |
| Domain-specific knowledge | High | Internal auth patterns, proprietary APIs, custom protocols; the model can't learn this from training |
| Organization-specific methodology | High | Your compliance requirements, your naming conventions, your risk scoring rubric |
| Emerging vulnerability classes | Medium | Recent CVEs or attack patterns that may not be well-represented in training data |
| Well-documented OWASP Top 10 | Low | The model already knows SQLi, XSS, IDOR, command injection thoroughly |
| Standard security testing methodology | Low | PTES, OWASP Testing Guide; the model already knows these |

The question to ask before creating a skill: “Does this skill contain knowledge the model genuinely doesn't have?” If the answer is no, the skill adds overhead without proportional value.

Recommendations

For teams building agentic systems with the Claude Agent SDK

1. Benchmark Before and After

Run identical tests with and without skills. If performance doesn’t improve, the skills aren’t helping. If it degrades, there’s a prompt architecture problem.

2. Never Gate Agency on Skill Existence

Skills should be additive reference material, never a prerequisite for action.

# Do this
Skills may be available as reference. Use your expertise for all tasks.

# Not this
You MUST only attempt tasks when a matching skill exists.

3. Watch the Agent’s Self-Description

How you describe the agent to itself shapes its behavior. "Security tester with optional references" produces fundamentally different behavior than "skill executor."

4. Account for Model Size

Smaller models are disproportionately affected by prompt overhead. If skills consume 400 lines of context in a model with limited working memory, the net effect may be negative even if the skill content is high quality.

5. Separate Safety from Capability

Prompt-level constraints that prevent dangerous actions also prevent effective actions. Move safety to the environment layer (sandboxes, hooks, tool restrictions) and keep prompts focused on task definition and quality standards.

What's Next: Sandbox-First Safety

Moving safety from prompts to the environment

The deeper lesson from this work extends beyond skills. Constraining agent behavior through prompts—telling the agent what not to do—is fundamentally at odds with agent capability. Every constraint in the prompt is context consumed, a decision path closed, a creative solution blocked.

The alternative is environment-level safety: run the agent in a sandbox where it can try anything, and let the environment prevent harm. The agent doesn't even know it's constrained. It gets full access to its own capability while the sandbox handles the safety boundary.

SecureVibes is moving toward this architecture with Docker-based isolation for the DAST phase where the agent can operate freely within a controlled environment.
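As a sketch of what environment-level safety can look like, here is an illustrative hardened `docker run` invocation assembled in Python; the image name, network name, and flag set are placeholders, not SecureVibes' actual configuration:

```python
# Illustrative hardened `docker run` argv for a sandboxed DAST phase.
# Image and network names are placeholders; the flag set is a sketch.
def sandbox_argv(image: str = "securevibes-dast:latest",
                 network: str = "dast-testnet") -> list[str]:
    return [
        "docker", "run", "--rm",
        "--network", network,   # reach only the target app, not the internet
        "--read-only",          # immutable root filesystem
        "--tmpfs", "/tmp",      # ephemeral scratch space for helper scripts
        "--cap-drop", "ALL",    # drop all Linux capabilities
        "--pids-limit", "256",  # bound process count
        image,
    ]

print(" ".join(sandbox_argv()))
```

Inside a container like this, the agent can write files, spawn helpers, and probe freely; the environment, not the prompt, enforces the boundary.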

Skills should guide LLMs, not restrict them.

For domains where the model already has deep knowledge, even the guidance adds little.

SecureVibes is an open-source AI-powered application security scanner built on the Claude Agent SDK.