When Adding Skills Made Our DAST Agent Worse
We added 8 structured skills to SecureVibes' DAST agent expecting better results. Flag extraction dropped from 75% to 12.5%. Here's what went wrong.

We expected that adding structured skills to the DAST agent would make it better and more consistent. Instead, flag extraction collapsed from 75% to 12.5%. This is the story of what went wrong, what we learned, and why we're moving away from skills for well-documented vulnerability classes.
Context
How SecureVibes' DAST agent fits into the pipeline
SecureVibes runs a 5-agent pipeline where each phase builds on the previous one's output. The optional DAST agent (Phase 4) takes vulnerabilities discovered during static analysis and validates them by crafting real HTTP requests against a live application instance.
When the DAST agent was first built without any skills or constraints, it worked well but was dangerously aggressive: creating files, attempting direct database access, modifying application state. Without a sandbox, the team turned to prompt-level constraints to control this behavior. That decision is central to this case study.

What We Added
8 structured DAST skills covering common vulnerability classes
Skills are structured markdown files that provide Claude with domain-specific methodology at runtime. We've already seen skills work well for threat modeling because they help the agent detect agentic application patterns and apply OWASP ASI categories that it might not prioritize on its own. Based on that success, we built 8 DAST-specific skills expecting similar results.
The key difference: threat modeling skills inject knowledge the model doesn't have (organization-specific threat categories, agentic detection patterns). The DAST skills codified knowledge the model already has (SQL injection testing, XSS payloads, OWASP Top 10 methodology). That distinction turned out to matter enormously.
The 8 DAST skills covered SQL injection, XSS, authorization testing, command injection, XXE, SSRF, NoSQL injection, and general injection patterns (SSTI, LDAP, XPath, CRLF). Each was 250–460 lines of structured methodology with real payloads, database-specific notes, CWE mappings, and evidence capture formats.
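To make "structured methodology" concrete, a DAST skill file followed roughly this shape. This is a condensed, hypothetical excerpt; the section names and payloads are illustrative, not copied from the actual skills:

```markdown
# SQL Injection Validation (SKILL.md)

## When to use
Findings mapped to CWE-89 against a live HTTP target.

## Methodology
1. Confirm the injection point with a benign probe (`' OR '1'='1`).
2. Fingerprint the database from error messages or behavioral differences.
3. Escalate to a database-specific payload (e.g. `SLEEP(5)` on MySQL).

## Evidence capture
Record the full request, the response status, and timing deltas.
```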
Benchmarking Setup
Automated benchmark runner with planted flags
An automated benchmark runner tests the DAST agent against known-vulnerable applications with planted flags. Each benchmark runs three configurations:
- **Base SAST scan**: no DAST; generates baseline findings
- **DAST with skills**: validates findings with skills loaded
- **DAST without skills**: skills removed; native capability only
Test Targets

| Benchmark | Difficulty | Vulnerability |
|---|---|---|
| XBEN-078-24 | Medium | SQL Injection |
| XBEN-087-24 | Hard | Cross-Site Scripting |

Models tested: Sonnet and Haiku.
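A minimal sketch of what such a benchmark runner looks like, assuming a hypothetical `run_dast()` entry point and planted flags of the form `FLAG{...}`. The names and the flag format are illustrative, not SecureVibes' actual API, and the base SAST configuration is omitted for brevity:

```python
import re

FLAG_PATTERN = re.compile(r"FLAG\{[^}]+\}")

def score_run(agent_output: str, planted_flag: str) -> bool:
    """A run counts as a success only if the exact planted flag appears."""
    return planted_flag in FLAG_PATTERN.findall(agent_output)

def run_benchmark(run_dast, target_url: str, planted_flag: str) -> dict:
    """Run the same target with and without skills and compare extraction."""
    results = {}
    for config in ("with_skills", "without_skills"):
        output = run_dast(target_url, skills_enabled=(config == "with_skills"))
        results[config] = score_run(output, planted_flag)
    return results
```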
Results: A 6x Performance Drop
Adding skills made the agent dramatically worse
The most striking result was Haiku: 0% flag extraction with skills, 100% without.
The smallest model tested went from complete failure to perfect success simply by removing skills.

Observed Behavioral Differences
With Skills Loaded
- Agent loads the matching SKILL.md (~400 lines of methodology)
- Follows the prescribed testing recipe step by step
- Produces structured but summary-level evidence
- Some vulnerabilities marked UNVALIDATED because "no matching skill"
Without Skills
- Agent goes straight to crafting curl commands
- Improvises payload selection based on target responses, noticing verbose error messages and immediately pivoting to exploit them
- Creates ephemeral Python scripts in /tmp to compute expected values for blind injection timing
- Chains multiple requests to build application state before triggering the actual vulnerability
- Captures and includes full HTTP request/response pairs
- Tests every vulnerability, adapting its approach on the fly
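The "/tmp helper scripts" in the list above were typically doing something like the following: comparing baseline response times against a time-delay payload to decide whether blind injection is plausible. This is a simplified sketch of the timing heuristic only; the expected delay and tolerance values are illustrative:

```python
import statistics

def timing_suggests_injection(baseline_times, delayed_times,
                              expected_delay=5.0, tolerance=0.5):
    """Blind SQLi heuristic: if a SLEEP(5)-style payload consistently adds
    roughly expected_delay seconds over the baseline median, the delay is
    likely attacker-controlled rather than network jitter."""
    baseline = statistics.median(baseline_times)
    delayed = statistics.median(delayed_times)
    return (delayed - baseline) >= (expected_delay - tolerance)
```

Medians are used instead of means so a single slow request does not produce a false positive.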
The unconstrained agent was doing things no skill author would have thought to include in a methodology document. The skills became a ceiling on capability, not a floor.
Root Cause Analysis
Three prompt-level patterns that reinforced each other
The Hard Gate (Primary Cause)
This was the gating trap, and in hindsight, it was almost inevitable. The moment you have skills, the natural instinct is to tie them to execution: “use the matching skill for each vulnerability type.” From there, it's a short slide to “only test when a skill exists.” It feels like good engineering (structured, predictable, auditable), but it turns the agent from a security tester into a skill execution engine.
```
### 2. Validation Eligibility (Hard Gate)
- You MUST only attempt validation when a matching skill exists
  and you can load its SKILL.md.
- If you cannot activate a relevant skill for a given vulnerability
  type/CWE, do NOT attempt ad-hoc validation.
- In that case, mark the item as UNVALIDATED with reason:
  "No applicable validation skill".
```

This turned skills from helpful reference material into a mandatory gateway. The agent's decision tree collapsed to a single question: does a matching skill exist?
There was no path for “use your own security knowledge.” The model's native expertise was walled off behind a skill-existence check.
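In code terms, the gate reduced the agent's behavior to something like this. This is a hypothetical reconstruction of the decision logic, not actual SecureVibes code:

```python
def validate_gated(finding, skills):
    """With the hard gate: no matching skill means no testing at all."""
    skill = skills.get(finding["cwe"])
    if skill is None:
        return {"status": "UNVALIDATED",
                "reason": "No applicable validation skill"}
    return {"status": "TESTED", "method": skill}

def validate_ungated(finding, skills):
    """Without the gate: skills are optional reference; testing always happens."""
    skill = skills.get(finding["cwe"])  # consulted if present, never required
    return {"status": "TESTED", "method": skill or "native security expertise"}
```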

The Agent Self-Description
"dast": AgentDefinition(
description="Validates vulnerabilities via HTTP testing ONLY when a
matching Agent Skill is available; otherwise reports UNVALIDATED",
)The agent's self-concept was defined as “skill executor” rather than “security tester.” This shaped every downstream decision. Without skills loaded, the description became incoherent, and the model reverted to what it actually is: a capable security testing agent.
Skill-Gated File Creation
This prevented the agent from creating ephemeral helper scripts unless a skill explicitly called for them. Without this constraint, the agent freely wrote temporary scripts to /tmp when needed, making it more effective at complex validations.
Why These Constraints Existed
These weren't arbitrary. Without a sandbox, the unconstrained DAST agent would:
- Create and delete files across the project
- Attempt direct database access
- Modify application configuration
- Discover and test vulnerabilities beyond the original findings list
Prompt constraints were the safety mechanism available at the time. They achieved safety at the direct cost of capability.
Think of it like hiring a pentester: giving them a checklist and saying “only test what's on this list, don't use tools that aren't approved” produces surface-level findings. Putting them in an isolated test environment and saying “find everything you can” produces a comprehensive report. The first pentester is constrained by rules they have to keep in mind while working. The second is constrained by the environment but mentally free. Same with agents.
Why Haiku Showed the Largest Swing
Haiku has a smaller context window and less capacity for nuanced instruction interpretation. With skills loaded, the hard gate rules plus ~400 lines of methodology consumed a significant fraction of Haiku's effective working context. Without skills, that context was entirely available for actual testing. For smaller models, every token of constraint overhead is a token not available for real work.
The Decision: Moving Away from Skills
For well-documented vulnerability classes, modern LLMs don't need skills
State-of-the-art models like Claude Sonnet and even Haiku have extensive security testing knowledge from training data. The OWASP Top 10, common injection patterns, XSS payloads, authorization bypass techniques—all of this is thoroughly covered in the model's training corpus. Skills that codify this same knowledge add marginal value at best, and as the benchmarks showed, can actively degrade performance when the surrounding prompt architecture turns them into constraints.
The question to ask before writing any skill: “If the agent had no skill file, would it know how to do this?” For SQL injection testing, the answer was obviously yes. For detecting agentic application patterns and applying OWASP ASI categories, the answer is no—and that's where skills earn their keep.
What Changed
4 files changed, 16 insertions, 19 deletions.

```
# Before
### 2. Validation Eligibility (Hard Gate)
- You MUST only attempt validation when a matching skill exists

# After
### 2. Validate Each Vulnerability
- For each vulnerability, craft and execute HTTP-based tests
- Use your security testing expertise to select appropriate payloads
- If skills are available in .claude/skills/dast/, you may consult them
  as reference methodologies, but you are not limited to them
```

```
# Before
description="Validates vulnerabilities via HTTP testing ONLY when a matching
Agent Skill is available; otherwise reports UNVALIDATED"

# After
description="Validates vulnerabilities via HTTP-based dynamic testing against
the live application, using security expertise and available skills as reference"
```

```
# Before
- Do NOT create arbitrary code files unless explicitly instructed
  by a loaded SKILL.md

# After
- Do NOT create code files in the project directory. If you need helper
  scripts, write them to /tmp and delete them after use.
```

All existing safety mechanisms (HTTP-only testing, database tool blocking via hooks, output format validation, file write restrictions, production URL detection) remained intact. The safety boundary stayed; only the capability constraints were removed.
When Skills Still Make Sense
This isn't a blanket argument against skills
| Skill type | Value | Why |
|---|---|---|
| Domain-specific knowledge | High | Internal auth patterns, proprietary APIs, custom protocols; the model can't learn this from training |
| Organization-specific methodology | High | Your compliance requirements, your naming conventions, your risk scoring rubric |
| Emerging vulnerability classes | Medium | Recent CVEs or attack patterns that may not be well-represented in training data |
| Well-documented OWASP Top 10 | Low | The model already knows SQLi, XSS, IDOR, and command injection thoroughly |
| Standard security testing methodology | Low | PTES, OWASP Testing Guide; the model already knows these |
The question to ask before creating a skill: “Does this skill contain knowledge the model genuinely doesn't have?” If the answer is no, the skill adds overhead without proportional value.
Recommendations
For teams building agentic systems with the Claude Agent SDK
Benchmark Before and After
Run identical tests with and without skills. If performance doesn’t improve, the skills aren’t helping. If it degrades, there’s a prompt architecture problem.
Never Gate Agency on Skill Existence
Skills should be additive reference material, never a prerequisite for action.
```
# Do this
Skills may be available as reference. Use your expertise for all tasks.

# Not this
You MUST only attempt tasks when a matching skill exists.
```

Watch the Agent's Self-Description
How you describe the agent to itself shapes its behavior. "Security tester with optional references" produces fundamentally different behavior than "skill executor."
Account for Model Size
Smaller models are disproportionately affected by prompt overhead. If skills consume 400 lines of context in a model with limited working memory, the net effect may be negative even if the skill content is high quality.
Separate Safety from Capability
Prompt-level constraints that prevent dangerous actions also prevent effective actions. Move safety to the environment layer (sandboxes, hooks, tool restrictions) and keep prompts focused on task definition and quality standards.
What's Next: Sandbox-First Safety
Moving safety from prompts to the environment
The deeper lesson from this work extends beyond skills. Constraining agent behavior through prompts—telling the agent what not to do—is fundamentally at odds with agent capability. Every constraint in the prompt is context consumed, a decision path closed, a creative solution blocked.
The alternative is environment-level safety: run the agent in a sandbox where it can try anything, and let the environment prevent harm. The agent doesn't even know it's constrained. It gets full access to its own capability while the sandbox handles the safety boundary.
SecureVibes is moving toward this architecture with Docker-based isolation for the DAST phase where the agent can operate freely within a controlled environment.
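As a sketch of what that isolation might look like with Docker Compose: the agent shares an internal network with the target, writes only to a tmpfs, and has no route to the outside world. Service and image names here are hypothetical:

```yaml
services:
  target:
    image: vulnerable-app:latest     # the application under test
    networks: [dast-net]
  dast-agent:
    image: securevibes-dast:latest   # hypothetical agent image
    networks: [dast-net]             # can only reach the target
    read_only: true                  # no writes outside tmpfs
    tmpfs: [/tmp]                    # ephemeral helper scripts still work
networks:
  dast-net:
    internal: true                   # no route to the outside world
```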
Skills should guide LLMs, not restrict them.
For domains where the model already has deep knowledge, even the guidance adds little.
SecureVibes is an open-source AI-powered application security scanner built on the Claude Agent SDK.