Manual AI red teaming that scales: Introducing Attack Playground

Key takeaways
- Automated AI red teaming scales but cannot replicate the domain expertise and intuition of experienced red teamers — and most platforms force you to choose one or the other.
- Attack Playground gives security teams exploratory control inside the same Fortify interface that handles automated coverage, with real-time severity decisions from the Fortify Judge.
- The EU AI Act and NIST AI RMF require documented evidence of both automated and human-led adversarial testing — not one or the other.
- Any prompt that finds a vulnerability can be saved to an attack library in one action, turning a one-time human discovery into permanent automated regression coverage.
Fuel iX™ Fortify 14.9 introduces Attack Playground, enabling security teams to manually craft and test their own attack prompts against AI targets and get instant feedback on the results.
No more waiting for automated sessions to complete. Test an idea, see if it works and if it uncovers a vulnerability, save it to your attack library to run automatically from that point forward.
Automated AI security testing scales, but human intuition is irreplaceable
Automated red teaming catches a lot. It runs thousands of attacks in hours, maps findings to OWASP LLM Top 10, NIST AI RMF and MITRE ATLAS and gives security teams the coverage they need to move fast.
But it cannot replace human intuition.
Experienced red and purple teamers know things that automation does not. They understand the quirks of a specific chatbot. They spot edge cases that only surface when you push a model in a direction no automated workflow would try. They build hypotheses from domain expertise — the kind of context a generalized attack generator cannot replicate.
Until now, most AI security platforms have forced teams to pick a side.
Automated tools scale but leave red teamers without exploratory control. Manual tools give control but do not scale and do not feed back into continuous coverage.
How human expertise amplifies automated coverage
Enterprise AI deployments are multiplying faster than security teams can keep up. The EU AI Act is in force and high-risk AI system requirements take full effect in August 2026 — giving organizations a shrinking window to get testing documentation in order. Auditors and regulators increasingly expect to see both automated testing logs and human-led adversarial reviews — not one or the other.
At the same time, AI/ML engineers are shipping model updates, prompt changes and new integrations on weekly or even daily cycles. Every change is a potential new vulnerability. Regression testing that previously identified weaknesses has become a baseline expectation, not a nice-to-have.
Security teams need a way to do targeted, human-guided testing without leaving the platform that already handles their automated coverage. That is what Attack Playground in Fortify delivers.
Real-time Judge feedback turns manual testing into a fast, iterative loop
Attack Playground lets users manually craft and execute custom attack prompts against any LLM target configured in Fortify. For every interaction, the Judge returns one of four outcomes: a safe decision or a vulnerable decision paired with severity (low, medium, high or critical) and the reasoning behind the call. No waiting for a full automated session to complete. Red teamers can adjust their approach on the next prompt, not the next session.
Attack Playground supports both HTTP and SDK integration methods, covering the full architecture range of enterprise LLM deployments. For SDK-integrated targets, users can choose between single-shot and multi-turn conversation modes, matching the testing approach to the threat model. Sessions can be resumed or reviewed from the history page and automatically end after 10 minutes of inactivity.

A "10-minute inactivity" timer begins until the user starts a conversation. This ensures to keep the sessions clean. An "End Session" button is also provided at the top-right side of the screen for the users to stop the existing Playground session.
The manual-to-automated feedback loop
This is where Attack Playground becomes infrastructure rather than a feature.
Any prompt a user runs in Attack Playground can be saved into any attack library — new or existing — with a single action. Once saved, that prompt immediately becomes part of the user's automated session coverage.
For teams who want automated red teaming running at the same cadence as their builds, CI/CD Integration in Fortify 14.8 closes the other half of that loop.
Human-found vulnerabilities become permanent automated coverage.

When a prompt uncovers a vulnerability, saving it takes one action. From that point forward, it runs as a regression test against every future model update. A domain expert's creative attack stops being a one-time finding and becomes institutional knowledge the platform enforces automatically.
For red teamers, that changes the economics of manual testing. Time spent in Attack Playground doesn't disappear after the session ends — it compounds.
Which teams get the most value from Attack Playground
Three roles get the most immediate value.
Red team testers and AI security engineers get the exploratory control they have been asking for — without leaving the platform that handles their automated coverage. Ad-hoc vulnerability exploration, hypothesis-driven testing and context-specific attacks all happen in the same interface, with the same Judge and the same severity classification.
AI/ML engineers use Attack Playground for targeted regression testing after guardrail changes, prompt engineering updates or model retraining. They verify that a previously identified vulnerability stays closed, fast, before pushing to production.
GRC and compliance teams get something they have struggled to document: evidence of human-led adversarial testing alongside automated results. That combination is increasingly what regulators and internal audit functions want to see.
For teams deploying bilingual AI systems, that assurance check extends to language. Fortify's Canadian French attack generation ensures guardrails hold in the language your users actually speak.
Closing the loop between human and automated testing
Most red teaming programs have a ceiling. Automated coverage only catches what it was built to find. Manual testing only produces what a human thinks to try. Neither learns from the other.
Attack Playground removes that ceiling. Manual prompts feed into attack libraries. Automated findings show humans where to push next. The two halves reinforce each other — and the program gets sharper every time it runs.
NIST AI RMF and OWASP LLM Top 10 are built on that assumption: continuous monitoring, not one-time audits. That's the standard Attack Playground holds Fortify to.



