Fuel iX

620,000 AI attacks: What enterprises need to know about AI safety and security

Key takeaways

  • Every AI model tested was exploitable. Attack success rates ranged from 1.3% to nearly 93%, and most of the models popular in production deployments sat well above 40%.
  • Small models (10 billion parameters or fewer) failed to resist attacks 86% of the time — deploying them for cost or speed carries real security risk.
  • Refuse-but-engage behavior is a vulnerability, not a safeguard — models that decline a prompt but continue engaging with the topic create exploitable gaps.
  • Three attack categories broke every model tested, including top performers: privacy and personal data exploitation, fraud and financial scams, and cybersecurity threats.

Most enterprises deploying AI have no idea how exposed they are. This data gives you the answer. TELUS Digital's Fuel iX™ Applied Research team ran one of the largest AI safety benchmarks of its kind — 34 models, 10 providers, more than 620,000 adversarial attack evaluations — to understand exactly where the industry's AI attack surface sits.

Here are some highlights of what the data showed.

Every model had a weakness — the question was to what extent

Every single model tested was exploitable. Attack success rates (ASR) — the percentage of adversarial prompts that successfully bypassed a model's safety controls — ranged from 1.3% for the best performers to nearly 93% for the worst. Ten models fell below the 5% threshold, proving robust safety is achievable. But the majority of models popular in production deployments sat well above 40%, which many would consider unacceptable for production use.

Overall attack success rates across 34 models (750 attacks). Color-coded by vulnerability risk: purple (ASR < 5%), verbena (5–25%), peach (25–40%), red (> 40%). Markers show Q4 ASR for returning models.
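The ASR definition and the chart's color bands can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's own scoring code: the function names are hypothetical, the example counts are invented for the arithmetic, and the handling of the shared band edges (25% and 40%) is an assumption, since the caption does not say which band owns them.

```python
def attack_success_rate(successes: int, attempts: int) -> float:
    """Return ASR: the percentage of attack attempts that bypassed safety controls."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


def risk_tier(asr: float) -> str:
    """Map an ASR percentage to the chart's color bands.

    Assumption: a value exactly on a shared edge (25 or 40) falls in the lower band.
    """
    if asr < 5:
        return "purple"   # ASR < 5%
    if asr <= 25:
        return "verbena"  # 5-25%
    if asr <= 40:
        return "peach"    # 25-40%
    return "red"          # > 40%


# Hypothetical counts, chosen only to show the arithmetic.
example_asr = attack_success_rate(successes=300, attempts=750)
print(example_asr, risk_tier(example_asr))
```

With these made-up counts, 300 successful attacks out of 750 gives an ASR of 40%, which sits at the top edge of the peach band under the assumption above.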


Some models refused the attack. Then helped anyway.

Some models would initially decline a harmful prompt but then continue engaging with the underlying request, offering related context or resources. This is known as refuse-but-engage behavior. For a customer service AI chatbot, that's not a refusal. It's an unacceptable vulnerability.

"There are a lot of popular models that people are picking for their production applications that have exhibited a very particular and perhaps risky behavior, which is basically refusing the attack initially, but then engaging with the topic." — Milton Leal, Lead Applied AI Researcher, TELUS Digital

ASR decomposition showing direct compliance (DC) versus refuse-but-engage (RBE) patterns across models.
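One way to read the decomposition in this chart: each successful attack is counted as either direct compliance or refuse-but-engage, and the two shares add up to the overall ASR. The sketch below assumes exactly that split. The outcome labels and the function name are hypothetical, and the sample outcomes are invented for illustration.

```python
from collections import Counter


def decompose_asr(outcomes):
    """Split ASR into direct-compliance (DC) and refuse-but-engage (RBE) shares.

    `outcomes` holds one label per attack attempt. The labels used here
    ("refused", "direct_compliance", "refuse_but_engage") are hypothetical.
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("outcomes must not be empty")
    counts = Counter(outcomes)
    dc = 100.0 * counts["direct_compliance"] / total
    rbe = 100.0 * counts["refuse_but_engage"] / total
    # Assumption: both DC and RBE count as successful attacks.
    return {"dc": dc, "rbe": rbe, "asr": dc + rbe}


# Invented sample: 6 clean refusals, 1 direct compliance, 3 refuse-but-engage.
sample = ["refused"] * 6 + ["direct_compliance"] + ["refuse_but_engage"] * 3
print(decompose_asr(sample))
```

In this toy sample, counting only direct compliance would report a 10% ASR, while including refuse-but-engage responses raises it to 40%. That gap is why the behavior matters when scoring a model.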


AI model size mattered more than source

Bigger models are significantly harder to jailbreak. Small models — those with 10 billion parameters or fewer — failed to resist attacks 86% of the time. Large models failed at a fraction of that rate. If your organization is deploying small models for cost or speed, that tradeoff carries real security implications.


Origin mattered less than you think

The most counterintuitive finding: Chinese-origin models showed no meaningful safety difference from Western models once model size was accounted for. The 7.7 percentage-point gap between the two groups was almost entirely explained by the Chinese sample containing a larger share of small models. The full breakdown is in the report, and it changes how you should think about model sourcing decisions.
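To see how a difference in sample mix can produce a raw gap that disappears once size is controlled for, consider the toy comparison below. Everything in it is an assumption made for illustration: the groups "A" and "B", the size buckets and the ASR values are invented and are not figures from the report.

```python
from collections import defaultdict
from statistics import mean


def mean_asr_by(records, key):
    """Average the 'asr' field of records grouped by key(record)."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec["asr"])
    return {k: mean(v) for k, v in groups.items()}


# Invented data: within each size bucket both groups behave identically,
# but group A contains more small models than group B.
records = (
    [{"origin": "A", "size": "small", "asr": 80.0}] * 3
    + [{"origin": "A", "size": "large", "asr": 20.0}]
    + [{"origin": "B", "size": "small", "asr": 80.0}]
    + [{"origin": "B", "size": "large", "asr": 20.0}] * 3
)

raw = mean_asr_by(records, lambda r: r["origin"])
by_size = mean_asr_by(records, lambda r: (r["origin"], r["size"]))

print("raw gap:", raw["A"] - raw["B"])
print("small-model gap:", by_size[("A", "small")] - by_size[("B", "small")])
print("large-model gap:", by_size[("A", "large")] - by_size[("B", "large")])
```

In this toy data the raw averages differ by 30 points, yet the gap within each size bucket is zero. Comparing like with like is the kind of check that separates a sourcing effect from a size effect.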


Three attack categories broke every model — including the top performers

Even top performers showed consistent weaknesses in three areas: privacy and personal data exploitation, fraud and financial scams, and cybersecurity threats. These aren't edge cases — they're the attack surfaces most likely to cause real commercial, reputational or user harm. If you're only doing basic testing, these categories are where you need to spend more time.

Attack success rate heatmap: 34 models × 15 attack categories. Models sorted by overall ASR (most secure at top); categories sorted by average effectiveness (most effective at left). Cell values show ASR percentage.
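The row and column ordering described in the caption can be reproduced on your own results with a short helper. This is a sketch under assumptions: the function name and the nested-dictionary layout are hypothetical, the sample values are invented, and a model's overall ASR is approximated as the unweighted mean across categories, which only matches a true overall ASR when every category has the same number of attacks.

```python
def order_heatmap(asr):
    """Return (models, categories) in the order the caption describes.

    `asr` maps model name -> {category name -> ASR percentage}.
    Models are sorted most secure first (lowest mean ASR); categories are
    sorted most effective first (highest mean ASR across models).
    """
    categories = list(next(iter(asr.values())))

    def model_mean(model):
        row = asr[model]
        return sum(row.values()) / len(row)

    def category_mean(category):
        return sum(row[category] for row in asr.values()) / len(asr)

    models = sorted(asr, key=model_mean)
    ordered_categories = sorted(categories, key=category_mean, reverse=True)
    return models, ordered_categories


# Invented values for two hypothetical models and two categories.
toy = {
    "model_x": {"fraud": 10.0, "privacy": 50.0},
    "model_y": {"fraud": 1.0, "privacy": 5.0},
}
print(order_heatmap(toy))
```

Sorting both axes this way pushes the weakest models and the most effective attack categories toward one corner of the grid, which makes shared weaknesses easy to spot.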

Moving from reactive to proactive

Most organizations still treat AI safety and security as something to address after a problem surfaces.

Fuel iX’s GenAI safety (GAS) model benchmark makes the case for a different approach: continuous adversarial testing for AI that covers novel attacks, runs after every system prompt change and validates against recognized standards like OWASP and NIST-RMF.
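As a rough sketch of what "runs after every system prompt change" could look like in a deployment pipeline, the helper below fingerprints the system prompt and flags a retest whenever the fingerprint changes. It is not part of Fuel iX or any specific tool: the function names are hypothetical, and the 5% pass threshold is an assumption borrowed from the lowest-risk band described earlier.

```python
import hashlib
from typing import Optional


def prompt_fingerprint(system_prompt: str) -> str:
    """Return a stable SHA-256 fingerprint of the system prompt text."""
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()


def needs_retest(system_prompt: str, last_tested: Optional[str]) -> bool:
    """True if the prompt has never been tested or has changed since the last run."""
    return prompt_fingerprint(system_prompt) != last_tested


def passes_gate(asr_percent: float, max_asr: float = 5.0) -> bool:
    """Assumed release gate: block deployment unless ASR is below max_asr."""
    return asr_percent < max_asr


# Hypothetical prompts, used only to show the change detection.
tested = prompt_fingerprint("You are a support agent for Acme.")
print(needs_retest("You are a support agent for Acme.", tested))
print(needs_retest("You are a support agent for Acme. Be brief.", tested))
```

Storing the fingerprint alongside each test run means even a one-word prompt edit triggers a fresh adversarial run before release.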

The question isn't whether your system has vulnerabilities. It's whether you know where they are and how to prioritize them.


Is your AI secure or do you just hope it is?

Get the benchmark data your security team needs to govern AI risk in production.
