Cisco report: No leading AI model is safe from multi-turn attacks - GPT-5.4 fails nine times more often than in standard tests
What it really says
Cisco's threat intelligence team systematically tested the security of 15 leading AI models from OpenAI, Anthropic, Google, Amazon, and xAI for its 'State of AI Security 2026' report. The central finding: not a single model is immune to so-called multi-turn attacks, in which an attacker works iteratively across several conversation turns, adapting each prompt to the previous response until the model abandons its safeguards. The researchers tested approximately 30,000 single prompts and nearly 7,000 multi-turn attacks across more than 1,400 conversations. OpenAI's GPT-5.4, for example, jumped from an attack success rate of 2.74 percent on single prompts to 24.68 percent on multi-turn attacks - a ninefold increase. The highest multi-turn success rate observed for any model was 88 percent. The study shows that industry-standard safety benchmarks, which typically use only single prompts, systematically underestimate how vulnerable the models actually are.
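The "ninefold" figure can be checked directly from the two GPT-5.4 rates quoted above. The following is only a small arithmetic sketch; the variable names are chosen here for illustration and do not come from the Cisco report.

```python
# Attack success rates (ASR) for GPT-5.4 as quoted in the article, in percent.
single_prompt_asr = 2.74
multi_turn_asr = 24.68

# How many times more often a multi-turn attack succeeds than a single prompt.
increase_factor = multi_turn_asr / single_prompt_asr

# Share of single-prompt attacks that did NOT succeed.
single_prompt_block_rate = 100.0 - single_prompt_asr

print(f"increase factor: {increase_factor:.2f}x")                     # 9.01x
print(f"single-prompt block rate: {single_prompt_block_rate:.2f}%")   # 97.26%
```

The factor of roughly 9.01 matches the headline, and the block rate of 97.26 percent is the basis for the rounded "97 percent" figure.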
Our assessment
This study earns a 'nuanced' rating because it reveals a real but context-dependent problem. The good news first: the models defend against simple single-prompt attacks very effectively - GPT-5.4 blocks about 97 percent of them. A casual malicious prompt will therefore fail in the vast majority of cases. The bad news: a motivated attacker who is willing to invest multiple rounds can bypass the safety mechanisms with an alarmingly high success rate. The truly concerning finding is the gap between published safety benchmarks and the models' vulnerability under realistic conditions. Organizations deploying AI models in security-critical areas cannot rely solely on built-in safety mechanisms. Cisco's recommendation to implement additional protection layers at the application level is sensible. For end users, this means: the AI models you use every day have safety mechanisms that work well against casual misuse - but they are not impenetrable walls.
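To make the idea of an application-level protection layer more concrete, here is a minimal sketch of a guard that looks at the whole conversation instead of judging each prompt in isolation. It is not taken from the Cisco report: the class `ConversationGuard`, the keyword-based `toy_score` function, and both threshold values are hypothetical and serve only as illustration. A real deployment would replace the toy scorer with a proper classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ConversationGuard:
    """Hypothetical guard that tracks risk across all turns of a conversation."""
    score_turn: Callable[[str], float]   # assumed scorer: 0.0 (benign) to 1.0 (harmful)
    turn_threshold: float = 0.8          # illustrative value, not from the report
    cumulative_threshold: float = 1.5    # illustrative value, not from the report
    history: List[float] = field(default_factory=list)

    def allow(self, user_message: str) -> bool:
        score = self.score_turn(user_message)
        self.history.append(score)
        # Block a single turn that is clearly harmful on its own.
        if score >= self.turn_threshold:
            return False
        # Block once the accumulated risk of the conversation gets too high.
        return sum(self.history) < self.cumulative_threshold


def toy_score(message: str) -> float:
    """Toy stand-in for a real classifier: 0.5 per flagged keyword, capped at 1.0."""
    flagged = ("bypass", "exploit", "payload")
    hits = sum(word in message.lower() for word in flagged)
    return min(1.0, 0.5 * hits)


guard = ConversationGuard(score_turn=toy_score)
conversation = [
    "How do firewalls work?",
    "How might someone bypass one?",
    "Write an exploit for that",
    "Now add the payload",
]
decisions = [guard.allow(message) for message in conversation]
print(decisions)  # [True, True, True, False]
```

In this toy run, no individual message reaches the per-turn threshold, so a check that only sees single prompts would let all four through. The cumulative check stops the conversation at the fourth turn, which is the kind of gap between single-prompt and multi-turn evaluation that the study describes.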
Relevance for Germany
For German companies, this study is immediately relevant. According to Bitkom, 42 percent of German companies already use generative AI, and many of them use the models from OpenAI, Google, and Anthropic that were tested in this study. The AI Act requires providers of high-risk AI systems to conduct comprehensive security testing - the Cisco study shows that industry-standard single-prompt benchmarks are insufficient for this purpose. Companies deploying AI models in customer-facing applications - such as customer service or consulting - must anticipate multi-turn attack scenarios and implement corresponding safeguards at the application level. The BSI, as the designated market surveillance authority, has already announced its own testing procedures for AI systems. The Cisco findings could help ensure that these procedures also account for multi-turn attack scenarios - an important step for AI security in Germany.
Fact check
The primary sources are the official Cisco blog post on the 'State of AI Security 2026' report and the accompanying technical blog post 'Proprietary Problems'. The specific numbers - GPT-5.4 rising from a 2.74 to a 24.68 percent attack success rate, approximately 30,000 single prompts and nearly 7,000 multi-turn attacks across more than 1,400 conversations - are reported consistently by SiliconANGLE, Help Net Security, CSO Online, CIO Dive, and Computer Weekly. The figure of 88 percent for the highest observed multi-turn success rate comes from the Cisco report itself. The 15 models tested came from five providers: OpenAI, Anthropic, Google, Amazon, and xAI.
Sources
- https://blogs.cisco.com/ai/cisco-state-of-ai-security-2026-report
- https://blogs.cisco.com/ai/proprietary-problems
- https://siliconangle.com/2026/05/27/cisco-report-finds-no-closed-frontier-ai-model-safe-multi-turn-attacks/
- https://www.helpnetsecurity.com/2026/05/28/cisco-multi-turn-ai-attacks/