The AI Testing Crisis: Why Enterprises Can't Deploy AI Safely Without Continuous Red-Teaming

Enterprises deploying artificial intelligence agents face an unprecedented security challenge: these systems operate independently, interact with multiple external services, and often execute tasks without human oversight, creating an expanded attack surface that traditional security tools cannot protect. According to Cisco executives, the only viable defense is repeated adversarial testing of AI models as they evolve, since no standardized vulnerability database exists for AI systems yet.

What Makes AI Agents Different From Traditional Software?

Agentic AI introduces a fundamentally different operating model compared to conventional applications. These systems act persistently, meaning they continue operating over time rather than responding to single requests. They execute tasks independently without waiting for human approval at each step. Most critically, they interact with complex ecosystems of models, tools, and third-party services, often without human oversight.

This autonomous nature creates security vulnerabilities that didn't exist in traditional software. Attackers can exploit the expanded attack surface through multiple vectors, including prompt injection attacks, model-specific exploits, and compromised model repositories. As enterprises move from pilot projects to production deployments, ensuring security becomes critical for mission-critical applications.

Why Traditional Security Approaches Are Failing?

The fundamental problem is that AI security lacks the infrastructure that protects traditional software. For decades, cybersecurity teams have relied on vulnerability databases like the National Vulnerability Database (NVD) to identify known risks in software libraries and frameworks. No equivalent database exists for AI models and agents.

"The only option that you have is to be able to adversarial test these models over and over again," said DJ Sampath, Senior Vice President of AI Software and Platform at Cisco.

DJ Sampath, Senior Vice President of AI Software and Platform at Cisco

This absence of a standardized vulnerability database means enterprises cannot rely on published threat intelligence to protect their AI deployments. Instead, they must continuously test their models against potential attacks themselves, simulating the tactics that real adversaries might use. This red-teaming approach requires ongoing investment and expertise that many organizations are still developing.

Steps to Implement Continuous Adversarial Testing for AI Models

  • Inventory Your AI Assets: Before deploying security controls, enterprises need a complete inventory of AI assets across cloud and on-premises environments. This foundational step ensures you know what systems require protection and where they operate.
  • Establish Repeated Red-Teaming Cycles: Conduct adversarial testing regularly as models evolve and new attack vectors emerge. This is not a one-time security audit but an ongoing process that must continue throughout the model's operational lifetime.
  • Deploy Runtime Guardrails: Implement runtime guardrails that help defend against agentic attacks executed at scale. These protective measures monitor AI system behavior in real-time and can intervene when suspicious activity is detected.

How Will the AI Developer Explosion Change Security Requirements?

The scale of this challenge is about to expand dramatically. Currently, approximately 150 million developers worldwide write code. As AI tools and agents enable nontechnical users to build software, that number could grow to between 3 billion and 4 billion people writing code. This explosion of AI-enabled development means exponentially more AI models and agents will be deployed across enterprises, each requiring security testing and monitoring.

Sampath noted that enterprises must establish an inventory of AI assets across cloud and on-premises environments before deploying security controls. Without this foundational visibility, organizations cannot effectively protect their AI infrastructure or identify which systems pose the greatest risk.

The security implications are substantial. More developers means more AI models in production. More models in production means a larger attack surface for adversaries to target. More attack surface means security teams must scale their adversarial testing capabilities significantly to keep pace with the growth in AI deployments.

What Role Do Runtime Guardrails Play in AI Defense?

Runtime guardrails represent a critical layer of defense for AI systems operating at scale. Unlike traditional firewalls that inspect network traffic, runtime guardrails monitor the behavior of AI agents as they execute tasks. They can detect when an AI system is attempting to perform unauthorized actions, accessing restricted data, or behaving in ways inconsistent with its intended purpose.

These guardrails are particularly important because agentic AI systems operate with minimal human oversight. A human reviewing every action an AI agent takes would defeat the purpose of automation. Instead, runtime guardrails provide automated oversight, allowing AI systems to operate efficiently while maintaining security boundaries. This approach enables enterprises to scale AI deployments without proportionally increasing their security team size.

The challenge ahead is clear: as enterprises move from AI pilots to production deployments, and as the number of AI developers grows exponentially, security must become a core part of the development process rather than an afterthought. Continuous adversarial testing, comprehensive asset inventories, and runtime guardrails represent the emerging best practices for organizations navigating this new landscape.