FrontierNews.ai

OpenAI Codex Now Scans Your Code for Security Holes While Building Features

OpenAI's Codex CLI, the AI coding agent that lives in your terminal, now integrates runtime security testing directly into its workflow. Developers can point Codex at their application, and the agent will scan for vulnerabilities, patch the vulnerable code, and rescan to confirm the fixes work, all as part of the same automated loop that builds features.

How Does Codex Security Testing Work?

Codex CLI operates by reading, modifying, and executing code in a directory you specify. Until now, the security verification step remained manual and slow. A new integration with StackHawk agent skills compresses the entire security workflow into five repeatable steps: configure scanning settings for your app type and authentication pattern, run a security scan against the live application, parse the findings, fix the vulnerable code, and rescan to verify the repairs.
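The scan, fix, and rescan portion of that loop can be sketched in a few lines of Python. This is only an illustration of the control flow, not StackHawk's or Codex's actual implementation: the `Finding` record, the `scan_fix_rescan` function, and the toy `fake_scan` and `fake_fix` stand-ins are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical, simplified finding record (not StackHawk's schema)."""
    rule: str
    path: str


def scan_fix_rescan(
    scan: Callable[[], List[Finding]],
    fix: Callable[[Finding], None],
    max_rounds: int = 3,
) -> List[Finding]:
    """Scan, fix each finding, and rescan until clean or out of rounds.

    Returns the findings that still reproduce after the final scan.
    """
    findings = scan()
    rounds = 0
    while findings and rounds < max_rounds:
        for finding in findings:
            fix(finding)
        findings = scan()  # rescan to verify the repairs
        rounds += 1
    return findings


# Toy stand-ins so the loop can run without a real scanner.
open_issues = {Finding("sql-injection", "/search"), Finding("missing-header", "/")}


def fake_scan() -> List[Finding]:
    return sorted(open_issues, key=lambda f: f.rule)


def fake_fix(finding: Finding) -> None:
    open_issues.discard(finding)


remaining = scan_fix_rescan(fake_scan, fake_fix)
print(remaining)  # prints [] because both toy findings were fixed
```

The `max_rounds` cap is a design choice for the sketch: a fix that never makes a finding go away should end the loop with the finding still reported rather than retrying forever.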

The workflow eliminates the traditional handoff between development and security testing. Instead of building a feature, shipping it, and then discovering vulnerabilities later, Codex handles both the feature and the security check in one continuous process. This approach treats "done" as "done and secure," rather than leaving security verification for a separate phase.

What Are the Practical Steps to Set Up Codex Security Scanning?

  • Install Prerequisites: You'll need Codex CLI (available via npm or Homebrew), a StackHawk account on the Secure, Scale, or Wingman plan, Java 17 or later on Linux systems, and your application running locally on a port between 1024 and 65535.
  • Generate an API Key: Log into the StackHawk console, navigate to Settings, then API Keys, and create a new key labeled for Codex integration. Copy the key immediately, as you cannot retrieve it later without deleting and recreating it.
  • Install StackHawk Tools: On macOS or Linux, run a single Homebrew command to install both the hawk scanning tool and hawkop reporting tool, then initialize each with your API key and organization settings.
  • Add StackHawk Skills to Codex: Register StackHawk's marketplace in Codex, then install the HawkScan skill and StackHawk API skill using three plugin commands in your terminal.
  • Run Your First Scan: With your application running, prompt Codex to scan it by specifying the localhost port, and the agent will configure the scanner, run the scan, and display results in your terminal and the StackHawk platform.
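Before the first scan, the prerequisites in the list above can be checked with a short preflight script. The sketch below is an assumption-laden convenience, not part of either tool: the function names are made up for illustration, the port `3000` is an arbitrary example, and it assumes the Codex CLI is installed as an executable named `codex` (the `hawk` and `hawkop` names come from the setup steps).

```python
import shutil
import socket
from typing import List


def valid_app_port(port: int) -> bool:
    """Check the port range given in the prerequisites: 1024 through 65535."""
    return 1024 <= port <= 65535


def missing_tools(tools: List[str]) -> List[str]:
    """Return the tool names that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def app_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Attempt a TCP connection to see whether anything is listening."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    port = 3000  # hypothetical port, replace with your application's port
    print("port in range:", valid_app_port(port))
    # "codex" as the executable name is an assumption; adjust if yours differs.
    print("missing tools:", missing_tools(["codex", "hawk", "hawkop", "java"]))
    print("app listening:", app_is_listening(port))
```

The script only confirms that the executables exist and that something answers on the port. It does not verify the Java version, the StackHawk plan, or the API key.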

What Happens After Codex Finds Vulnerabilities?

When the scan completes, Codex displays findings organized by severity level, along with details about the risk, confidence level, affected code paths, and vulnerable methods. In many cases, Codex automatically begins fixing vulnerabilities without prompting. If it doesn't, you can explicitly ask it to fix all findings, and the agent will read the surrounding code and apply context-appropriate patches.
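To make the "organized by severity" idea concrete, here is a minimal sketch of grouping findings so the most serious appear first. The `Finding` fields and the High, Medium, Low labels are assumptions for illustration and may not match the scanner's actual output format.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Finding(NamedTuple):
    """Hypothetical finding shape; real scanner output may differ."""
    severity: str
    title: str
    path: str


# Assumed severity labels, most serious first.
SEVERITY_ORDER = ["High", "Medium", "Low"]


def severity_rank(severity: str) -> int:
    """Known severities sort by SEVERITY_ORDER; unknown ones sort last."""
    if severity in SEVERITY_ORDER:
        return SEVERITY_ORDER.index(severity)
    return len(SEVERITY_ORDER)


def group_by_severity(findings: List[Finding]) -> Dict[str, List[Finding]]:
    """Group findings by severity, with the most serious group first."""
    grouped: Dict[str, List[Finding]] = defaultdict(list)
    for finding in findings:
        grouped[finding.severity].append(finding)
    return {sev: grouped[sev] for sev in sorted(grouped, key=severity_rank)}


sample = [
    Finding("Low", "Missing security header", "/"),
    Finding("High", "SQL injection", "/search"),
    Finding("Medium", "Reflected input without encoding", "/profile"),
]

for severity, items in group_by_severity(sample).items():
    print(severity, [item.title for item in items])
```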

For example, if the scanner finds SQL injection vulnerabilities caused by direct string concatenation, Codex will rewrite the code to use parameterized queries. If user input is returned to the browser without encoding, Codex will add output encoding. If security headers are missing, Codex will insert them. After applying fixes, the agent automatically rescans the application to confirm that vulnerabilities no longer reproduce.
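The first two fix patterns are easy to demonstrate. The sketch below uses Python's built-in `sqlite3` and `html` modules with an in-memory database. The table, function names, and payload are invented for illustration, and this is not the code Codex itself produces, only an example of the same class of fix.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_users_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Vulnerable: user input is concatenated straight into the SQL text."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_users_safe(conn: sqlite3.Connection, name: str) -> list:
    """Fixed: the input is bound as a parameter and never parsed as SQL."""
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


def render_greeting(name: str) -> str:
    """Output encoding: escape user input before placing it in HTML."""
    return "<p>Hello, " + html.escape(name) + "</p>"


payload = "' OR '1'='1"
print(find_users_unsafe(conn, payload))  # ['alice', 'bob'] (injection succeeds)
print(find_users_safe(conn, payload))    # [] (payload treated as a plain string)
print(render_greeting("<script>alert(1)</script>"))
# <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

In the unsafe version the payload rewrites the `WHERE` clause so it matches every row. In the safe version the same payload is just an unusual name that matches nobody, which is the kind of difference a rescan would detect.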

How Can Teams Review and Triage Security Findings?

A major advantage of building security into Codex's workflow is that the agent can review and triage findings on its own. It decides whether each vulnerability should be fixed, risk-accepted, or marked as a false positive, and it adds notes documenting its reasoning, which reduces the manual overhead of security reviews.

Teams that prefer manual review can still access the StackHawk console in a web browser. Unprocessed findings are marked as "New," and each finding offers three triage paths: Assigned, Risk Accepted, or False Positive. Whichever path is chosen, the platform requires a comment, ensuring that triage decisions are documented and survive team turnover. For questionable findings, the platform can generate a ready-to-run curl command that reproduces the attack, allowing developers to trace exactly what the scanner detected in their local application.
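The rule that every triage decision carries a comment can be modeled in a few lines. This is a hypothetical data model written for illustration, not StackHawk's API: only the four status labels come from the description above, and the class and method names are invented.

```python
from dataclasses import dataclass
from enum import Enum


class TriageStatus(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    RISK_ACCEPTED = "Risk Accepted"
    FALSE_POSITIVE = "False Positive"


@dataclass
class TriagedFinding:
    """Hypothetical record of a finding and its triage decision."""
    title: str
    status: TriageStatus = TriageStatus.NEW
    comment: str = ""

    def triage(self, status: TriageStatus, comment: str) -> None:
        """Move the finding out of 'New'; a non-empty comment is mandatory."""
        if status is TriageStatus.NEW:
            raise ValueError("triage must move the finding out of 'New'")
        if not comment.strip():
            raise ValueError("a comment is required for every triage decision")
        self.status = status
        self.comment = comment.strip()


finding = TriagedFinding("Missing security header")
finding.triage(TriageStatus.RISK_ACCEPTED, "Internal-only endpoint behind VPN")
print(finding.status.value, "-", finding.comment)
```

Rejecting an empty comment at the point of the status change means the reasoning is recorded alongside the decision itself.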

By embedding security testing into the same agent that builds features, Codex eliminates the workflow friction that typically delays vulnerability fixes. Developers no longer need to switch tools, wait for separate security scans, or manually coordinate fixes with security teams. The agent handles the entire loop, from feature completion to verified security remediation, all within the terminal environment where developers already work.