Arcjet for AI Agents: Securing the Attack Surface Inside LLM Apps
Arcjet is moving its in-app security guards into AI agents, adding runtime checks against prompt injection, unsafe file reads, and risky web fetches. Here's why agentic apps need guardrails at the point of action, not just the network edge.
A traditional web application firewall sits at the edge of your network. It inspects HTTP requests before they reach your code, flags injection payloads, blocks known-bad IP ranges, and rate-limits abusive clients. That model held up for years because the inbound request was the attack surface: the dangerous input was always something a client sent you directly.
AI agents break that assumption. When you hand an LLM a set of tools — a file reader, an HTTP client, a database query function — the agent decides at runtime what to do with them. The request that started the session can look completely benign. The damage shows up three or four reasoning steps later: the agent reads a file it shouldn’t, fetches a URL an attacker planted in a document, or runs a tool call that was never in your test plan. A firewall at the edge sees none of it.
Arcjet’s response is to move the guard. Arcjet is a developer security SDK — bot detection, rate limiting, email validation, and shielding against common attacks — that runs as code inside your application instead of as separate infrastructure at the perimeter. Its recent shift extends that same in-process model into the agent’s action loop itself.
Why a network WAF goes blind on agentic apps
The gap is structural, not a tuning problem. An edge WAF inspects what crosses the network boundary once, at the start of a request. An agent’s risky behavior is generated internally, after that request is already inside your trust zone. By the time the agent calls a tool, there is no inbound HTTP request for an edge device to inspect: the call happens in-process, between your code and the action the model chose. Even when the tool makes an outbound fetch, that traffic leaves from inside your network, where an inbound filter is not looking.
It gets worse because the agent runs with your credentials. It holds your database connection, your API keys, your service-account permissions. An attacker who can influence the agent’s instructions doesn’t need to breach anything — they borrow the agent’s authority. Security engineers call this the confused deputy problem, and agentic apps are full of deputies.
The three actions a guard inside the agent watches
Moving the guard inside means checking the agent’s decisions at the moment it acts, not when the session began. Three action types carry most of the risk.
Prompt injection. Agents read untrusted text constantly — web pages, PDFs, support tickets, code comments, the output of earlier tool calls. Any of it can carry instructions aimed at the model: ignore the current task, exfiltrate a secret, call a tool with attacker-chosen arguments. The model has no reliable way to separate data it should process from instructions it should follow when both arrive as plain text in one context window.
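One deterministic check for this risk is a pattern scan over untrusted text before it enters the model’s context. The following is a minimal sketch, not Arcjet’s API: the function name `scanForInjection` and the pattern list are hypothetical, and a list like this only catches phrasings someone has already thought of.

```typescript
// Hypothetical patterns for illustration only. A real deployment would
// maintain and test a much larger list. None of this is Arcjet's API.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+(all\s+)?(the\s+)?(previous|prior|above)\s+instructions/i,
  /disregard\s+(the\s+)?(system|previous)\s+prompt/i,
  /you\s+are\s+now\s+in\s+developer\s+mode/i,
];

interface ScanResult {
  suspicious: boolean;
  matches: string[]; // the exact substrings that triggered a pattern
}

// Scan a piece of untrusted text (tool output, a fetched page, a ticket body)
// and report which patterns, if any, it contains.
function scanForInjection(text: string): ScanResult {
  const matches: string[] = [];
  for (const pattern of INJECTION_PATTERNS) {
    const hit = pattern.exec(text);
    if (hit) matches.push(hit[0]);
  }
  return { suspicious: matches.length > 0, matches };
}
```

A caller can drop, quarantine, or flag any text where `suspicious` is true instead of passing it to the model. The check is cheap and repeatable, but it is a filter for known phrasings, so it belongs alongside the other controls rather than in place of them.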
File reads. An agent with filesystem access is one bad instruction away from reading .env, SSH keys, or another tenant’s data. The read is a legitimate capability — you gave it the tool on purpose — so nothing flags it unless something inspects the path before the read runs.
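Inspecting the path before the read runs can be sketched as follows. This is an illustrative sketch under stated assumptions: the sandbox root `/srv/agent-workspace`, the deny-list, and the function name `isPathAllowed` are all hypothetical. It does not resolve symlinks, so a real guard would also compare the result of `fs.realpath` against the root.

```typescript
import * as path from "node:path";

// Hypothetical sandbox root and deny-list, chosen for illustration.
const ALLOWED_ROOT = path.resolve("/srv/agent-workspace");
const DENIED_NAMES: RegExp[] = [
  /^\.env(\..*)?$/,                        // .env, .env.production, ...
  /^id_(rsa|ed25519|ecdsa)(\.pub)?$/,      // common SSH key file names
  /\.pem$/,                                // PEM-encoded keys and certificates
];

// Returns true only when the requested path stays inside the sandbox root
// and its file name is not on the deny-list.
function isPathAllowed(requested: string): boolean {
  // Resolve ".." segments and absolute paths against the sandbox root.
  const resolved = path.resolve(ALLOWED_ROOT, requested);
  const relative = path.relative(ALLOWED_ROOT, resolved);

  // The path escapes the root if the relative form climbs upward or is absolute.
  const escapesRoot =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapesRoot) return false;

  const name = path.basename(resolved);
  return !DENIED_NAMES.some((pattern) => pattern.test(name));
}
```

The file tool would call `isPathAllowed` first and refuse the read when it returns false, so the decision is made by code you control, not by the model.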
Web fetches. An agent that fetches URLs is a server-side request forgery engine waiting for input. Plant a link to a cloud metadata endpoint such as 169.254.169.254, or to an internal admin panel, and the agent will fetch it from inside your network and hand the response back to whoever is steering it.
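A check on the destination before the fetch leaves might look like the sketch below. The function names `isBlockedIPv4` and `isFetchBlocked` are hypothetical, and the sketch makes a loud simplifying assumption: it inspects only the URL as written. It does not resolve DNS or follow redirects, so a hostname that points at a private address would pass. A production guard would also check the resolved address.

```typescript
// Hypothetical helper: true when an IPv4 literal falls in a range an agent
// should not reach (loopback, private networks, link-local).
function isBlockedIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||                          // 10.0.0.0/8 private
    a === 127 ||                         // loopback
    (a === 169 && b === 254) ||          // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168)             // 192.168.0.0/16 private
  );
}

// Returns true when the fetch should be refused.
function isFetchBlocked(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is refused
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host.startsWith("[")) return true; // IPv6 literal: refused outright in this sketch
  return isBlockedIPv4(host);
}
```

Refusing unparseable URLs and non-HTTP schemes by default keeps the failure mode closed: anything the check cannot classify does not get fetched.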
A guard inside the agent sits between the decision and the action: before the file read runs, check the path; before the fetch leaves, check the destination; before tool output flows back into the model’s context, scan it for injection patterns. These are deterministic checks — they don’t ask the model to police itself.
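The placement described above, one check before the action and one before the result reaches the model, can be sketched as a wrapper around any tool function. Everything here is hypothetical and for illustration: `withGuard`, the `Tool` and `Check` types, and the stand-in `fakeReadFile` tool are not Arcjet’s API, and the two inline checks are deliberately simplistic.

```typescript
// A tool takes arguments and returns a result asynchronously.
type Tool<A, R> = (args: A) => Promise<R>;
// A check returns null to allow, or a short reason string to deny.
type Check<T> = (value: T) => string | null;

// Wrap a tool so its arguments are checked before it runs and its result
// is checked before it is handed back to the agent loop.
function withGuard<A, R>(
  tool: Tool<A, R>,
  checkArgs: Check<A>,
  checkResult: Check<R>,
): Tool<A, R> {
  return async (args: A): Promise<R> => {
    const denied = checkArgs(args);
    if (denied !== null) throw new Error(`blocked before action: ${denied}`);

    const result = await tool(args);

    const tainted = checkResult(result);
    if (tainted !== null) throw new Error(`blocked before context: ${tainted}`);
    return result;
  };
}

// Stand-in tool so the sketch runs without touching the filesystem.
const fakeReadFile: Tool<{ path: string }, string> = async ({ path }) =>
  `contents of ${path}`;

const guardedReadFile = withGuard(
  fakeReadFile,
  ({ path }) => (path.includes("..") ? "path traversal" : null),
  (text) =>
    /ignore (all )?previous instructions/i.test(text) ? "possible injection" : null,
);
```

The agent is given `guardedReadFile` in place of the raw tool, so no code path reaches the underlying action without passing both checks.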
Where this fits in your stack today
Because Arcjet runs in-process — across Node.js, Next.js, Bun, and Deno — adding an in-agent guard is a code change, not an infrastructure project. There’s no new proxy to route traffic through and no separate service to operate. The guard is a function call at the point where your agent is about to act.
Treat it as one layer of defense in depth, not a replacement for anything else you run:
- Keep your edge WAF. It still handles volumetric attacks, crawler traffic, and classic injection against your public routes.
- Scope the agent’s credentials down. A guard is a backstop; an agent whose credentials cannot reach the production database is safer than one that is merely asked not to touch it.
- Allowlist fetch destinations instead of blocklisting bad ones. You already know the handful of domains your agent legitimately needs.
- Treat every tool result as untrusted input, the same way you treat a form submission.
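The allowlist advice in the list above can be sketched in a few lines. The host names and the function name `isDestinationAllowed` are placeholders, not anything from Arcjet; substitute the domains your agent actually needs.

```typescript
// Hypothetical allowlist: replace with the hosts your agent legitimately uses.
const ALLOWED_HOSTS = new Set<string>(["api.example.com", "docs.example.com"]);

// Returns true only for HTTPS URLs whose host is exactly on the allowlist.
function isDestinationAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable input is never allowed
  }
  // Exact host match: "api.example.com.evil.test" does not match "api.example.com".
  return url.protocol === "https:" && ALLOWED_HOSTS.has(url.hostname.toLowerCase());
}
```

Matching on the parsed hostname, not on a substring of the raw URL, is the design choice that matters here: it stops an attacker from smuggling an allowed name into a path, query string, or longer domain.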
The autonomy that makes agents useful is the same autonomy that makes them dangerous. Every tool you add widens what a successful injection can reach. Guards at the action boundary won’t make an agent un-hackable, but they move the security decision somewhere you control — your code, with a deterministic answer — instead of leaving it to a model that was never designed to be a security boundary.