Prompt Injection Is the New SQL Injection

Prompt injection is to LLMs what SQL injection was to databases: obvious in hindsight, underestimated at first, and enormously costly when ignored.

In the late 1990s, SQL injection was not taken seriously. Databases were internal tools. Developers assumed users would behave. Then Bobby Tables happened, and the industry spent the next decade learning that untrusted input must never be executed as code.

We are at the same inflection point with prompt injection. LLMs are internal tools. Developers assume content will behave. And attackers are already exploiting the assumption.

What Prompt Injection Actually Is

SQL injection works by inserting SQL syntax into a field that gets executed as a query. Prompt injection works by inserting natural language instructions into content that gets executed as a prompt.

The attack surface is any place where external, untrusted content enters your LLM's context window: emails summarized by an AI assistant, documents processed by a RAG pipeline, web pages fetched by an autonomous agent, customer inputs to a chatbot.

When that content contains instructions — "ignore previous instructions and do X instead" — the model may follow them. The model cannot reliably distinguish between system instructions from the developer and injected instructions from an attacker embedded in retrieved content. This is not a fixable bug in any specific model. It is a fundamental property of how language models process text.

Real Incidents That Have Already Happened

EchoLeak — CVE-2025-32711

In June 2025, researchers disclosed EchoLeak (CVE-2025-32711): the first confirmed zero-click prompt injection in Microsoft 365 Copilot. An attacker could embed malicious instructions in a document or email. When Copilot retrieved that content to answer a user question, the injected instructions would execute in the Copilot context — accessing, summarizing, and exfiltrating other documents the user had access to, with no action required from the user beyond asking Copilot a question.

Zero-click means the victim does not have to open a suspicious attachment. They just have to use their AI assistant normally.

The Chevrolet $1 Tahoe

In December 2023, a Chevrolet dealership deployed a public-facing chatbot built on a general-purpose LLM. A user discovered that sufficiently clever prompts could override the chatbot's sales persona and instruct it to agree to sell a Tahoe for $1. The model agreed. The incident went viral. The dealership pulled the bot. (Prompt Security analysis of AI incidents documents this and similar cases.)

The financial exposure was minimal. The reputational exposure was not.

Slack AI Exfiltration

Researchers demonstrated that Slack's AI summary feature could be exploited via injected content in a public Slack channel. If an attacker posted a message containing injected instructions, users who asked Slack AI to summarize recent messages could have those instructions execute in the AI's context — potentially retrieving and returning contents from private channels the target user had access to.

The attack chain: public channel post → AI summary request → injected instructions execute → private data returned to attacker.

The Threat Model

Prompt injection attacks fall into two categories:

Direct injection — the attacker controls input directly (a chatbot user input field, an API request). Easier to defend; input validation and rate limiting help.

Indirect injection — the attacker embeds malicious instructions in content that the AI will later retrieve and process (an email, a document, a web page). Harder to defend; the injected content may arrive through a trusted source.

Indirect injection is the more dangerous and the more underappreciated category. Your AI assistant might be trustworthy. The email it reads, the document it retrieves, the web page it fetches — those are untrusted content from untrusted sources. Every LLM integration that fetches external content is exposed.

Defenses That Actually Work

Unlike SQL injection — which has a clean solution in parameterized queries — prompt injection does not have a single silver bullet. Defense requires layers:

Input and Content Sanitization

Before external content enters the LLM context, scan it for known injection patterns: instruction-override phrases, jailbreak templates, adversarial formatting. This catches unsophisticated attacks. Sophisticated attacks will evade pattern matching — hence the need for additional layers.

Privilege Separation in Tool Use

The most effective defense is architectural: agents and copilots should not have access to tools and data beyond what is needed for their specific task. An attacker who successfully injects instructions into an AI that can only read from a single restricted knowledge base has achieved far less than one who has injected into an agent with email, calendar, and CRM write access. Apply least-privilege principles as described in zero trust agent design.

Sandboxed Execution Contexts

Treat retrieved external content as untrusted code. Process it in a sandboxed context where the model's tool-calling capabilities are restricted. Only after sanitization and analysis should content influence the primary agent's behavior.

Output Monitoring

Monitor LLM outputs for anomalous patterns: unexpectedly large responses, outputs that include data from outside the expected scope, tool calls that were not initiated by a user request. Anomaly detection on outputs is the final safety net.

Human-in-the-Loop for High-Risk Actions

Any action with irreversible consequences — sending an email, executing a transaction, deleting a record — should require explicit human confirmation before execution. An injected instruction that tells an agent to "forward all emails to attacker@example.com" is defeated if external-send actions always require a human to approve.

The Parallel to SQL Injection Is Not Hyperbole

SQL injection was obvious in retrospect. It was exploited for a decade before parameterized queries became the default. The industry lost an enormous amount of data, money, and trust in that window.

Prompt injection is in the same early, underestimated phase. The difference is that we have seen this movie before. We do not have to wait for the decade of breaches to take it seriously.

Building AI-powered applications and want an injection threat assessment before they go live? Talk to JP Stratton.