StruQ: Defending Against Prompt Injection with Structured Queries

Abstract

Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications, which perform text-based tasks by utilizing their advanced language capabilities. However, as LLMs have improved, so have the attacks against them. Prompt injection attack is listed as the #1 threat to LLM-integrated applications, where an LLM input contains a trusted prompt (instruction) and an untrusted data (user documents, web retrieval, results from API calls, etc) with potentially injected instructions (Ignore previous instructions and …) to arbitrarily manipulate the LLM.

We introduce structured queries, a general approach to tackle this problem. Structured queries separate prompts and data into two channels. We implement a system that supports structured queries. This system is made of (1) a secure front-end that formats a prompt and user data into a special format, and (2) a specially trained LLM that can produce highquality outputs from these inputs. The LLM is trained using a novel fine-tuning strategy: we convert a base (non-instructiontuned) LLM to a structured instruction-tuned model that will only follow instructions in the prompt portion of a query. To do so, we augment standard instruction tuning datasets with examples that also include instructions in the data portion of the query, and fine-tune the model to ignore these. Our system significantly improves resistance to prompt injection attacks, with little or no impact on utility.

Background

LLM-Integrated Applications

Design an instruction (prompt) to serve users by processing their data via an LLM.

• Prompt: Trusted (from the developer)

• LLM: Trusted (from the developer or an API provider)

• Data: Untrusted (from a random user)

Prompt Injection Attack

The adversary injects an instruction in data to override the prompted instruction.

Listed as the #1 security threat for LLM-integrated applications by OWASP.

Example: A university wants to evaluate applicants’ CV with an LLM.

Structured Queries (StruQ)

StruQ separates the prompt part and the data part into two channels. The system is made of

• A secure front-end that formulates a prompt and data into a special-format input.

• A structured instruction-tuned LLM that produces high-quality outputs from these inputs.

Secure Front-End

StruQ separates the prompt and data by delimiters that can only be used by the system designer.

The secure-front-end reserves special tokens ([MARK], ...) as delimiters, and filters the data out of any separation delimiter.

Structured Instruction-Tuned LLM

StruQ simulates prompt injections in training for the LLM to learn to ignore any injected instructions in the data part.

The generated dataset contains clean samples and samples with injected instructions (also sampled from the dataset).

We fine-tune the LLM on our structured instruction-tuning dataset using standard SFT algorithm.

Experiments

StruQ LLMs maintain general-purpose utility (AlpacaEval2 WinRate).

StruQ enjoys <2% attack success rates under optimization-free attacks (Max ASR Opt.-Free).

StruQ significantly reduces the success rates of optimization-based attacks (Max ASR Opt.-Based).

BibTeX

@inproceedings{chen2024struq,
  title={StruQ: Defending against prompt injection with structured queries},
  author={Chen, Sizhe and Piet, Julien and Sitawarin, Chawin and Wagner, David},
  booktitle={USENIX Security Symposium},
  year={2025}
}