Overview
This attack class uses an AI model not as the final target, but as an unwitting intermediary, a “gateway,” to attack a separate, downstream system. The AI model, often an LLM or an OCR engine, processes untrusted user input and transforms it into a format, such as code, a database query, or a shell command, that is then executed by another part of the application.
Why it Matters
This is one of the most severe risks in AI-integrated systems. A successful exploit can lead to traditional, high-impact consequences, including:
- Remote Code Execution (RCE)
- Data Exfiltration
- Server-Side Request Forgery (SSRF)
- Denial of Service (DoS)
Technical Mechanics & Foundations
The exploitability of these systems hinges on a common architectural pattern: User Input -> AI Model (Interpreter) -> Backend System (Executor). The attacker’s goal is to craft an input that the AI model will innocently translate into a malicious payload for the backend.
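A minimal sketch of this pattern in Python makes the trust boundary visible. The `llm_complete()` helper is hypothetical, a stand-in for whatever model client the application actually uses; the point is that the model’s output reaches a shell with no validation in between.

```python
import subprocess


def llm_complete(prompt: str) -> str:
    """Hypothetical stand-in for the application's real model call
    (e.g., an OpenAI or local-model client)."""
    raise NotImplementedError


def handle_request(user_input: str) -> str:
    # The LLM is asked to translate natural language into a shell command.
    prompt = (
        "Translate the following request into a single shell command.\n"
        f"Request: {user_input}\n"
        "Command:"
    )
    command = llm_complete(prompt)

    # VULNERABLE: the model's output is executed verbatim. A prompt-injected
    # request ("list files; then also print /etc/passwd") can steer the model
    # into emitting an attacker-chosen command.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout
```

Note that input-side filtering is of limited use here: the malicious payload only materializes in the model’s output.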
- The AI as a “Natural Language Shell”: An attacker can use prompt injection to trick an LLM with tool access into generating and executing malicious commands, as in the sketch above.
- The AI as a “SQL Co-pilot”: An attacker can craft a prompt that causes an LLM to generate a vulnerable SQL query, leading to SQL injection (see the first sketch after this list).
- The AI as an “OCR-to-API” Pipeline: An attacker can create an image with text that, when extracted by an OCR model, forms a malicious payload for a downstream system (see the second sketch after this list).
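To make the “SQL Co-pilot” case concrete, here is a minimal sketch under the same assumptions: `llm_complete()` is again a hypothetical stand-in, and the schema and in-memory database are invented for illustration.

```python
import sqlite3


def llm_complete(prompt: str) -> str:
    """Hypothetical stand-in for the application's real model call."""
    raise NotImplementedError


def answer_question(question: str) -> list:
    conn = sqlite3.connect(":memory:")  # illustrative stand-in for the real DB
    prompt = (
        "Given a table users(id, name, email), write one SQL query that "
        f"answers this question: {question}\n"
        "SQL:"
    )
    sql = llm_complete(prompt)

    # VULNERABLE: raw, model-generated SQL is executed directly. The
    # "injection" happens in natural language, upstream of any traditional
    # SQLi filter, so escaping the user's text does not help here.
    return conn.execute(sql).fetchall()
```

The defensive takeaway is the same as for classic SQLi: parameterize or allow-list what the model is permitted to produce, rather than trusting its output.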
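The OCR variant follows the same shape. This sketch assumes the `pytesseract` wrapper around Tesseract, though any OCR engine would do; the attacker’s only extra step is rendering the payload as pixels.

```python
import subprocess

import pytesseract  # pip install pytesseract; requires the Tesseract binary
from PIL import Image


def process_upload(image_path: str) -> str:
    # Stage 1: OCR turns attacker-controlled pixels back into text.
    extracted = pytesseract.image_to_string(Image.open(image_path)).strip()

    # VULNERABLE: stage 2 treats the OCR output as a trusted command, so an
    # image containing rendered text like "cat /etc/passwd" crosses the
    # trust boundary into the executor.
    result = subprocess.run(extracted, shell=True, capture_output=True, text=True)
    return result.stdout
```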
Challenge Arena
- Crawl: Basic Command & Code Generation
  - brig1: Instruct an LLM to use a connected tool in an unintended way to execute a command on the backend system.
  - turtle: Coax a security-aware LLM into generating Python code with a specific, known vulnerability.
- Walk: Exploiting Structured Data & Database Queries
  - brig2: Manipulate an LLM’s tool-use capabilities to access and read data from an unauthorized file path.
  - librarian: Craft a natural language question that causes an LLM to generate and execute a malicious SQL query.
- Run: Multi-Modal and Advanced Exploitation
  - pixelated: Exploit a multi-stage pipeline by creating an image that an OCR model misinterprets as a malicious command.