Learning Path: The Security Professional

Audience: Penetration Testers, Application Security Engineers, Red Teamers. Assumed Knowledge: Strong understanding of traditional web and infrastructure vulnerabilities (e.g., Injection, RCE, LFI), comfortable with scripting (Python), and an offensive mindset. Goal: To map existing security skills onto the AI/ML attack surface, treating AI models as both a target and a vector for familiar exploits.

Module 1: Prompt Injection - The New Command Line

Objective: Understand how Large Language Models (LLMs) blur the line between trusted instructions and untrusted user data, creating a new vector for injection attacks that mirrors classic command and SQL injection. Bridging Context: In traditional AppSec, vulnerabilities often arise when user input is concatenated directly into a string that is later executed by an interpreter (e.g., a shell or a database). LLMs represent a new, highly complex interpreter. The core vulnerability is the same: a lack of separation between code (the system prompt) and data (the user prompt). Your goal in this module is to learn how to manipulate this interpreter.

Core Path

Read: Prompt-Based Evasion and Exfiltration
- Focus: Absorb the core concepts of Instruction Hijacking and Role-Playing. Frame these not as “tricking the AI” but as “controlling the execution flow of the language interpreter.”
Challenge (Crawl): whatistheflag1
- Task: This is the “Hello, World!” of prompt injection. Your goal is to bypass a simple guardrail to leak a secret. Experiment with direct commands and simple rephrasing.
- OWASP: LLM01: Prompt Injection, LLM06: Sensitive Information Disclosure
- MITRE ATLAS: TA0004: Evasion (T1012: Evade ML Model)
Challenge (Walk): whatistheflag2
- Task: This challenge introduces a basic defense: a keyword blocklist. Your task is to evade this filter. Think about this as evading a simple Web Application Firewall (WAF) signature. How can you achieve your objective without using the forbidden words?
- OWASP: LLM01: Prompt Injection, LLM06: Sensitive Information Disclosure
- MITRE ATLAS: TA0004: Evasion (T1012: Evade ML Model)

Deeper Exploration

For Precise Control: Try the puppeteer series (puppeteer1, puppeteer2). These challenges are not about leaking a secret, but about forcing the model to produce an exact string, honing your skills in precise output control.
- OWASP: LLM01: Prompt Injection
- MITRE ATLAS: TA0015: Impact (T1052: Manipulate ML Model Output)
For Constraint Bypassing: Try squeeze1. This introduces an output token limit, forcing you to craft a prompt that elicits a very concise response.
- OWASP: LLM01: Prompt Injection, LLM06: Sensitive Information Disclosure
- MITRE ATLAS: TA0004: Evasion (T1012: Evade ML Model)

Module 2: System Exploitation - Weaponizing the AI Gateway

Objective: Use the prompt injection skills from Module 1 to turn an AI model into a tool for attacking a traditional backend system. This is where AI security intersects directly with high-impact, classic vulnerabilities. Bridging Context: The AI model here is not the final target. It is a parser, an unwitting accomplice that translates your natural language into a payload. Your target is the downstream component that trusts the AI’s output implicitly.

Core Path

Read: System Exploitation via AI Gateway
- Focus: Understand the User -> AI -> Backend architectural pattern and how it can be exploited.
Challenge (Walk): turtle
- Task: Trick an LLM into writing insecure Python code that is flagged by a static analysis tool. This is a safe environment to practice generating vulnerable code.
- OWASP: LLM02: Insecure Output Handling, LLM08: Excessive Agency
- MITRE ATLAS: TA0006: Execution (T1015: Execute ML Attacks)
Challenge (Walk): librarian
- Task: Craft a prompt that results in a malicious SQL query, leaking data from a hidden table.
- OWASP: LLM01: Prompt Injection, LLM02: Insecure Output Handling, LLM07: Insecure Plugin Design
- MITRE ATLAS: TA0006: Execution (T1015: Execute ML Attacks)

Deeper Exploration (Run & Boss Level)

For Advanced Command Injection: Try brig1.
- Note: A difficult challenge requiring careful prompt construction to achieve RCE.
- OWASP: LLM01: Prompt Injection, LLM02: Insecure Output Handling, LLM08: Excessive Agency
- MITRE ATLAS: TA0006: Execution (T1015: Execute ML Attacks)
For Multi-Modal Exploitation: Try pixelated.
- Note: A multi-stage attack chaining an image perturbation with XML injection.
- OWASP: LLM02: Insecure Output Handling
- MITRE ATLAS: TA0004: Evasion (T1012: Evade ML Model), TA0006: Execution (T1015: Execute ML Attacks)

Module 3: Continuous Domain Evasion - Fuzzing with Gradients

Objective: Learn to attack models that operate on continuous data like images. Frame this as an evolution of fuzzing: instead of random inputs, you will use the model’s own logic to find the most efficient input to cause a failure. Bridging Context: Think of an image classifier as a complex program that takes a massive byte array as input. You want to find a slightly modified byte array that causes a logic error (a misclassification). Instead of random bit-flipping, you can use the model’s gradients—a mathematical clue—to guide your modifications in the most effective direction.

Core Path

Read: Adversarial Perturbations
- Focus: Understand the concepts of Decision Boundaries and Gradient-Based Attacks.
Challenge (Crawl): granny
- Task: Your first hands-on adversarial image attack. Modify an image to cause a specific misclassification.
- OWASP: N/A (Not an LLM challenge)
- MITRE ATLAS: TA0002: Resource Development (T1006: Craft Adversarial Data), TA0004: Evasion (T1012: Evade ML Model)
Challenge (Walk): granny 2
- Task: Create an adversarial attack that is robust enough to survive JPEG compression.
- OWASP: N/A
- MITRE ATLAS: TA0002: Resource Development (T1006: Craft Adversarial Data), TA0004: Evasion (T1012: Evade ML Model)

Module 4: AI Supply Chain & Forensics

Objective: Apply traditional security principles to the artifacts and processes of the AI/ML lifecycle. Bridging Context: AI models are just files. The data they are trained on is just data. These artifacts can be tampered with, and they can contain vulnerabilities, just like any other software component.

Core Path

Read: Model Integrity Auditing
- Focus: This is equivalent to file integrity monitoring and reverse engineering.
Challenge (Walk): audit
- Task: Analyze a model file to find a malicious modification.
- OWASP: LLM05: Supply Chain Vulnerabilities
- MITRE ATLAS: TA0015: Impact (T1051: Degrade ML Model)
Read: Malicious Model Files
- Focus: This directly maps to insecure deserialization vulnerabilities you may already know. The pickle format is the vector.
Challenge (Run): pickle
- Task: Craft a malicious pickle file that bypasses static analysis checks to gain code execution.
- OWASP: LLM05: Supply Chain Vulnerabilities
- MITRE ATLAS: TA0005: Initial Access (T1013: Exploit Vulnerability in ML Model)

This learning path is designed to take a security professional from the most familiar concepts (injection) to the most novel (gradient-based evasion), always framing the new material in the context of their existing skills. It intentionally back-loads the most difficult challenges like brig1 and pixelated until after the foundational concepts have been covered.

​Module 1: Prompt Injection - The New Command Line

​Core Path

​Deeper Exploration

​Module 2: System Exploitation - Weaponizing the AI Gateway

​Core Path

​Deeper Exploration (Run & Boss Level)

​Module 3: Continuous Domain Evasion - Fuzzing with Gradients

​Core Path

​Module 4: AI Supply Chain & Forensics

​Core Path

Module 1: Prompt Injection - The New Command Line

Core Path

Deeper Exploration

Module 2: System Exploitation - Weaponizing the AI Gateway

Core Path

Deeper Exploration (Run & Boss Level)

Module 3: Continuous Domain Evasion - Fuzzing with Gradients

Core Path

Module 4: AI Supply Chain & Forensics

Core Path