
Web App Pentesting

Use the web-security capability to automate web app reconnaissance, testing, and reporting.

Use this recipe when you need browser-aware or stateful web testing inside isolated compute and you want a clean path from one exploratory finding to a check you can rerun later. It fits when:

  • you are doing authorized web reconnaissance or application testing
  • you need the runtime to carry browser, session, or web-tool state for you
  • you want transcripts and traces that explain how the finding was reached

Before you begin, confirm:
  • the scoped target domains, paths, tenants, and test accounts
  • any credentials or secrets the runtime is allowed to use
  • the correct workspace and project for storing evidence
  • legal and operational stop conditions

1. Start a runtime with the web capability

```shell
dn --capability web-security --model openai/gpt-4o
```

You can also load the capability from the TUI capability manager with Ctrl+P, then switch to its agent with Ctrl+A or /agent <name>.

2. Put scope and credentials in the first prompt


Before the runtime explores anything, state:

  • what is in scope
  • what credentials or secrets it may use
  • what rate limits or stop conditions apply
  • what kind of evidence you expect back
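
The scope statement can be pasted directly as the opening message. A sketch of what that first prompt might contain — the domains, accounts, and limits below are illustrative placeholders, not defaults:

```text
Scope: https://staging.example.com only; production hosts are out of scope.
Credentials: log in with the provided test account only; do not reuse any
  credentials discovered during testing.
Limits: stay under 2 requests/second; stop and report if you see a 5xx
  from any payment or email endpoint.
Evidence: for each candidate finding, keep the exact request, response,
  and the reasoning that led to it.
```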

3. Explore until you have one candidate finding


Use the session like an operator console:

  • ask the agent to explain its next step before it takes it
  • keep an eye on the transcript to make sure the plan stays inside scope
  • switch to runtime state or traces when the issue looks environment-related rather than app-related

Interactive sessions are where you learn which auth flows, upload paths, or stateful browser sequences are worth preserving.

4. Capture evidence for real findings

For a real finding, keep both:

  • the session transcript for narrative and operator intent
  • the traces for exact tool sequence, timing, and execution detail

Use Managing sessions when the finding starts from one conversation. Use Traces & analysis when the question becomes “does this pattern show up elsewhere in the same project?” and you need Charts, Data, or Notebook.
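
When the question is whether a pattern recurs across a project, the trace data itself is what you filter. A minimal sketch, assuming traces have been exported as JSON Lines with hypothetical `tool` and `url` fields — the export format and field names are assumptions, not the product's actual schema:

```python
import json

def matching_calls(trace_path: str, tool_name: str, url_fragment: str) -> list:
    """Return trace events where a given tool touched a matching URL.

    Assumes one JSON object per line with "tool" and "url" keys;
    adapt the field names to whatever your trace export actually uses.
    """
    hits = []
    with open(trace_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            if event.get("tool") == tool_name and url_fragment in event.get("url", ""):
                hits.append(event)
    return hits
```

From here the matching events can feed a chart or notebook, or simply confirm that the pattern is isolated to one session.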

5. Promote stable checks into tasks or evaluations


Once a check is stable:

  • package the environment and verifier as a task
  • pin representative prompts or inputs in a dataset
  • run hosted evaluations instead of rediscovering the issue manually each time
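
The verifier for a promoted check can often be a short script. A minimal sketch in Python for one common case — re-testing whether an endpoint that should require auth still answers unauthenticated requests. The URL, header, and status mapping here are illustrative, not part of the capability:

```python
# Minimal verifier sketch: re-checks whether an unauthenticated request
# still reaches a path that should require auth. The pass/fail rule is a
# placeholder for whatever your real finding established.
from urllib.request import Request, urlopen
from urllib.error import HTTPError

def classify(status: int) -> str:
    """Map an HTTP status to a verdict for this specific check."""
    if status in (401, 403):
        return "fixed"         # auth is now enforced
    if status == 200:
        return "vulnerable"    # unauthenticated access still works
    return "inconclusive"      # rate limits, redirects, outages, etc.

def verify(url: str) -> str:
    """Send one unauthenticated request and classify the response."""
    req = Request(url, headers={"User-Agent": "scoped-verifier/0.1"})
    try:
        with urlopen(req, timeout=10) as resp:
            return classify(resp.status)
    except HTTPError as err:
        return classify(err.code)
```

A check in this shape is easy to pin alongside a dataset entry and rerun on a schedule instead of rediscovering the issue by hand.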
For each finding, record:

  • the scope statement and accounts used
  • the session ID and traces for one representative finding
  • any requests, responses, or artifacts needed to verify the issue later
  • the task or evaluation candidate if the check is now repeatable
Keep in mind:

  • if the runtime cannot reach the target or use the expected tools, debug the capability or environment setup before analyzing application behavior
  • if the workflow is still exploratory, stay in the runtime session rather than forcing it into an evaluation too early
  • treat agent outputs as candidate findings and verify them before reporting or escalating