SDK Overview

Build agents, datasets, evaluations, scorers, tracing, and hosted workflows with the Dreadnode Python SDK.

The SDK is the Python surface for Dreadnode. Use it when you want agent workflows, datasets, evaluations, and improvement loops that live in code instead of only in the app or CLI.

For installation and account setup, start with /getting-started/installation and /getting-started/authentication. The pages here assume you are already working inside a Python environment and want to automate Dreadnode programmatically.

Think of the SDK as a stack of building blocks:

| Layer | What it does | Start here |
| --- | --- | --- |
| Packages / capabilities | Reusable artifacts and agent bundles you load or publish | Packages & Capabilities |
| `Generator` | Raw model calls with no tool loop or memory | Generators |
| `Agent` | Multi-step reasoning loop with tools, hooks, and trajectories | Agents |
| `Tool` | External actions an agent can call | Tools |
| Transforms | Input rewriting and prompt adaptation | Transforms |
| `Scorer` | Reusable numeric metric or pass/fail gate | Scorers |
| `Dataset` / `LocalDataset` | Persistent input sets for analysis and benchmarking | Data |
| `Evaluation` | Repeatable benchmark run over a dataset | Evaluations |
| AIRT | Prebuilt attack studies and grouped red-team assessments | AIRT |
| Studies / samplers | The iterative search loop behind optimization and AIRT | Studies & Samplers |
| Tracing | Execution telemetry, spans, and artifacts | Tracing |
| Optimization / training | Local and hosted improvement workflows once your datasets and metrics are stable | Optimization and Training |
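To make the `Scorer` row concrete: a scorer reduces an output to a numeric metric or a pass/fail gate. The plain function below is only an illustration of that idea, not the SDK's `Scorer` interface; the name and signature are hypothetical, and the Scorers page documents the real API.

```python
# Illustrative only: a scorer maps an output to a number in [0, 1].
# The real SDK wraps logic like this in a Scorer object; this
# function name and check are hypothetical stand-ins.
def no_secrets_scorer(output: str) -> float:
    """Return 1.0 if the output leaks no obvious secret marker, else 0.0."""
    forbidden_markers = ("api_key=", "BEGIN PRIVATE KEY")
    leaked = any(marker in output for marker in forbidden_markers)
    return 0.0 if leaked else 1.0
```

Because scorers are just deterministic functions over outputs, they compose naturally into quality, safety, or policy checks attached to agents and evaluations.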

Use the SDK when you want:

  • workflows checked into a repo
  • reusable Python abstractions
  • notebook, script, or CI execution
  • custom scorers, hooks, or tools composed in code

Use the CLI when you want:

  • login and profile management
  • package publishing and registry operations
  • hosted job submission from a shell
  • quick inspection without writing Python

The two surfaces are complementary. A common pattern is to build and test locally in Python, then use the CLI for package publishing or job submission.

A typical configuration call from Python looks like this:

```python
import dreadnode as dn

dn.configure(
    server="https://app.dreadnode.io",
    api_key="dn_...",
    organization="acme",
    workspace="research",
)
```

Most teams end up following this sequence:

  1. Load or publish the capability, dataset, model, or environment you need.
  2. Build an Agent and the Tool objects it needs.
  3. Attach Scorer objects that express quality, safety, or policy checks.
  4. Run an Evaluation or AIRT workflow and inspect the result plus traces.
  5. Move to optimization or training only after the task, dataset, and scoring logic are stable.
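The shape of steps 2 through 4 can be sketched in plain Python. Everything below is a toy stand-in, assuming nothing about the SDK's actual `Agent`, `Tool`, `Scorer`, or `Evaluation` classes; it only shows how the pieces hand data to each other.

```python
# Toy stand-ins for the Agent -> Tool -> Scorer -> Evaluation flow.
# None of these names come from the Dreadnode SDK; they only show
# the shape of the loop you build with the real classes.

def lookup_tool(query: str) -> str:
    """A 'tool': an external action the agent can call."""
    return f"result for {query!r}"

def toy_agent(task: str) -> str:
    """An 'agent': calls a tool, then produces an answer."""
    evidence = lookup_tool(task)
    return f"answer based on {evidence}"

def nonempty_scorer(output: str) -> float:
    """A 'scorer': a pass/fail gate on each output."""
    return 1.0 if output else 0.0

# The 'evaluation': run the agent over a dataset, score each output,
# and aggregate into a single metric.
dataset = ["task one", "task two"]
outputs = [toy_agent(task) for task in dataset]
scores = [nonempty_scorer(output) for output in outputs]
print(sum(scores) / len(scores))
```

The real workflow swaps each stand-in for an SDK object, which is what makes the run reproducible, traceable, and comparable across evaluations.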

If you need the shortest path to “something running,” start with Agents, then come back for Scorers, Data, and Evaluations.