Datasets
Browse, publish, and manage shared dataset artifacts in the Dreadnode platform registry.
Datasets are versioned artifacts published into an organization registry so teams and agents can reuse the same inputs for evaluations, training, and repeatable experiments.
In the App's information architecture (IA), this page lives under Hub.
This page is about published datasets in the platform registry. If you need to load dataset rows in code, see SDK Data.
What the page is for
The platform dataset page is the place to:
- browse dataset versions from your organization and the public catalog
- search, sort, and filter by tags, license, task category, format, and size
- inspect metadata such as row count, file format, and visibility
- download, publish, unpublish, or delete versions you own
Each card groups versions under one dataset name so you can move between releases without losing context.
Workflow
Datasets are the durable input side of the platform.
1. curate the local dataset source
2. inspect it before publishing
3. publish a version to the Hub
4. pin that exact version in evaluations, training, or optimization
5. pull or download it later when you need the bytes locally again
The App page is primarily the decision surface for steps 3 and 4: which datasets exist, which version is current, and which version another workflow should consume.
Visibility and references
| Concept | What it means |
|---|---|
| org-scoped | visible only inside the owning organization |
| public | visible in the combined catalog across orgs |
| canonical name | shown as `<owner>/<name>` when the dataset comes from another org |
| pinned reference | use `org/name@version` for reproducible evaluations, training jobs, and automation |
The All view mixes public and org-local datasets. The org-only view limits the page to artifacts
owned by the current organization.
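As an illustration of the pinned reference format, the sketch below splits an `org/name@version` string into its parts and rejects references that omit a version. It is plain Python, not part of the Dreadnode SDK: `DatasetRef` and `parse_pinned_ref` are hypothetical names, and the `acme` org and `1.2.0` version are made-up example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRef:
    """Hypothetical container for the parts of a pinned reference."""
    org: str
    name: str
    version: str


def parse_pinned_ref(ref: str) -> DatasetRef:
    """Split 'org/name@version' into its parts, rejecting unpinned references."""
    # Split on the last '@' so the version is everything after it.
    path, sep, version = ref.rpartition("@")
    if not sep or not version:
        raise ValueError(f"reference is not pinned to a version: {ref!r}")
    # Split on the first '/' to separate the owning org from the dataset name.
    org, slash, name = path.partition("/")
    if not slash or not org or not name:
        raise ValueError(f"expected org/name before '@': {ref!r}")
    return DatasetRef(org=org, name=name, version=version)


# Example values are assumptions for illustration only.
ref = parse_pinned_ref("acme/support-prompts@1.2.0")
print(ref.org, ref.name, ref.version)
```

Failing fast on an unpinned reference is one way automation can enforce reproducibility before a job is submitted.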
What agents should assume
- Prefer explicit versions. A dataset card may show many releases, but automation should pin one.
- Use metadata like `row_count`, `format`, `task_categories`, and `size_category` to choose the right artifact before download or job submission.
- Treat a published dataset as durable registry state. Inline evaluation rows or local ad hoc files are separate workflows.
- Use the owning org when changing visibility or deleting a version.
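The metadata-based selection described above can be sketched as a simple filter. This is a minimal illustration, not the SDK API: the record shape, the `ref` key, the `select_refs` helper, and every value are assumptions. Only the field names `row_count`, `format`, `task_categories`, and `size_category` come from this page.

```python
# Hypothetical metadata records. Field names follow this page; the values,
# the "ref" key, and the size_category labels are made up for illustration.
candidates = [
    {"ref": "acme/support-prompts@1.0.0", "row_count": 500, "format": "jsonl",
     "task_categories": ["text-classification"], "size_category": "small"},
    {"ref": "acme/support-prompts@1.1.0", "row_count": 2400, "format": "jsonl",
     "task_categories": ["text-classification"], "size_category": "medium"},
    {"ref": "acme/red-team-prompts@0.3.0", "row_count": 5000, "format": "parquet",
     "task_categories": ["text-generation"], "size_category": "medium"},
]


def select_refs(records, *, fmt, task, min_rows):
    """Return pinned refs whose metadata matches the format, task, and minimum size."""
    return [
        record["ref"]
        for record in records
        if record["format"] == fmt
        and task in record["task_categories"]
        and record["row_count"] >= min_rows
    ]


# Pick pinned versions suitable for a classification job before any download.
selected = select_refs(candidates, fmt="jsonl", task="text-classification", min_rows=1000)
print(selected)
```

Because the filter returns fully pinned references, its output can be passed to a job without a second lookup to resolve a version.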
Common workflows
```bash
# Inspect the local dataset source before publishing
dreadnode dataset inspect ./datasets/support-prompts
# Publish a version to the registry with public visibility
dreadnode dataset push ./datasets/support-prompts --public
```

Use Packages and Registry for publish and download operations, SDK Data when you need rows inside evaluation code, and the SDK API Client when you need registry lookups or dataset inspection from Python.