Quickstart
Author a dataset directory, publish a version to your organization, and reference it from an evaluation.
Package a parquet file as a Dreadnode dataset, push it, and pin the result in an evaluation — all from the CLI.
Prerequisites
- The Dreadnode CLI, authenticated (dn login); see Authentication
- Python with pyarrow and pandas installed
- One dataset in tabular shape (parquet, csv, json, or jsonl)
1. Lay out the directory
    support-prompts/
      dataset.yaml
      train.parquet

A minimal dataset.yaml:

    name: support-prompts
    version: 0.1.0
    summary: Sampled support tickets for intent evaluation.
    format: parquet

name and version are optional: the directory name fills in for name, and version defaults to 0.1.0. Fill them in anyway; the registry record is easier to read with them set. See the manifest reference for every field.
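The layout above can also be scaffolded from a script. The following is a minimal sketch using only the Python standard library. `scaffold_dataset` is a hypothetical helper, not part of the Dreadnode SDK, and it writes only the four manifest fields shown above; producing train.parquet is left to your own tooling (for example pandas `DataFrame.to_parquet`).

```python
from pathlib import Path


def scaffold_dataset(root: Path, name: str, version: str, summary: str, fmt: str) -> Path:
    """Create <root>/<name>/dataset.yaml and return the manifest path.

    Hypothetical helper for illustration; not part of the Dreadnode SDK.
    """
    dataset_dir = root / name
    dataset_dir.mkdir(parents=True, exist_ok=True)

    # Emit the same four keys as the minimal manifest, one "key: value" per line.
    # Values are written unquoted, so this only suits simple strings; a value
    # containing ": " or starting with a YAML special character needs quoting.
    fields = {"name": name, "version": version, "summary": summary, "format": fmt}
    manifest = dataset_dir / "dataset.yaml"
    manifest.write_text(
        "".join(f"{key}: {value}\n" for key, value in fields.items()),
        encoding="utf-8",
    )
    return manifest
```

Calling `scaffold_dataset(Path("."), "support-prompts", "0.1.0", "Sampled support tickets for intent evaluation.", "parquet")` would create the support-prompts/ directory with its manifest; copy train.parquet in next to it before inspecting.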
2. Inspect locally
    dn dataset inspect ./support-prompts

    format: parquet
    rows: 1,234

    Schema
    ┏━━━━━━━━━━━━┳━━━━━━━━━━━┓
    ┃ Column     ┃ Type      ┃
    ┡━━━━━━━━━━━━╇━━━━━━━━━━━┩
    │ ticket_id  │ string    │
    │ body       │ string    │
    │ intent     │ string    │
    └────────────┴───────────┘

inspect reads dataset.yaml, loads each artifact to confirm it parses, and infers schema and row count when the manifest omits them. Use it as your local pre-flight: if this fails, the push will too.
3. Push to the registry
    dn dataset push ./support-prompts
    Pushed acme/support-prompts@0.1.0 (sha256:9ab81fc1...)

The version goes to your organization (acme here) and is visible only to that org by default. The qualified name is org/name@version.
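If you pass qualified names between scripts, it can help to split the org/name@version form into its parts. The sketch below is one way to do that with plain string operations; `QualifiedName` and `parse_qualified_name` are hypothetical names for illustration, not Dreadnode APIs, and the sketch assumes the org and name themselves contain no "@".

```python
from typing import NamedTuple


class QualifiedName(NamedTuple):
    org: str
    name: str
    version: str


def parse_qualified_name(ref: str) -> QualifiedName:
    """Split 'org/name@version' into its three parts (illustrative helper)."""
    # Split on the last "@" so everything before it is treated as "org/name".
    path, at, version = ref.rpartition("@")
    if not at or not version:
        raise ValueError(f"missing '@version' in {ref!r}")

    # Split on the first "/" to separate the organization from the dataset name.
    org, slash, name = path.partition("/")
    if not slash or not org or not name:
        raise ValueError(f"expected 'org/name' before '@' in {ref!r}")

    return QualifiedName(org, name, version)


if __name__ == "__main__":
    print(parse_qualified_name("acme/support-prompts@0.1.0"))
```

Raising `ValueError` on a missing part keeps a typo from silently resolving to the wrong dataset.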
4. Load it from code
    import dreadnode as dn
    from dreadnode.datasets import Dataset

    dn.pull_package(["dataset://acme/support-prompts:0.1.0"])
    dataset = Dataset("acme/support-prompts", version="0.1.0")

    df = dataset.to_pandas()
    print(df.head())

pull_package downloads the version you just pushed; Dataset(...) opens it by name. See Using in code for every entry point and the difference between pull_package and load_package.
5. Bump a version
Edit the dataset source, bump version in dataset.yaml, and push again:

    # dataset.yaml
    version: 0.2.0

    dn dataset push ./support-prompts
    # → acme/support-prompts@0.2.0

Older versions stay in the registry. Point downstream configs at @0.2.0 when you’re ready to adopt the change.
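The version bump itself is easy to script. The sketch below assumes three-part numeric versions such as 0.1.0, which matches the examples on this page but is not stated as a registry requirement; `bump_version` and `bump_manifest_text` are hypothetical helpers, not part of the Dreadnode CLI or SDK.

```python
def bump_version(version: str, part: str = "minor") -> str:
    """Return the next version, assuming a 'MAJOR.MINOR.PATCH' numeric string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


def bump_manifest_text(text: str, part: str = "minor") -> str:
    """Rewrite the top-level 'version:' line of a dataset.yaml body.

    Works on plain text so no YAML library is needed. A manifest without a
    'version:' line is returned unchanged.
    """
    lines = []
    for line in text.splitlines():
        if line.startswith("version:"):
            current = line.split(":", 1)[1].strip()
            line = f"version: {bump_version(current, part)}"
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(bump_manifest_text("name: support-prompts\nversion: 0.1.0\n"), end="")
```

Write the result back to dataset.yaml, then run dn dataset push as shown above.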
What to reach for next
- Use HuggingFace data or add splits → Authoring
- Make the dataset public, retire a version, or restrict visibility → Publishing
- Feed the dataset into evaluations, training, or AIRT → Using in code
- Browse what’s already in the registry → Catalog
- Every CLI verb → dn dataset