Datasets

Versioned datasets for training, optimization, and evaluation.

$ dn dataset <command>

Versioned data for training, optimization, and evaluation — the ground truth your agents learn from.

$ dn dataset inspect <path>

Preview a local dataset directory before publishing.

Reads dataset.yaml and the data files to show schema, row counts, splits, and format — so you can catch problems before pushing.

Options

  • <path>, --path (Required) — Dataset directory containing dataset.yaml.
  • --json (default False) — Output raw JSON instead of a table.
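As a rough sketch of how the inspect step could fit into a script: the helper below checks that the directory holds a dataset.yaml and then saves the --json output to a file. The function name inspect_report and the report path are hypothetical, and the shape of the JSON is not described here, so the sketch stores it without parsing it.

```shell
# Hypothetical helper: save the inspect output for a dataset directory.
# Usage: inspect_report <dataset-dir> <report-file>
inspect_report() {
  dir="$1"
  report="$2"

  # The command expects a directory containing dataset.yaml.
  if [ ! -f "$dir/dataset.yaml" ]; then
    echo "no dataset.yaml in $dir" >&2
    return 1
  fi

  # --json switches the output from a table to raw JSON.
  dn dataset inspect "$dir" --json > "$report"
}
```

For example, `inspect_report ./my-dataset ./inspect.json` would write the report next to the dataset, where ./my-dataset is a placeholder path.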

Aliases: upload

$ dn dataset push <path>

Publish a dataset to your organization’s registry.

Packages a dataset directory (with dataset.yaml manifest) and uploads it as a versioned artifact. Once published, reference it in training, optimization, and evaluation workflows.

Options

  • <path>, --path (Required) — Dataset directory containing dataset.yaml.
  • --name — Override the registry name.
  • --skip-upload (default False) — Build and validate locally without publishing.
  • --publish (default False) — Ensure the dataset is publicly discoverable after publishing.
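One possible way to combine these options is a validate-then-publish helper: inspect the directory, do a local build with --skip-upload, and only then run the real push. The function name validate_then_push is hypothetical, and the sketch assumes each dn command exits with a non-zero status on failure, which this page does not state.

```shell
# Hypothetical helper: publish only if the local checks pass.
# Usage: validate_then_push <dataset-dir>
validate_then_push() {
  dir="$1"

  # Preview schema, row counts, and splits first.
  dn dataset inspect "$dir" || return 1

  # Build and validate locally without publishing.
  dn dataset push "$dir" --skip-upload || return 1

  # Upload the dataset as a versioned artifact.
  dn dataset push "$dir"
}
```

Calling `validate_then_push ./my-dataset` (a placeholder path) stops at the first failing step, so a broken dataset never reaches the registry under the assumption above.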
$ dn dataset publish <refs>

Make one or more dataset families visible to other organizations.

Options

  • <refs>, --refs (Required)
$ dn dataset unpublish <refs>

Make one or more dataset families private.

Options

  • <refs>, --refs (Required)

Aliases: ls

$ dn dataset list

Show datasets in your organization.

Options

  • --search — Search by name or description.
  • --limit (default 50) — Maximum results to show.
  • --include-public (default False) — Include public datasets from other organizations.
  • --json (default False) — Output raw JSON instead of a summary.
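The list options can be combined in one call. The sketch below wraps a search that also covers public datasets and returns raw JSON; the function name find_datasets and its default limit of 10 are illustrative choices, not part of the CLI.

```shell
# Hypothetical helper: search own and public datasets, output JSON.
# Usage: find_datasets <search-term> [limit]
find_datasets() {
  term="$1"
  limit="${2:-10}"   # illustrative default; the CLI default is 50

  dn dataset list --search "$term" --limit "$limit" --include-public --json
}
```

For instance, `find_datasets vision 5` would ask for at most five matches for the placeholder term "vision".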
$ dn dataset info <ref>

Show details and available versions for a dataset.

Version is optional — defaults to the latest.

Options

  • <ref>, --ref (Required) — Dataset to inspect (e.g. my-dataset, my-dataset@<version>).
  • --json (default False) — Output raw JSON instead of a summary.

Aliases: rm

$ dn dataset delete <ref>

Remove a dataset version from the registry.

Options

  • <ref>, --ref (Required) — Dataset to delete (e.g. my-dataset@<version>). Version is required.
  • --yes, -y (default False) — Skip the confirmation prompt.
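Because delete needs an explicit version, a script may want to reject unversioned refs before calling the CLI. The sketch below does that with a shell pattern match. It assumes refs take the form name@version, and the function name delete_version is hypothetical. It passes --yes so it can run unattended, so use it with care.

```shell
# Hypothetical helper: delete one dataset version without prompting.
# Usage: delete_version <name@version>
delete_version() {
  ref="$1"

  case "$ref" in
    # Assumed ref form: a non-empty name, "@", then a non-empty version.
    ?*@?*)
      dn dataset delete "$ref" --yes
      ;;
    *)
      echo "refusing to delete '$ref': expected name@version" >&2
      return 2
      ;;
  esac
}
```

With this guard, a bare name such as my-dataset is rejected locally with exit status 2 instead of being sent to the registry.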

Aliases: download

$ dn dataset pull <ref>

Pull a dataset to your local machine.

Version is optional and defaults to the latest. Without --output, prints a pre-signed download URL you can use with curl or a browser.

Options

  • <ref>, --ref (Required) — Dataset to pull (e.g. my-dataset, my-dataset@<version>).
  • --output — Save to this path instead of printing the URL.
  • --split — Download a specific split (e.g. train, test).
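The two pull modes might be scripted as follows. pull_split saves one split straight to disk, while fetch_via_url captures the printed URL and hands it to curl. Both function names are hypothetical. The second helper assumes that the command's standard output contains only the URL, and the first assumes that --split and --output can be used together. Neither is confirmed on this page.

```shell
# Hypothetical helper: save a single split to a local path.
# Usage: pull_split <ref> <split> <output-path>
pull_split() {
  ref="$1"
  split="$2"
  out="$3"

  dn dataset pull "$ref" --split "$split" --output "$out"
}

# Hypothetical helper: download via the pre-signed URL with curl.
# Usage: fetch_via_url <ref> <output-file>
fetch_via_url() {
  ref="$1"
  out="$2"

  # Assumption: stdout holds just the URL and nothing else.
  url="$(dn dataset pull "$ref")" || return 1
  [ -n "$url" ] || return 1

  # -f fails on HTTP errors, -sS stays quiet but shows errors, -L follows redirects.
  curl -fsSL "$url" -o "$out"
}
```

For example, `pull_split my-dataset train ./train` would fetch only the train split of the latest version, with my-dataset as a placeholder name.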