Versions & metrics

Compare model releases side-by-side, attach evaluation metrics, promote with aliases, and retire versions.

Once a model name has two or more versions, the registry stops being a filing cabinet and starts being a release-management surface. Compare, annotate, promote, delete — those are the mechanics this page covers.

Terminal window
dn model compare support-assistant 1.0.0 1.1.0 1.2.0
support-assistant version comparison
                 │ 1.0.0                   │ 1.1.0                   │ 1.2.0
─────────────────┼─────────────────────────┼─────────────────────────┼────────────────────────
framework        │ safetensors             │ safetensors             │ safetensors
task             │ text-generation         │ text-generation         │ text-generation
architecture     │ LlamaForCausalLM        │ LlamaForCausalLM        │ LlamaForCausalLM
base model       │ meta-llama/Llama-3.1-8B │ meta-llama/Llama-3.1-8B │ meta-llama/Llama-3.1-8B
size             │ 14850.3 MB              │ 14850.3 MB              │ 14892.1 MB
aliases          │ -                       │ staging                 │ champion
intent_accuracy  │ 0.812                   │ 0.847                   │ 0.873
f1               │ 0.79                    │ 0.83                    │ 0.86

compare takes 2–5 versions. Every attached metric gets its own row, so the tradeoffs across releases fit on one screen.

Add --json for machine-readable output. The Hub renders the same comparison visually with metric charts over version history.
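The JSON output is convenient for scripting promotion decisions. A minimal sketch of consuming it in Python — note the payload shape here is an assumption for illustration, not the CLI's documented schema:

```python
import json

# Hypothetical shape of `dn model compare ... --json` output.
# The real schema may differ; treat this as an illustration only.
payload = json.loads("""
{
  "model": "support-assistant",
  "versions": {
    "1.1.0": {"aliases": ["staging"],  "metrics": {"intent_accuracy": 0.847, "f1": 0.83}},
    "1.2.0": {"aliases": ["champion"], "metrics": {"intent_accuracy": 0.873, "f1": 0.86}}
  }
}
""")

# Pick the version with the best intent_accuracy.
best = max(payload["versions"].items(),
           key=lambda kv: kv[1]["metrics"]["intent_accuracy"])
print(best[0])  # 1.2.0
```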

Metrics are version-level key/value pairs you attach after a model is published — typically the output of an evaluation run you want to record alongside the weights.

Terminal window
dn model metrics [email protected] \
intent_accuracy=0.873 \
f1=0.86 \
pass_at_1=0.71
Updated acme/[email protected]: intent_accuracy=0.873, f1=0.86, pass_at_1=0.71

Values that parse as integers or floats are stored as numbers; anything else is stored as a string. Updates merge — metrics you don’t mention are preserved.
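The coercion and merge rules can be modeled in a few lines of Python — a sketch of the documented behavior, not the CLI's actual implementation:

```python
def coerce(value: str):
    """Documented rule: try int, then float; anything else stays a string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

# Updates merge: keys you don't mention are preserved.
stored = {"intent_accuracy": 0.847, "f1": 0.83}
update = {k: coerce(v) for k, v in
          [("intent_accuracy", "0.873"), ("pass_at_1", "0.71"), ("suite", "v2")]}
stored.update(update)
print(stored)
```

So `"0.71"` lands as the float `0.71`, `"v2"` stays a string, and `f1` survives untouched because the update never mentioned it.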

A common pattern: run an evaluation, then record the top-line scores back onto the model version so the registry entry reflects how it did:

Terminal window
# Score the model against your evaluation suite (locally or hosted), then:
dn model metrics [email protected] \
intent_accuracy=0.873 f1=0.86

The dn model compare table then shows the eval scores beside framework, architecture, and aliases. Hosted evaluations reach the model through its inference endpoint — see Using in code for loading a registry artifact into a generator or serving it externally.

Aliases are human-friendly labels that float across versions. Use them when a release has a role — champion, staging, latest-stable — and you want to promote without rewriting downstream configs.

Terminal window
dn model alias [email protected] champion
champion → acme/[email protected]

Setting an alias that already exists on another version moves it — there is exactly one champion per model name. Remove an alias:

Terminal window
dn model alias [email protected] champion --remove
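Because each alias points at exactly one version, the semantics are those of a plain key/value map — setting an existing key moves the label wholesale. A toy model:

```python
# Toy model of alias semantics: one owner per alias name.
aliases = {"staging": "1.1.0", "champion": "1.1.0"}

def set_alias(aliases, name, version):
    aliases[name] = version      # reassignment moves the label off its old version

def remove_alias(aliases, name):
    aliases.pop(name, None)      # removing a missing alias is a no-op

set_alias(aliases, "champion", "1.2.0")  # champion leaves 1.1.0 automatically
print(aliases)  # {'staging': '1.1.0', 'champion': '1.2.0'}
```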

Aliases + metrics + comparison give you the full promotion loop:

  1. Train a new version (@1.2.0) and push it.
  2. Run your evaluation suite against the new version.
  3. dn model metrics [email protected] ... with the scores.
  4. dn model compare support-assistant 1.1.0 1.2.0 — confirm it’s actually better on the metrics you care about.
  5. dn model alias [email protected] champion — move the alias; downstream consumers that follow champion start loading the new version.

If something regresses in production, move the alias back: dn model alias [email protected] champion.
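The gate in step 4 — promote only if the candidate actually wins — is easy to automate once you have both versions' metrics (e.g. parsed from `dn model compare --json`). A minimal sketch; the metric names and threshold are assumptions for illustration:

```python
def should_promote(champion_metrics, candidate_metrics, keys, min_delta=0.0):
    """Promote only if the candidate beats the champion on every listed metric."""
    return all(
        candidate_metrics[k] >= champion_metrics[k] + min_delta
        for k in keys
    )

champion  = {"intent_accuracy": 0.847, "f1": 0.83}
candidate = {"intent_accuracy": 0.873, "f1": 0.86}

if should_promote(champion, candidate, ["intent_accuracy", "f1"]):
    print("promote")  # here you would run: dn model alias <name>@<version> champion
```

A nonzero `min_delta` guards against promoting on noise-level improvements.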

Terminal window
dn model delete acme/[email protected]

delete requires a version — there’s no “delete the whole family” verb. The CLI confirms before deleting; pass --yes for automation:

Terminal window
dn model delete acme/[email protected] --yes

Deletion is permanent. Inference and training configs that pin the deleted version will fail to resolve. Run dn model compare <name> <versions...> first — the aliases row shows which version a champion or staging label is currently attached to, so you can reassign before deleting. Aliases on a deleted version disappear with it.
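The reassign-before-delete check above can be wired into automation as a simple guard. A sketch, with aliases modeled as a plain map as in the comparison table:

```python
def safe_to_delete(version, aliases):
    """Refuse deletion while any alias still points at the version."""
    holders = [name for name, v in aliases.items() if v == version]
    return (len(holders) == 0, holders)

aliases = {"staging": "1.1.0", "champion": "1.2.0"}

ok, holders = safe_to_delete("1.1.0", aliases)
print(ok, holders)  # False ['staging'] — move staging before deleting 1.1.0
```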