
# Using in code

Download a published model, load weights and tokenizer with LocalModel, and feed it into a generator or evaluation.

The SDK gives you two entry points to a published model: downloading the artifact into local storage, and loading the weights and tokenizer through LocalModel or HuggingFace.

| Goal | Use |
| --- | --- |
| Download a registry model so code can load it | `dn.pull_package(["model://org/name:version"])` |
| Open a registry model already cached locally | `Model("org/name", version=...)` or `dn.load_package("model://...")` |
| Load a HuggingFace model into local storage | `dn.load_model("meta-llama/Llama-3.1-8B-Instruct", task=...)` |
| Publish a local source back to the registry | `dn.push_model("./path")` (see Publishing) |
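The `model://org/name:version` URIs above have a fixed shape. A hypothetical parser (not part of the SDK) makes the three components explicit:

```python
def parse_model_uri(uri: str) -> tuple[str, str, str]:
    """Split 'model://org/name:version' into (org, name, version).

    Illustrative helper only -- the SDK accepts these URIs directly.
    """
    rest = uri.removeprefix("model://")
    path, _, version = rest.partition(":")
    org, _, name = path.partition("/")
    return org, name, version
```

For example, `parse_model_uri("model://acme/support-assistant:1.2.0")` yields `("acme", "support-assistant", "1.2.0")`.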
```python
import dreadnode as dn
from dreadnode.models import Model

dn.pull_package(["model://acme/support-assistant:1.2.0"])
model = Model("acme/support-assistant", version="1.2.0")
```

`dn.load_package` is the alternate entry point when the package is already local:

```python
model = dn.load_package("model://acme/support-assistant:1.2.0")
```

Both return a Model — the published-artifact handle. Its properties (name, version, framework, task, architecture, files) read from the manifest without further network calls.

`Model.to_hf()` reconstructs the artifact directory on disk and hands it to HuggingFace `from_pretrained`:

```python
hf_model = model.to_hf()
tokenizer = model.tokenizer()
```

Extra keyword arguments are forwarded. Common ones:

```python
import torch

hf_model = model.to_hf(
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=False,
)
```

`to_hf()` dispatches to the right HF class based on `task` from the manifest (`AutoModelForCausalLM`, `AutoModelForSequenceClassification`, etc.). When `task` is missing, it falls back to `AutoModel`.

For raw filesystem access — serving with vLLM, converting the checkpoint, or running tools that expect a model directory — call `model_path()`:

```python
path = model.model_path()
# /tmp/dn_model_support-assistant_XXXXXX/
#   model.safetensors
#   config.json
#   tokenizer.json
#   ...
```
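Before handing the directory to an external tool, it can be worth a quick sanity check that the expected files are present. A minimal sketch, using file names from the listing above (adjust the required set for your artifact):

```python
from pathlib import Path


def missing_files(
    model_dir: str,
    required: tuple[str, ...] = ("config.json", "tokenizer.json"),
) -> list[str]:
    """Return required file names absent from a materialized model directory."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).exists()]
```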

The directory is materialized on first access and reused on subsequent calls against the same object.
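The materialize-once behavior amounts to lazy caching on the handle. An illustrative stand-in (not SDK code; `materialize` is a hypothetical callable doing the expensive extraction):

```python
from pathlib import Path
from typing import Callable


class LazyArtifactDir:
    """Toy stand-in for materialize-on-first-access behavior."""

    def __init__(self, materialize: Callable[[], Path]):
        self._materialize = materialize  # expensive: writes the artifact to disk
        self._path: Path | None = None

    def model_path(self) -> Path:
        if self._path is None:
            self._path = self._materialize()  # first access: extract
        return self._path  # later calls reuse the same directory
```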

## Load a HuggingFace model into local storage

```python
import dreadnode as dn

local_model = dn.load_model("meta-llama/Llama-3.1-8B-Instruct", task="text-generation")
hf_model = local_model.to_hf(torch_dtype="bfloat16", device_map="auto")
```

`load_model` caches the HuggingFace download in Dreadnode storage. The first call downloads; subsequent calls read from disk. Pass `model_name` to override the local storage name.
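The download-once semantics resemble a simple on-disk cache keyed by repo id. A minimal sketch, assuming a `download(dest)` callable that fetches into `dest` (none of these names are SDK API, and the key-derivation scheme is an assumption):

```python
from pathlib import Path
from typing import Callable


def load_cached(repo_id: str, cache_root: Path, download: Callable[[Path], None]) -> Path:
    """Return the cached directory for repo_id, downloading only if absent."""
    dest = cache_root / repo_id.replace("/", "--")
    if not dest.exists():
        dest.mkdir(parents=True)
        download(dest)  # first call: fetch from the hub
    return dest  # later calls: read from disk
```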

To publish that cached model back to the Dreadnode registry, re-emit it as a directory with a model.yaml and push — see Publishing.

Wrap the loaded weights in a `TransformersGenerator` to get a chat interface:

```python
from dreadnode.generators.generator.transformers_ import TransformersGenerator

gen = TransformersGenerator.from_obj(hf_model, tokenizer)
chat = await gen.chat("Summarize this ticket: ...").run()
print(chat.last.content)
```

See `dreadnode.generators` for the full generator-construction API.

Registry model artifacts are stored bytes, not inference endpoints. To run an evaluation against a published model, either serve it yourself (vLLM, Ray Serve, a managed endpoint) and pass the resulting model identifier to dn evaluation create --model ..., or load the weights locally and evaluate inline:

```python
from dreadnode.evaluations import Evaluation
from dreadnode.generators.generator.transformers_ import TransformersGenerator

hf_model = model.to_hf(torch_dtype="bfloat16", device_map="auto")
tokenizer = model.tokenizer()
gen = TransformersGenerator.from_obj(hf_model, tokenizer)

async def task(prompt: str) -> str:
    chat = await gen.chat(prompt).run()
    return chat.last.content

evaluation = Evaluation(task=task, dataset=rows)  # rows: your evaluation dataset
```

See Evaluations → Local for the SDK-side evaluation shape.

```python
model.name          # "acme/support-assistant"
model.version       # "1.2.0"
model.framework     # "safetensors"
model.task          # "text-generation" or None
model.architecture  # "LlamaForCausalLM" or None
model.files         # list of artifact paths inside the package
model.manifest      # ModelManifest (Pydantic)
```

These are all metadata reads; no network access is required after the initial pull.