# Using in code
Download a published model, load its weights and tokenizer with `LocalModel`, and feed them into a generator or an evaluation.

The SDK gives you two entry points to a published model: downloading the artifact into local storage, and loading the weights and tokenizer through `LocalModel` or HuggingFace.
| Goal | Use |
|---|---|
| Download a registry model so code can load it | `dn.pull_package(["model://org/name:version"])` |
| Open a registry model already cached locally | `Model("org/name", version=...)` or `dn.load_package("model://...")` |
| Load a HuggingFace model into local storage | `dn.load_model("meta-llama/Llama-3.1-8B-Instruct", task=...)` |
| Publish a local source back to the registry | `dn.push_model("./path")` (see Publishing) |
## Pull a published model

```python
import dreadnode as dn
from dreadnode.models import Model

dn.pull_package(["model://acme/support-assistant:1.2.0"])
model = Model("acme/support-assistant", version="1.2.0")
```

`dn.load_package` is the alternate entry point when the package is already local:
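A sketch based on the call shape shown in the table above — the exact return value of `dn.load_package` is an assumption here:

```python
import dreadnode as dn

# Assumes dn.load_package takes a single package URI and returns the same
# Model handle that pull_package + Model(...) produces (illustrative only).
model = dn.load_package("model://acme/support-assistant:1.2.0")
```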
Both return a Model — the published-artifact handle. Its properties (name, version, framework, task, architecture, files) read from the manifest without further network calls.
## Load weights and tokenizer

`Model.to_hf()` reconstructs the artifact directory on disk and hands it to HuggingFace `from_pretrained`:

```python
hf_model = model.to_hf()
tokenizer = model.tokenizer()
```

Extra keyword arguments are forwarded. Common ones:
```python
import torch

hf_model = model.to_hf(
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=False,
)
```

`to_hf()` dispatches to the right HF class based on `task` from the manifest (`AutoModelForCausalLM`, `AutoModelForSequenceClassification`, etc.). When `task` is missing, it falls back to `AutoModel`.
For raw filesystem access — serving with vLLM, converting the checkpoint, or running tools that expect a model directory — call `model_path()`:

```python
path = model.model_path()
# model.safetensors
# config.json
# tokenizer.json
# ...
```

The directory is materialized on first access and reused on subsequent calls against the same object.
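If a downstream tool needs the individual files rather than the directory, plain `pathlib` works on the returned path. A small sketch (the helper name is ours, and it assumes the path behaves like an ordinary directory):

```python
from pathlib import Path

def list_model_files(path: Path) -> list[str]:
    # Collect file paths relative to the materialized model directory,
    # e.g. ["config.json", "model.safetensors", ...].
    return sorted(str(p.relative_to(path)) for p in path.rglob("*") if p.is_file())
```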
## Load a HuggingFace model into local storage

```python
import dreadnode as dn

local_model = dn.load_model("meta-llama/Llama-3.1-8B-Instruct", task="text-generation")
hf_model = local_model.to_hf(torch_dtype="bfloat16", device_map="auto")
```

`load_model` caches the HuggingFace download in Dreadnode storage. The first call downloads; subsequent calls read from disk. Pass `model_name` to override the local storage name.
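The download-once behavior follows a standard cache-on-miss pattern. A minimal sketch, with an assumed cache layout and an injected `fetch` hook standing in for the actual HuggingFace download — not the SDK's real internals:

```python
from pathlib import Path
from typing import Callable

def load_cached(repo_id: str, fetch: Callable[[Path], None], cache_dir: Path) -> Path:
    # Derive a filesystem-safe directory name from the repo id.
    target = cache_dir / repo_id.replace("/", "--")
    if not target.exists():
        # First call: cache miss, download into the cache directory.
        target.mkdir(parents=True)
        fetch(target)
    # Later calls: the directory exists, so reuse it without fetching.
    return target
```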
To publish that cached model back to the Dreadnode registry, re-emit it as a directory with a `model.yaml` and push — see Publishing.
## Run a generator against the loaded model

Wrap the loaded weights in a `TransformersGenerator` to get a chat interface:

```python
from dreadnode.generators.generator.transformers_ import TransformersGenerator

gen = TransformersGenerator.from_obj(hf_model, tokenizer)
chat = await gen.chat("Summarize this ticket: ...").run()
print(chat.last.content)
```

See `dreadnode.generators` for the full generator-construction API.
## Feed an evaluation

Registry model artifacts are stored bytes, not inference endpoints. To run an evaluation against a published model, either serve it yourself (vLLM, Ray Serve, a managed endpoint) and pass the resulting model identifier to `dn evaluation create --model ...`, or load the weights locally and evaluate inline:

```python
from dreadnode.evaluations import Evaluation
from dreadnode.generators.generator.transformers_ import TransformersGenerator

hf_model = model.to_hf(torch_dtype="bfloat16", device_map="auto")
tokenizer = model.tokenizer()
gen = TransformersGenerator.from_obj(hf_model, tokenizer)

async def task(prompt: str) -> str:
    chat = await gen.chat(prompt).run()
    return chat.last.content

evaluation = Evaluation(task=task, dataset=rows)
```

See Evaluations → Local for the SDK-side evaluation shape.
## Properties worth knowing

```python
model.name          # "acme/support-assistant"
model.version       # "1.2.0"
model.framework     # "safetensors"
model.task          # "text-generation" or None
model.architecture  # "LlamaForCausalLM" or None
model.files         # list of artifact paths inside the package
model.manifest      # ModelManifest (Pydantic)
```

All metadata reads; no network after the initial pull.
## What to reach for next

- Publish your own model → Publishing
- Compare, annotate, promote → Versions & metrics
- Browse models to pull → Catalog
- Full SDK API → `dreadnode.models`